July, 2014

What is OpenCyto?

Not an algorithm, but a framework for automated gating.

Goals

  • Easily build reproducible gating pipelines.
  • Use any gating algorithm.
    • Interchange any algorithm at any step (gating plugins are supported).
  • Simplify data handling and data management.
    • Easily pass subsets of the data (cell subsets) to different gating algorithms.
  • Simple(r) pipeline template definitions
    • Pipeline defined via text file (csv)
    • Templates and code are re-usable for standardized assays and data.
  • Facilitate comparative analysis
    • Import manually gated data from FlowJo workspaces
  • Scale to large data sets
    • HDF5 support - data sets limited by disk space not RAM.
www.bioconductor.org

Overview

Raw data ➙ Preprocessing ➙ Annotation ➙ Gating ➙ Statistical analysis ➙ Output

The OpenCyto Gating Framework is a collection of R/Bioconductor packages for easily building reproducible flow cytometry data analysis pipelines.


Getting Started

Installation

Requirements: R + Bioconductor

  • Install the release version of R from CRAN.
  • Install the release version of Bioconductor from bioconductor.org/install
  • Install OpenCyto and its dependencies
    • Within R type the following:
require(BiocInstaller)  
biocLite("openCyto")
This installs all the required packages.
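The commands above match the Bioconductor installer of the time. On current Bioconductor releases, biocLite() has been replaced by the BiocManager package. A minimal sketch of the equivalent install, assuming a recent R version with access to CRAN, might look like this:

```r
# Assumption: a current R/Bioconductor setup, where BiocManager replaces biocLite().
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")  # installer package, hosted on CRAN
}
BiocManager::install("openCyto")   # pulls in openCyto and its dependencies
```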

Still have problems? Bioconductor mailing list
Email: Mike Jiang or Greg Finak
Twitter: @OpenCyto


Getting Started II

Alternatively, if you are brave and want the latest bug fixes and features, install from github.com/RGLab

library(devtools)
packages <- c("RGLab/flowStats", "RGLab/flowCore", "RGLab/flowViz",
              "RGLab/ncdfFlow", "RGLab/flowWorkspace", "RGLab/openCyto")
install_github(packages, quick = TRUE)

You can use the devtools package to install the latest stable versions directly from GitHub.

A Worked Example

Intracellular Cytokine Staining of Antigen-stimulated T-cells

  • Full data set at FlowRepository.org, ID FR-FCM-ZZ7U
  • Batch 0882, 76 sample files, 13 compensation controls.
ws<-openWorkspace("data/workspace/080 batch 0882.xml")
FlowJo Workspace Version  2.0 
File location:  data/workspace 
File name:  080 batch 0882.xml 
Workspace is open. 
Groups in Workspace
          Name Num.Samples
1  All Samples         158
2   0882-L-080         157
3        Comps          13
4 0882 Samples          76

Import Manual Gating (parseWorkspace)

Create a gating set of manual gates.

gating_set<-parseWorkspace(ws,name="0882 Samples",path="data/FCS/",isNcdf=TRUE)
## loading R object...
## loading tree object...
## Done
Parsing 76 samples
calling c++ parser...
...

We now have gated, compensated and transformed data in an HDF5 file represented in a GatingSet object. We can save it for later use.

save_gs(gating_set,path="data/manual_gating")
saving ncdf...
saving tree object...
saving R object...
Done
To reload it, use 'load_gs' function

The archived gating set contains all the information on transformation, compensation, single-cell events, and gates and can be shared with collaborators.


Visualizing the Gating Layout (plotGate)

plotGate(gating_set[[1]],xbin=16,gpar=list(ncol=5)) # Binning for faster plotting

Layout of manual gates


Visualizing the Gating Tree (plot)

Calling plot(gating_set) gives us a view of the gating tree.

Annotation

We annotate our gating set using the FCS keywords and the annotations from FlowRepository. We'll keep only the GAG and negative control stimulations.

library(data.table)
keyword_vars <- c("$FIL", "Stim", "Sample Order", "EXPERIMENT NAME") # relevant keywords
pd <- data.table(getKeywords(gating_set, keyword_vars)) # extract relevant keywords to a data table
annotations <- data.table::fread("data/workspace/pd_submit.csv") # read the annotations from FlowRepository
pd <- data.frame(annotations[pd]) # data.table style merge (annotations must be keyed; setkey step not shown)
setnames(pd, c("Timepoint", "Individual"), c("VISITNO", "PTID")) # rename columns
pData(gating_set) <- pd # annotate
name        Condition   VISITNO  PTID    Sample.Description
769121.fcs  negctrl     5        080-17  PBMCs from healthy subjects
769122.fcs  negctrl     5        080-17  PBMCs from healthy subjects
769193.fcs  GAG-1-PTEG  5        080-17  PBMCs from healthy subjects
769225.fcs  POL-1-PTEG  5        080-17  PBMCs from healthy subjects
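The "data.table style merge" used above is a keyed join of the form X[Y]. The sketch below shows the idea on two small toy tables. The key column `name` and the toy rows are assumptions made for illustration, not the contents of the real annotation file.

```r
library(data.table)

# Toy stand-ins (assumed, not the real data) for the keyword table and the
# FlowRepository annotations.
pd <- data.table(name = c("769121.fcs", "769193.fcs"),
                 Stim = c("negctrl", "GAG-1-PTEG"))
annotations <- data.table(name = c("769121.fcs", "769193.fcs", "769225.fcs"),
                          Timepoint = c(5, 5, 5),
                          Individual = c("080-17", "080-17", "080-17"))

# X[Y] joins on the key, so both tables are keyed on the file name first.
setkey(pd, name)
setkey(annotations, name)

# One row per row of pd, with the matching annotation columns attached.
merged <- data.frame(annotations[pd])
setnames(merged, c("Timepoint", "Individual"), c("VISITNO", "PTID"))
print(merged)
```

Annotation rows with no match in pd (here 769225.fcs) are dropped, because the join returns one row per row of pd.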

Clone and save for automated gating

We want to perform automated gating of this data.

  • We'll clone the gating set, delete existing nodes and re-save the data as a new gating set.
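# gating_subset: the subset of gating_set kept above (the subsetting code is not shown)
auto_gating <- clone(gating_subset)
Rm("S", auto_gating) # remove node "S" and all of its descendants
save_gs(auto_gating, path = "data/autogating", overwrite = TRUE)
list.files("data/autogating")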
## [1] "NHxz3bpHGl.dat"     "NHxz3bpHGl.rds"     "file9e8620253f4.nc"
  • .nc file is the HDF5 file of event-level data.
  • .dat file contains the gating set representation from the C data structure.
  • .rds file is an R data file that contains the R-object information.

Send it to a friend: load_gs() will read it all in and the data will be available.


Constructing a Template - I

alias       pop          parent     dims         gating_method  gating_args                                               groupBy  preprocessing_method
boundary    boundary     root       FSC-A,SSC-A  boundary       max=c(2.5e5,2.5e5)
singlet     singlet      boundary   FSC-A,FSC-H  singletGate    prediction_level=0.999,wider_gate=TRUE,subsample_pct=0.2
viable      viable-      singlet    AViD         mindensity     gate_range=c(500,1000)
nonNeutro   nonNeutro-   viable     SSC-A        mindensity     gate_range=c(5e4,1.5e5)
DebrisGate  DebrisGate+  nonNeutro  FSC-A        mindensity     gate_range=c(0,1e+05)
nonDebris   nonDebris+   viable     FSC-A        refGate        DebrisGate
lymph       lymph        nonDebris  FSC-A,SSC-A  flowClust      K=2,quantile=0.99                                                  prior_flowClust
cd3         cd3+         lymph      cd3          mindensity

Constructing a Template - II

Each row defines a cell population

  • alias: the name (shorthand) we use to refer to the population.
  • pop: the population definition, i.e. whether we keep the positive (+) or negative (-) cells for a marker or pair of markers after gating.
  • parent: the alias of the parent population on which the current population is defined.
  • dims: the dimensions (markers) used to define this cell population.
  • gating_method: the gating algorithm to use.
  • gating_args: additional arguments passed to the gating method to tune its parameters.
  • collapseDataForGating: TRUE or FALSE. Together with groupBy, gates multiple samples with a common gate.
  • groupBy: metadata variables used to combine samples (e.g. PTID).
  • preprocessing_method: advanced use, for some gating methods.
  • preprocessing_args: additional arguments passed to the preprocessing method.
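Because the template is just a csv file with these columns, it can also be written from R. The sketch below uses only base R and copies three rows from the template shown earlier. The output file name is hypothetical, and the column set is taken from the list above.

```r
# Three rows copied from the template slide; columns a row does not use stay empty.
template <- data.frame(
  alias = c("boundary", "singlet", "viable"),
  pop = c("boundary", "singlet", "viable-"),
  parent = c("root", "boundary", "singlet"),
  dims = c("FSC-A,SSC-A", "FSC-A,FSC-H", "AViD"),
  gating_method = c("boundary", "singletGate", "mindensity"),
  gating_args = c("max=c(2.5e5,2.5e5)",
                  "prediction_level=0.999,wider_gate=TRUE,subsample_pct=0.2",
                  "gate_range=c(500,1000)"),
  collapseDataForGating = "",
  groupBy = "",
  preprocessing_method = "",
  preprocessing_args = "",
  stringsAsFactors = FALSE
)

# write.csv quotes character fields, so the commas inside dims and gating_args
# do not break the column layout.
path <- file.path(tempdir(), "gt_example.csv")  # hypothetical file name
write.csv(template, path, row.names = FALSE)

# Read the file back to confirm that the rows survive the round trip.
roundtrip <- read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
print(roundtrip[, c("alias", "parent", "gating_method")])
```

A file written this way has the same shape as the csv that gatingTemplate() reads in the next slide.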

Constructing a Template - III

We read in the template and visualize it.

gt<-gatingTemplate("data/template/gt_080.csv")
plot(gt)


Automated Gating

openCyto walks through the template and gates each population in each sample using the algorithm named in the template.

gating(x = gt, y = auto_gating)
## Some output..
plot(auto_gating)
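Several rows of the template name mindensity with a gate_range argument. As a rough illustration of what a one-dimensional minimum-density gate does, the base R sketch below places a cut at the lowest point of a kernel density estimate inside a given range. It is a toy under stated assumptions: the simulated data and the helper min_density_cut() are hypothetical, and this is not openCyto's implementation.

```r
set.seed(1)

# Simulated one-channel intensities (assumed): a large dim population and a
# smaller bright one.
x <- c(rnorm(2000, mean = 200, sd = 60),
       rnorm(1000, mean = 1500, sd = 150))

# Hypothetical helper: evaluate a kernel density estimate only inside
# gate_range and return the position where the estimate is lowest.
min_density_cut <- function(x, gate_range) {
  d <- density(x, from = gate_range[1], to = gate_range[2], n = 512)
  d$x[which.min(d$y)]
}

cutpoint <- min_density_cut(x, gate_range = c(500, 1000))

# A "-" population (such as viable- in the template) keeps the events below the cut.
keep_negative <- x < cutpoint
print(cutpoint)
print(mean(keep_negative))
```

Restricting the search to gate_range keeps the cut between the two populations even when the density has other dips elsewhere.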