July, 2014

What is OpenCyto?

Not an algorithm, but a framework for automated gating.


  • Easily build reproducible gating pipelines.
  • Use any gating algorithm
    • Interchange algorithms at any step (gating plugins are supported)
  • Simplify data handling and data management.
    • Easily pass subsets of the data (cell subsets) to different gating algorithms.
  • Simple(r) pipeline template definitions
    • Pipeline defined via text file (csv)
    • Templates and code are re-usable for standardized assays and data.
  • Facilitate comparative analysis
    • Import manually gated data from FlowJo workspaces
  • Scale to large data sets
    • HDF5 support: data sets are limited by disk space, not RAM.


Raw data ➙ Preprocessing ➙ Annotation ➙ Gating ➙ Statistical analysis ➙ Output

The OpenCyto Gating Framework is a collection of R/Bioconductor packages for easily building reproducible flow data analysis pipelines.


Getting Started


Requirements: R + Bioconductor

  • Install release version of R from CRAN.
  • Install release version of Bioconductor from bioconductor.org/install
  • Install OpenCyto and its dependencies
    • Within R type the following:
This installs all the required packages.
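
A minimal sketch of that installation step, assuming a current R installation where BiocManager is the Bioconductor installer (at the time of this talk the equivalent was the biocLite script):

```r
# Assumption: BiocManager is the installer for your Bioconductor release.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("openCyto")  # also installs the packages openCyto depends on
```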

Still have problems? Bioconductor mailing list
Email: Mike Jiang or Greg Finak
Twitter: @OpenCyto


Getting Started II

Alternatively, if you are brave and want the latest bug fixes and features: github.com/RGLab


You may use the devtools package to install the latest stable versions directly from GitHub.
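
A sketch of the devtools route, assuming the packages live in repositories named after them under the RGLab organisation (the repository names are an assumption; check github.com/RGLab for the current layout):

```r
# Assumption: repository names match the package names under github.com/RGLab.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("RGLab/flowWorkspace")  # install the dependency first
devtools::install_github("RGLab/openCyto")
```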

A Worked Example

Intracellular Cytokine Staining of Antigen-stimulated T-cells

  • Full data set at Flowrepository.org FR-FCM-ZZ7U
  • Batch 0882, 76 sample files, 13 compensation controls.
library(flowWorkspace) # provides openWorkspace() and parseWorkspace()
ws <- openWorkspace("data/workspace/080 batch 0882.xml")
FlowJo Workspace Version  2.0 
File location:  data/workspace 
File name:  080 batch 0882.xml 
Workspace is open. 
Groups in Workspace
          Name Num.Samples
1  All Samples         158
2   0882-L-080         157
3        Comps          13
4 0882 Samples          76

Import Manual Gating (parseWorkspace)

Create a gating set of manual gates.

gating_set <- parseWorkspace(ws, name = "0882 Samples", path = "data/FCS/", isNcdf = TRUE)
## loading R object...
## loading tree object...
## Done
Parsing 76 samples
calling c++ parser...

We now have gated, compensated and transformed data in an HDF5 file represented in a GatingSet object. We can save it for later use.

saving ncdf...
saving tree object...
saving R object...
To reload it, use 'load_gs' function

The archived gating set contains all the information on transformation, compensation, single-cell events, and gates and can be shared with collaborators.
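
A minimal sketch of the save and reload round trip, using the save_gs() and load_gs() functions that flowWorkspace provided at the time of this talk (later releases renamed several functions). The helper name and the archive path are hypothetical.

```r
# Sketch: archive a GatingSet to a directory and read it back.
# `archive_dir` is a hypothetical path supplied by the caller.
archive_and_reload <- function(gs, archive_dir) {
  flowWorkspace::save_gs(gs, path = archive_dir)  # writes event data, tree and R object
  flowWorkspace::load_gs(archive_dir)             # returns the restored GatingSet
}

# Usage with the object from the slides (hypothetical directory name):
# gating_set <- archive_and_reload(gating_set, "data/gs_manual")
```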


Visualizing the Gating Layout (plotGate)

plotGate(gating_set[[1]], xbin = 16, gpar = list(ncol = 5)) # Binning for faster plotting

Layout of manual gates


Visualizing the Gating Tree (plot)

Calling plot on the gating set gives us a view of the gating tree.


We annotate our gating set using the FCS keywords and the annotations from FlowRepository. We'll keep only the GAG and negative control stimulations.

keyword_vars <- c("$FIL", "Stim", "Sample Order", "EXPERIMENT NAME") # relevant keywords
pd <- data.table(getKeywords(gating_set, keyword_vars)) # extract relevant keywords to a data table
annotations <- data.table::fread("data/workspace/pd_submit.csv") # read the annotations from FlowRepository
pd <- data.frame(annotations[pd]) # data.table style merge (requires a key on annotations)
pData(gating_set) <- pd # annotate
name        Condition   VISITNO  PTID    Sample.Description
769121.fcs  negctrl     5        080-17  PBMCs from healthy subjects
769122.fcs  negctrl     5        080-17  PBMCs from healthy subjects
769193.fcs  GAG-1-PTEG  5        080-17  PBMCs from healthy subjects
769225.fcs  POL-1-PTEG  5        080-17  PBMCs from healthy subjects
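
The data.table style merge can be sketched on toy data. All values below are made up for illustration; only the column name `name` is taken from the table above, and keying both tables on it is an assumption about how the real tables were prepared.

```r
library(data.table)

# Made-up rows for illustration only.
annotations <- data.table(name = c("a.fcs", "b.fcs", "c.fcs"),
                          Condition = c("negctrl", "GAG-1-PTEG", "POL-1-PTEG"))
pd <- data.table(name = c("b.fcs", "a.fcs"),
                 Stim = c("GAG", "negctrl"))

setkey(annotations, name)  # X[Y] joins require a key on X
setkey(pd, name)           # also sorts pd by name

# One row per row of pd, with the matching annotation columns attached.
joined <- annotations[pd]
```

Rows of `annotations` with no match in `pd` (here `c.fcs`) are dropped, which is how the join restricts the metadata to the samples actually present in the gating set.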

Clone and save for automated gating

We want to perform automated gating of this data.

  • We'll clone the gating set, delete existing nodes and re-save the data as a new gating set.
## [1] "NHxz3bpHGl.dat"     "NHxz3bpHGl.rds"     "file9e8620253f4.nc"
  • .nc file is the HDF5 file of event-level data.
  • .dat file contains the gating set representation from the C data structure.
  • .rds file is an R data file that contains the R-object information.

Send it to a friend: load_gs() reads it all in and makes the data available.
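
A sketch of the clone, delete and re-save step, assuming the flowWorkspace function names in use at the time of this talk (clone, getChildren, Rm, save_gs; later releases renamed several of them). The helper name and the archive path are hypothetical.

```r
# Sketch: copy a manually gated GatingSet, strip its gates, archive the copy.
strip_and_save <- function(gs, archive_dir) {
  gs_copy <- flowWorkspace::clone(gs)  # independent copy of gates and event data
  # Populations gated directly on root in the first sample.
  top_nodes <- flowWorkspace::getChildren(gs_copy[[1]], "root")
  for (node in top_nodes) {
    flowWorkspace::Rm(node, gs_copy)   # removes the node and its descendants
  }
  flowWorkspace::save_gs(gs_copy, path = archive_dir)
  gs_copy
}

# Usage (hypothetical path):
# auto_gating <- strip_and_save(gating_set, "data/gs_auto")
# list.files("data/gs_auto")  # the .nc, .dat and .rds files described above
```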


Constructing a Template - I

alias       pop          parent     dims         gating_method  gating_args                                               groupBy  preprocessing_method
boundary    boundary     root       FSC-A,SSC-A  boundary       max=c(2.5e5,2.5e5)
singlet     singlet      boundary   FSC-A,FSC-H  singletGate    prediction_level=0.999,wider_gate=TRUE,subsample_pct=0.2
viable      viable-      singlet    AViD         mindensity     gate_range=c(500,1000)
nonNeutro   nonNeutro-   viable     SSC-A        mindensity     gate_range=c(5e4,1.5e5)
DebrisGate  DebrisGate+  nonNeutro  FSC-A        mindensity     gate_range=c(0,1e+05)
nonDebris   nonDebris+   viable     FSC-A        refGate        DebrisGate
lymph       lymph        nonDebris  FSC-A,SSC-A  flowClust      K=2,quantile=0.99                                                  prior_flowClust
cd3         cd3+         lymph      cd3          mindensity

Constructing a Template - II

Each row defines a cell population

  • alias: how we refer to the population / shorthand
  • pop: The population definition, i.e. whether we keep the positive (+) or negative (-) cells for a marker or pair of markers after gating.
  • parent: The alias of the parent population on which the current population is defined
  • dims: The dimensions / markers used to define this cell population.
  • gating_method: Which gating algorithm to use.
  • gating_args: additional arguments passed to the gating method to tweak various parameters
  • collapseDataForGating: TRUE or FALSE. Together with groupBy, gates multiple samples with a common gate.
  • groupBy: Specify metadata variables for combining samples (e.g. PTID)
  • preprocessing_method: advanced use for some gating methods
  • preprocessing_args: additional arguments passed to the preprocessing method
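
Because the template is just a csv file, it can also be written from R. The sketch below uses base R only and reuses the first two rows of the template shown earlier; the file name is hypothetical, and it assumes the ten columns listed above are the full set the template reader expects.

```r
# First two template rows from the slides; unused columns are left empty.
template <- data.frame(
  alias = c("boundary", "singlet"),
  pop = c("boundary", "singlet"),
  parent = c("root", "boundary"),
  dims = c("FSC-A,SSC-A", "FSC-A,FSC-H"),
  gating_method = c("boundary", "singletGate"),
  gating_args = c("max=c(2.5e5,2.5e5)",
                  "prediction_level=0.999,wider_gate=TRUE,subsample_pct=0.2"),
  collapseDataForGating = "",
  groupBy = "",
  preprocessing_method = "",
  preprocessing_args = "",
  stringsAsFactors = FALSE
)

# Hypothetical file name. write.csv() quotes character fields, so the
# commas inside dims and gating_args do not split the columns.
template_file <- file.path(tempdir(), "gating_template.csv")
write.csv(template, template_file, row.names = FALSE)

# Read it back to confirm the fields survived the round trip.
round_trip <- read.csv(template_file, colClasses = "character")
```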

Constructing a Template - III

We read in the template and visualize it.
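
A sketch of this step, assuming openCyto's gatingTemplate() constructor as it existed at the time of this talk. The helper name and the csv path in the usage comment are hypothetical.

```r
# Sketch: parse a template csv into a gatingTemplate object.
load_template <- function(template_csv) {
  stopifnot(file.exists(template_csv))  # fail early on a wrong path
  openCyto::gatingTemplate(template_csv)
}

# Usage (hypothetical path; plotting the template graph needs openCyto attached):
# library(openCyto)
# gt <- load_template("path/to/template.csv")
# plot(gt)
```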



Automated Gating

openCyto walks through the template and gates each population in each sample using the algorithm named in the template.

gating(x = gt, y = auto_gating)
## Some output..
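
A sketch of gating followed by a quick look at the results, assuming the gating() and getPopStats() functions available in openCyto and flowWorkspace at the time of this talk (later releases renamed them). The helper name is hypothetical.

```r
# Sketch: apply a template to a GatingSet and return population statistics.
run_template <- function(gt, gs) {
  openCyto::gating(x = gt, y = gs)  # adds the automated gates to gs in place
  flowWorkspace::getPopStats(gs)    # statistics for each gated population
}

# Usage with the objects from the slides:
# stats <- run_template(gt, auto_gating)
# plotGate(auto_gating[[1]], xbin = 16)  # inspect the automated gates
```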