---
title: "OpenCyto Practical Workshop"
author: "Greg Finak"
date: "7/10/2014"
output: html_document
---

The common data files (FCS files for the practical workshop) are located in `/data/OpenCyto`.
The following will copy the `gating template csv` file and the sample annotations to your home directory under `~/data`.
Please do not touch or modify the shared gating template. Work rather on your own copy.

```{r,eval=TRUE,echo=TRUE}
# Common files
FCSPATH="/data/OpenCyto/data/FCS/"
WORKSPACE="/data/OpenCyto/data/workspace/080 batch 0882.xml"
OUTPATH="~"
MANUAL<-file.path(OUTPATH,"data","manual")
AUTO<-file.path(OUTPATH,"data","auto")
TEMPLATE="/data/OpenCyto/data/template/gt_080.csv"
ANNOTATIONS="/data/OpenCyto/data/workspace/pd_submit.csv"
# Copy the template and annotations into your home directory
if(!file.exists("~/data")){
  dir.create("~/data")
}
file.copy(TEMPLATE,file.path("~/data",basename(TEMPLATE)),overwrite = TRUE)
file.copy(ANNOTATIONS,file.path("~/data",basename(ANNOTATIONS)),overwrite = TRUE)
TEMPLATE<-"~/data/gt_080.csv"
ANNOTATIONS<-"~/data/pd_submit.csv"
```

### Load libraries

```{r eval=TRUE}
require(openCyto)
require(ggplot2)
require(plyr)
require(reshape2)
require(data.table)
```

### Open the workspace

The first thing to do is open the workspace file.

```{r eval=TRUE, echo=TRUE}
ws<-openWorkspace(WORKSPACE)
show(ws)
```

We see different sample groups in the workspace. The data we're concerned with is the `0882 Samples` group. The data are already compensated and transformed.

### Parse the workspace data

The following code snippet parses the contents of the workspace, loads the FCS data, compensates and transforms it according to the instructions in the FlowJo workspace file, and stores the results in a `GatingSet`. This can take some time depending on the bandwidth of the attached storage device (local vs network disk).

**Some arguments to parseWorkspace**

The `name` argument specifies which `group` to import.
The `path` argument tells `flowWorkspace` where to look for the FCS files.
The `subset` argument allows you to parse a subset of samples. It is a numeric index.
The `isNcdf` argument allows you to specify whether to use HDF5 / NetCDF for disk-based data access (TRUE), or to store the data in RAM (FALSE). Large data sets can't practically be analyzed in RAM, so set this to TRUE most of the time.

```{r eval=TRUE, echo=TRUE,message=FALSE,results='hide',warning=FALSE,error=FALSE}
# Parse the workspace only if we haven't already done so, and save it.
if(!file.exists(MANUAL)){
  G<-parseWorkspace(ws,isNcdf=TRUE,name="0882 Samples",path=FCSPATH)
  save_gs(G,MANUAL)
}else{
  G<-load_gs(MANUAL)
}
```

`G` is now a `GatingSet` object. This is a set of `GatingHierarchy` objects, each of which represents an FCS file in the workspace / group that you imported.

### Explore the GatingSet object

```{r eval=TRUE,echo=TRUE}
show(G)
```

We see the object `G` has 76 samples.

The usual R operators can be used to subset the data.

- Subset by index (not preferred)
- Extract sample names
- Subset by sample names (preferred)

```{r echo=TRUE, eval=TRUE}
G[1:10] # Subset by numeric index.. not the preferred way of doing things.
sampleNames(G) # Get the sample names
G[sampleNames(G)[1:10]] # Subset by sample name.. preferred.
```

#### Visualize transformations

There is one data transformation per transformed channel. In this case there are no differences between the transformations on the different channels.
The FlowJo transformation is a biexponential, but it maps input data into "channel space" from 0 to 4095. The output data are effectively binned. `flowWorkspace` interpolates the transformation function to give continuous values between 0 and 4095.
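
As a rough illustration of the interpolation idea (not `flowWorkspace`'s actual implementation), the sketch below builds a coarse lookup table for a made-up monotone transformation into the 0 to 4095 channel range and interpolates it with base R's `splinefun`. The `asinh`-based curve, the knot grid, and all names (`toy_transform`, `knots_raw`, `knots_channel`, `interp_channel`) are assumptions chosen for the example.

```{r}
# Hypothetical stand-in for a FlowJo-style transformation: a monotone
# asinh curve rescaled so that raw values 0..2.5e5 map onto channels 0..4095.
toy_transform <- function(x) 4095 * asinh(x / 150) / asinh(2.5e5 / 150)

# A coarse calibration table, analogous to a lookup table of channel values.
knots_raw     <- seq(0, 2.5e5, length.out = 256)
knots_channel <- toy_transform(knots_raw)

# Monotone spline interpolation yields continuous values between the knots.
interp_channel <- splinefun(knots_raw, knots_channel, method = "monoH.FC")

# Compare the exact curve with the interpolated one at a few raw values.
query <- c(0, 1234.5, 5e4, 2.5e5)
cbind(raw = query,
      exact = toy_transform(query),
      interpolated = interp_channel(query))
```

The monotone method (`"monoH.FC"`) is used so that the interpolated curve never decreases between knots, which keeps the ordering of events intact.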
```{r echo=TRUE,eval=TRUE}
raw<-seq(1,2.5e5,by=1000) #Input data range on the raw scale
transformed<-lapply(getTransformations(G[[1]]),function(x)matrix(x(raw))) # transform with each transformation
transformed<-do.call(cbind,transformed) # group output
colnames(transformed)<-gsub("^ ","",names(getTransformations(G[[1]])))
transformed<-melt(transformed)
transformed<-(cbind(transformed,raw))
setnames(transformed,c("index","parameter","transformed","raw"))
ggplot(data.frame(raw,transformed))+geom_line(aes(x=raw,y=transformed))+facet_wrap(~parameter)+theme_minimal()
```

#### Compensation Matrices

We can extract the compensation matrices from the `GatingSet` as well, and visualize them.

```{r eval=TRUE, echo=TRUE}
getCompensationMatrices(G[[1]]) #Compensation for the first GatingHierarchy
ggplot(melt(getCompensationMatrices(G[[1]])@spillover,value.name = "Coefficient"))+geom_tile(aes(x=Var1,y=Var2,fill=Coefficient))+scale_fill_continuous(guide="colourbar")+theme(axis.text.x=element_text(angle=45,hjust=1))
```

#### Get the names of all manually derived populations

The `getNodes` function will extract the node names (population names) of all populations defined in a GatingSet or a GatingHierarchy.

```{r eval=TRUE, echo=TRUE}
getNodes(G)
```

This manually gated data has `r length(getNodes(G))` distinct populations.

#### Additional arguments to `getNodes` and `getPopStats`

The population names can be unwieldy and long. One particularly useful argument is the `path=` argument. You can set `path="auto"`, which will extract the node name as the shortest unique path name for each cell population.

```{r path_auto}
getNodes(G,path="auto")
```

Operations that utilize node names can match a unique path to a node, so this gives an easy way to extract the shortest unique name for a cell population.

#### Extracting cell population statistics

`getPopStats` will extract cell population statistics, for a `GatingSet` or a `GatingHierarchy`.
You can specify whether you want the cell counts or the cell frequencies.

```{r getPopStats}
freq_gs<-getPopStats(G[1:5],statistic="freq")
head(freq_gs)
counts_gh<-getPopStats(G[1:5],statistic="count")
head(counts_gh)
```

The output of `getPopStats` is a matrix with each entry representing the cell counts or cell frequencies (relative to the parent population) for a cell population and a sample.

#### Plotting

We can plot the gating tree for a sample or a set of samples using `plot`. To plot a gate use `plotGate`.

```{r plot}
# The gating tree for this gating set
plot(G[[1]],boolean=FALSE)
```

There is a lone IFNg+ gate floating, unattached to other nodes. If we look more closely at the node names, we see that its parent is "Not 4+", which is hidden.

We can set attributes of a node using `setNode`.

```{r hiding_nodes,echo=FALSE,results="hide"}
# setNode doesn't have a "GatingSet" interface, so we need to loop over all samples
lapply(G,function(x)setNode(x,y = "Not 4+/IFNg+", FALSE)) #set the node as hidden
plot(G)
```

Plotting gates is done using `plotGate`. The infrastructure will extract the event-level data and plot it directly.

```{r plotGate}
#plot one specific gate from one sample
p1<-plotGate(G[[2]],"4+/TNFa+",arrange=FALSE)
#plot one gate from two samples - automatic faceting
p2<-plotGate(G[2:3],"4+/TNFa+",arrange=FALSE)
#oops! population names must be unique
try(plotGate(G[1:2],"TNFa+"))
#binning using xbin (or not)
p4<-plotGate(G[[2]], "4+/TNFa+",xbin=128,arrange=FALSE) #smaller bins (default is 32), more detail
# use gridExtra for layout
require(gridExtra)
grid.arrange(p1[[1]],p2,p4[[1]]) # Single plot is a list with element 1 being the plot object. Multiple plots are trellis objects directly.
#plot an entire layout (boolean gates are skipped automatically)
p3<-plotGate(G[[2]]) # a little slower since we read all the event-level data.
# Arrange is TRUE by default
```

The plotting functions use the `lattice` package; see the help on `?plotGate` for further options.

#### Extracting Event-level data: the getData API

Another way to extract event-level data from GatingSets and GatingHierarchy objects is via the `getData` API.
This API extracts the selected events (e.g. events belonging to a certain population) into an `ncdfFlowSet`, `ncdfFlowFrame`, `flowSet`, or `flowFrame`. These are the standard `flowCore` objects for representing collections of events from samples, with each sample representing one or a subset of FCS files.

```{r getData}
cd3 <- getData(G[[1]],"3+")
cd3
```

The above extracts the events that are CD3+ from the first sample of the gating set. The returned object is a `flowFrame`. See `?flowFrame` for more information.

To grab multiple samples, pass a `GatingSet`.

```{r getData_flowSet}
cd3_set<-getData(G[1:10],"3+")
cd3_set
```

You'll note the above is a different object, an `ncdfFlowSet` in this case, which is a set of flowFrames backed by a NetCDF file.
The samples contained in a NetCDF flowSet are tracked by a set of pointers or sample indices, so the same NetCDF file can be associated with multiple R objects, and each can contain subsets of the whole data set. For this reason, you should be careful about modifying the data on disk. It can have unintended consequences.

```{r ncdfFlowSet}
cd3_set@file #where the NetCDF file lives
flowData(G) #Return the entire ncdfFlowSet associated with a GatingSet
flowData(G[1:5]) #Subsetting works, and still points to the same file.
```

#### getSingleCellExpression API

We may want to grab the single-cell data for CD8+ T-cells that express IL2 or IFNg. This is relatively common, so there's an API for that:

```{r}
# Get cells that express IL2 or IFNg and are CD8+.
# map tells us how the node names map to markers on channels.
sce<-getSingleCellExpression(G[1:5],c("8+/IL2+","8+/IFNg+"),map=list("8+/IL2+"="IL2","8+/IFNg+"="IFNg"))
str(sce) # list of 5 n x 2 matrices.
```

#### Annotating the samples

Much of the annotation information can be found in the keywords of the FCS files and the FlowJo workspace.
We can extract these and annotate a GatingSet. A GatingSet contains a `phenoData` slot that can be accessed by `pData`, much like other Bioconductor core objects.

Importantly, these annotations can be used to group files for gating, for example by *subject*, or *stimulation and control*, *batch* and so forth.

Other annotations may come from sites like *flowrepository.org*, and can also be imported from text files.

```{r keywords}
require(flowIncubator) # the `getKeywords` function is in the `flowIncubator` package at http://www.github.com/RGLab/flowIncubator
names(keyword(G[[1]])) # valid keywords
keyword_vars<-c("$FIL","Stim","Sample Order","EXPERIMENT NAME") #relevant keywords
pd<-data.table(getKeywords(G,keyword_vars)) #extract relevant keywords to a data table
annotations<-data.table::fread(ANNOTATIONS) # read the annotations from flowrepository
setnames(annotations,"File Name","name")
setkey(annotations,"name")
setkey(pd,"name")
head(pd)
head(annotations)
# We match on the name column, which matches the sampleNames(G).
pd<-data.frame(annotations[pd]) #data.table style merge
setnames(pd,c("Timepoint","Individual"),c("VISITNO","PTID")) #Rename Timepoint and Individual to VISITNO and PTID which are used in the template files
pData(G)<-pd #annotate
```

### OpenCyto Gating

Now that you are more familiar with the basic objects in OpenCyto, we'll proceed to do some automated gating of the data, and we'll explore the consequences of modifying different arguments in the gating template.

#### Cloning an existing GatingSet

We need to clone the existing gating set containing the manually gated data if we want to perform OpenCyto automated gating and save our results.
```{r clone,message=FALSE}
automated<-clone(G)
```

Next we need to remove the existing gates. `root` is the first node; don't remove that. Remove the first population below it, which is the singlets population, named "S" in this case.

```{r remove_nodes}
Rm("S",automated) #Rm is the API to remove a node and all nodes beneath it.
```

`automated` is now a `GatingSet` containing only compensated and transformed data, with no gates.

Next we will load the gating template.

```{r load_gating_template}
gating_template<-gatingTemplate(TEMPLATE)
```

Don't worry about the warning at the end. Reading the template processes it and performs some expansion of populations like `cd8+/-` into `cd8+` and `cd8-`.

We can view the gating template with `plot`.

```{r plot_template}
plot(gating_template)
```

Next we'll look at the first row of the template, the entry for the `boundary` filter.

```{r boundary,warning=FALSE}
options("warn"=-1)
head(fread(TEMPLATE)[1,])
```

The `gating_method` is listed as `boundary`; if we look at the list of available gating methods, we see it is among them:

```{r available_methods}
listgtMethods()
```

Other parameters are an `alias`, which is how the population is named in the gating tree, and `pop`, which tells `openCyto` whether to keep the events within the gate or outside the gate. In this case, it keeps the events within the gate by default.

- The `parent` column specifies the name of the parent population, which is "root", since this is the first population we are defining.
- The `dims` column defines the channels on which this gate is applied. Here, it is "FSC-A,SSC-A", which are the forward and side scatter dimensions. We can verify this by looking at the `flowFrame` from a sample.

```{r flowframe}
getData(automated[[1]],use.exprs=FALSE) #use.exprs=FALSE don't retrieve events.. faster.
```

- The `gating_args` column contains additional arguments to the `gating_method` (`boundary`).
Here we tell the gating method that we want to include events up to a maximum of 2.5e5. Anything outside that range probably corresponds to cellular aggregates.

##### Why a boundary filter?

We need to clean up the data because subsequent gating steps, which do some modeling, perform better if the data distribution is well behaved, i.e. smooth and without large spikes.

#### How does the entry in the template look?

Let's take a look at the gating template object.

```{r template_boundary}
gating_template@nodeData@data[["/boundary"]] # node named "/boundary"
gating_template@edgeData@data[["root|/boundary"]] #edge from root to the boundary node
str(gating_template@edgeData@data[["root|/boundary"]]$gtMethod)
```

We see the node has an "id", a "name", and an "alias"; these are taken from the csv template.
The edge information has an entry "gtMethod", which contains a `gtMethod` object that has all the info to run the gating method on a sample. It contains the dimensions to gate, the name of the population, the arguments to the method, `groupBy` describing whether the data should be grouped by some `phenoData` variable, and `collapse` defining whether the data should be collapsed across multiple samples for gating.

The gating template plot shows us that we have a population named "boundary" gated after the "root" node using the "boundary" gating method (green arrow).

#### Reference gates

Reference gates are very useful and important to understand. They provide a way to define a gate and re-use that gate threshold later in the gating process. We'll show how they're used here to gate out debris.

Let's look at the path from "viable" to "nonDebris". It goes through two additional populations, "nonNeutro", and "DebrisGate".

```{r}
head(fread(TEMPLATE)[3:6,]) # The four entries in the csv template.
```

These consist of a one-dimensional `mindensity` gate to exclude dead cells (the gate on the "AViD" parameter which looks for a minimum density cutpoint in the data distribution between 500 and 1000). Then, two minimum density cutpoints on side-scatter and forward scatter, which remove high side-scatter events, and debris, respectively.

The `gate_range` argument to `mindensity` allows you to specify a range of the data where you expect the separation between the populations to occur. This is why it is important to have well-standardized experiments. If you have lots of instrument variation, the data will have very different distributions from sample to sample, and automated gating approaches will have a very difficult time with such data.

The final gate is a `refGate`, or "referenceGate". It is applied to the `viable` population on the forward scatter channel, and uses the threshold derived from the `DebrisGate` population. Why not just apply a mindensity gate on forward scatter directly to the live cells? The reason is that high side-scatter events sometimes get in the way of identifying a good cutpoint for debris in the forward scatter channel, so these are removed first, then the resulting cutpoint is used directly on the live cells.

#### Two dimensional gates for lymphocytes

OpenCyto can use multidimensional mixture modeling tools such as `flowClust`, which was developed for flow cytometry automated gating. We use this most frequently to gate on lymphocytes.

The definition of the lymphocyte population, named "lymph", is shown below in the csv.

```{r lymph_csv}
head(fread(TEMPLATE)[7,]) # The lymph population in the csv template.
```

- The `dims` column lists two dimensions for gating.
- The `pop` column lists one population with no + or - sign. This indicates to OpenCyto that this is a multidimensional gate.
- The `gating_method` is "flowClust".
A 1d and 2d version of this can be found in the `listgtMethods()` output, but openCyto chooses the correct one based on context.
- `gating_args` include standard arguments to `flowClust`. In this case we specify `K=2` to say that we want to identify two populations, and `quantile=0.99` to say that we want to include 99% of cells in the gate.
- `preprocessing_method` uses `prior_flowClust`, which looks at the data before gating it and estimates a data-driven prior for the two populations. This is used to speed up convergence when fitting the model to individual samples.

The output is a single population named "lymph". By default, openCyto will select the population with the highest proportion. However, if that's not desired, then you can pass a `target=c(x,y)` argument, where "x" and "y" are the approximate x and y location of the cell population of interest (usually it is fairly consistent across samples in an experiment). OpenCyto will look at the distance to this target of each cell population identified in each sample and return the one that is closest.

#### CD8 and "shorthand" notation

```{r cd8_csv}
head(fread(TEMPLATE)[9,]) # The cd8 entry in the csv template.
```

Above, you see the definition of the CD8+/- cell subset. Some things you should note that are different from what we've seen so far:

- There is no alias, but a "*" asterisk character.
- A "min" argument which prefilters the data before computing the density.
- A "gate_range" argument that specifies where to look for a cutpoint after computing the density.
- Most fluorescence parameters from FlowJo are transformed into the range [0,4095], so the `gate_range` here covers about `r 100*(2800-1500)/4095` percent of the data range.
- OpenCyto performs an "expansion" of the shorthand `pop` argument "cd8+/-" to define two populations: "cd8+" and "cd8-", named automatically, thus the empty `alias` field.

We can see the expanded populations in the gating template graph.
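
The shorthand expansion described above can be sketched in a few lines of base R. This is only a simplified illustration of the naming rule, not openCyto's own template parser; `expand_pop` is a hypothetical helper introduced for this example.

```{r}
# Hypothetical helper: expand a shorthand pop such as "cd8+/-" into the two
# populations "cd8+" and "cd8-". Patterns that do not end in "+/-" or "-/+"
# are returned unchanged.
expand_pop <- function(pop) {
  shorthand <- "(\\+/-|-/\\+)$"
  if (grepl(shorthand, pop)) {
    marker <- sub(shorthand, "", pop)   # strip the shorthand suffix
    paste0(marker, c("+", "-"))         # one name per sign
  } else {
    pop
  }
}

expand_pop("cd8+/-")  # "cd8+" "cd8-"
expand_pop("cd3+")    # "cd3+"
```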
```{r expanded_template_cd8}
# Two nodes for the single input row "cd8+/-"
gating_template@nodeData@data[["/boundary/singlet/viable/nonDebris/lymph/cd3/cd8+"]]
gating_template@nodeData@data[["/boundary/singlet/viable/nonDebris/lymph/cd3/cd8-"]]
```

#### Putting it together to gate CD4 conditional on CD8

Things appear to get complex as we get to the CD4 and CD8 populations, but they're actually not so bad. Let's start by actually running the gating on the first few samples.

```{r run_gating_to_tnfa}
#gate the first 10 samples
nodes(gating_template)
gating(gating_template,automated[1:10])
```

Now let's plot the cd8-, cd4pos, and cd4 gates.

```{r plot_cd4}
plotGate(automated[[1]],c("cd8-","cd4pos","cd4"),gpar=list(nrow=1)) #gpar adjusts graphical parameters so we can layout by row.
```

Now we see more clearly what is happening. The "cd8-" gate defines the set of CD8- cells (71% in the samples, first panel), which are passed into the "cd4pos" gate, so CD4 is gated conditional on CD8 (second panel).
The thresholds from these two gates are composed via the "reference gate" to define CD4+/CD8- T-cells, whose parent is CD3 (third panel).

```{r echo=FALSE,results='hide'}
plot(gating_template)
```

Similar approaches using "refGate" are used to gate Perforin and Granzyme B in this data set.

### Using annotations to help in gating.

We can use the annotations in `pData` to help OpenCyto group samples for gating. In this data set, which consists of antigen-stimulated T-cells and subject-matched non-stimulated controls, it makes sense to gate data by PTID (subject), and even by VISIT (subject visit).

#### groupBy

To group the data for gating by certain variables in the `pData`, add those variables to the `groupBy` column, using standard R notation for combining levels of factors.

In this case, for the "TNFa" gate, we see that the `groupBy` column has the "PTID:VISIT" variables listed.
So samples that share the same levels of PTID and VISIT will be combined and passed to the gating routine.

```{r }
fread(TEMPLATE)[14,]
```

#### cytokine gate (tailgate)

The cytokines in this data set are gated using a routine called `cytokine`, aka `tailgate`. These are interchangeable and alias the same routine. There are two parameters of particular importance:

- `adjust` specifies the amount of smoothing when estimating the density.
- `tol` is the tolerance within which the first derivative of the density is allowed to approach zero (i.e. how flat) before a threshold is drawn.

The cytokine / tailgate is meant for use in a situation where there is one major cell population (perhaps a major negative peak), and very few positive cells, such that they may not register as a peak on a density estimate. The routine calculates the first derivative of the density and looks at the value on the (left or right) side of the density (`side="left"` or `side="right"` parameter). When the value is within `tol` of zero, it places a gate.

In general the routine works better than a fixed quantile for discriminating rare cell populations, and allows for variable levels of background.

#### Preprocessing via standardize_flowset

The `cytokine` or `tailgate` is paired with a preprocessing routine named `standardize_flowset`. This routine will take all the samples within a group, estimate a set of data transformations that will standardize the samples to each other, collapse the transformed data and pass it along to the gating routine, along with the transformation functions and the original data. The gating routine, if it receives the preprocessing results from `standardize_flowset`, will gate the collapsed data, then *back-transform* that gate location with the appropriate sample-specific transformation, and use that to gate the original sample-specific data.
The above enables the grouped data to have a common gate threshold on the transformed scale, and on that transformed scale, the data are standardized, thus comparable across samples.

#### Tailgate without preprocessing via standardize_flowset

If the preprocessing argument is omitted, then gating proceeds immediately using the `tailgate` or `cytokine` gate. In order to pass grouped and collapsed samples on to the gating routine, you need to specify both the `groupBy` and `collapseDataForGating` columns in the template. Then a single collapsed flowSet is passed on to tailgate (without standardization), and each sample will share a common gate threshold on the untransformed scale.

#### Grouping and collapsing

In general `groupBy` specifies how the data are grouped before being passed to preprocessing, if preprocessing is performed. For example, for flowClust, the groups could specify sets of samples that should be used to estimate a common prior.

The `collapseDataForGating` column specifies whether the grouped data should be collapsed into a single flowFrame before being passed on to the gating routine. If this is the case, then a common threshold is estimated for the set of samples.

An example of this is the CD57+/- set of populations.

```{r }
fread(TEMPLATE)[20,]
```

The gate above uses mindensity to find a threshold for the CD57+ vs CD57- populations. It finds a common threshold for unique levels of PTID and VISIT by setting the `groupBy` column with `collapseDataForGating=TRUE`. It does no preprocessing.

`standardize_flowset` is the only exception to the above. In order to get a common threshold for a group of samples on the *standardized* scale, you must specify a value for `groupBy` but set `collapseDataForGating` to `FALSE`, otherwise an error will be thrown.

### Some rules about specifying population names in the `pop` column

Plus (+) and minus (-) signs are reserved characters.
Your channel and marker names must not include + or - signs, as these are interpreted as population modifiers that tell OpenCyto whether to keep the positive or negative part of a population (events in or out of the gate). If your channel names have special characters like these, use marker names in the `pop` column instead.

### Plug-in gating methods

You can plug in your own routines for `gating_method` and `preprocessing_method` by using the `registerPlugins` API.

This looks like the following:

```{r }
args(registerPlugins)
?registerPlugins
```

The registration routine requires a function `fun`, a name `methodName`, and a `dep` argument that specifies any library dependencies (for example if the gating routine requires functionality from a different package). Finally there are additional arguments that specify whether the routine is a "gating" or "preprocessing" routine.

The `fun` argument is a function that wraps around a gating routine and gives it a common interface.

A gating method should have the following formal arguments:

- `fr` a flowFrame
- `pp_res` the output of preprocessing
- `xChannel` a character (optional)
- `yChannel` a character (required)
- `filter_id` a character that identifies the resulting filter.
- `...` additional parameters for the method.

A preprocessing method should have the following arguments:

- `fs` a flow set that stores the flow data, possibly output by `groupBy`
- `gs` a gating set
- `gm` a `gtMethod` object that stores information about the gating method
- `xChannel`
- `yChannel`
- `...` additional arguments for the preprocessing method.

Within the wrapper, you can extract the information you need to pass on to your gating or preprocessing routine.
As defined by the formal arguments, you have access to the data in `fr`, the output of any preprocessing in `pp_res`, the channels via `xChannel` and `yChannel` (`yChannel` is always specified, and `xChannel` may be `NULL`), and the `filter_id` which you can use to name the gate that you return to OpenCyto.

Within the preprocessing you have access to the flowSet of data, the GatingSet which allows you to access other annotations, the gating method object via `gm`, and the channels to use for gating.

This allows the plugin-writer a considerable amount of flexibility and creativity to process the data as they please.

For example, we can envision a plugin that would use FMO controls to set the gate thresholds. The preprocessing could match the `xChannel` and `yChannel` with FMO controls from the gating set in `gs`, gate those controls, and pass the thresholds as part of `pp_res` on to the gating method. The gating method would construct the actual gates, or use those thresholds to refine some more complicated gating scheme.

The gating function *must* return a `filter` object, defined in `flowCore`, or may instantiate its own type of `filter`. Most commonly, we return a `polygonGate`.

Finally, wrapper functions *must* begin with a dot (.). So if you have a gating method you call "myMethod" in the `methodName` argument, the function passed to the `fun` argument should be named `.myMethod`.

If registration succeeds, `registerPlugins` returns `TRUE`.

#### Plugin example - gating a stimulated sample based on a negative control

The following plugin will collapse negative controls and stimulated samples per subject based on variables in the `groupBy` template column. It will then use a `control` column in the `phenoData` of the `flowSet` to identify which events belong to the control sample and which belong to the stimulated sample.
The gating routine will subset the control events, define a gate threshold and apply it to all samples in the group, including the stimulated samples.

```{r plugin_preprocessing}
# The gate method will subset the flowFrame on the "control" events and gate those, returning a "negative control" gate.
# This could work equally well with FMOs.
.negGate <- function(fr, pp_res, xChannel=NA, yChannel=NA, filterId="ppgate", ...){
  my_gate <- tailgate(fr[pp_res,],channel=yChannel, filter_id=filterId, ...)
  return(my_gate)
}
registerPlugins(fun=.negGate,methodName='negGate',dep=NA)

# Identifies which events belong to the negative control sample vs the treatment sample.
# groupBy includes PTID
# Treatment information comes from the pData of the flowSet
# passes a vector of 0/1 for treatment / control events
.ppnegGate <- function(fs, gs, gm, xChannel, yChannel, groupBy, isCollapse, ...) {
  d <- c()
  for(i in c(1:length(fs))) {
    d = c(d,rep.int(pData(fs[i])$control,nrow(exprs(fs[[i]]))))
  }
  return(as.logical(d))
}
registerPlugins(fun=.ppnegGate, methodName='ppnegGate', dep=NA, "preprocessing")
```
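
To make the data flow between the two plugin functions more concrete, here is a small base R simulation of the same idea on made-up numbers. It is only a sketch: the simulated intensities, the `samples` and `is_control` objects, and the use of a fixed upper quantile in place of `tailgate` are all assumptions made to keep the example self-contained.

```{r}
set.seed(1)
# Toy fluorescence values for one subject (hypothetical data).
samples <- list(
  control    = rnorm(1000, mean = 1000, sd = 100),
  stimulated = c(rnorm(950, mean = 1000, sd = 100),
                 rnorm(50,  mean = 2500, sd = 150))
)
# Stands in for the `control` column of pData(fs).
is_control <- c(control = TRUE, stimulated = FALSE)

# Preprocessing step: one logical flag per event, in collapsed order,
# mirroring what .ppnegGate returns.
pp_res    <- rep(unname(is_control), times = lengths(samples))
collapsed <- unlist(samples, use.names = FALSE)

# Gating step: derive the threshold from control events only.
# A fixed quantile is swapped in for tailgate here, purely for illustration.
threshold <- unname(quantile(collapsed[pp_res], probs = 0.999))

# Apply the shared threshold to every sample in the group.
sapply(samples, function(x) mean(x > threshold))
```

The point of the design is that the threshold is learned from the control events alone, then applied unchanged to the stimulated sample, so the fraction of positive events in the control is small by construction.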