Use case: Systems biology of Influenza

Reproducing results of a publication. Systems biology of vaccination for seasonal influenza in humans. Nakaya et al (2011).
The paper uses system biology approach to study immune response to vaccination against influenza in three seasons.
Two cohorts vaccinated with TIV (2007 & 2008). And one with LAIV (2008).


Without ImmuneSpaceR

Fetching data from GEO

The data was made available on GEO after publication. The SuperSeries is referenced in the paper.

library(GEOquery)
gse <- getGEO("GSE29619")

The returned object is a list of 3 ExpressionSets. One per SubSeries/cohort.

gse
## $`GSE29619-GPL13158_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 54715 features, 163 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: GSM733843 GSM733844 ... GSM734021 (163 total)
##   varLabels: title geo_accession ... data_row_count (43 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1007_PM_s_at 1053_PM_at ... AFFX-TrpnX-M_at (54715
##     total)
##   fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
##     total)
##   fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL13158 
## 
## $`GSE29619-GPL3921_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 22277 features, 84 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: GSM734022 GSM734023 ... GSM734105 (84 total)
##   varLabels: title geo_accession ... data_row_count (38 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (22277
##     total)
##   fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
##     total)
##   fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL3921 
## 
## $`GSE29619-GPL570_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 54675 features, 27 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: GSM733816 GSM733817 ... GSM733842 (27 total)
##   varLabels: title geo_accession ... data_row_count (43 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (54675
##     total)
##   fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
##     total)
##   fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL570

The seasons cannot be identified without looking in the objects.

names(gse)
## [1] "GSE29619-GPL13158_series_matrix.txt.gz"
## [2] "GSE29619-GPL3921_series_matrix.txt.gz" 
## [3] "GSE29619-GPL570_series_matrix.txt.gz"

phenoData

Let’s look at the metadata available on GEO

library(Biobase)
es_LAIV <- gse[[1]]
head(pData(es_LAIV), 3)
##                                                   title geo_accession
## GSM733843 2008 LAIV subject ID 1 at D0 post-vaccination     GSM733843
## GSM733844 2008 LAIV subject ID 1 at D3 post-vaccination     GSM733844
## GSM733845 2008 LAIV subject ID 1 at D7 post-vaccination     GSM733845
##                          status submission_date last_update_date type
## GSM733843 Public on Jul 10 2011     May 29 2011      Jul 10 2011  RNA
## GSM733844 Public on Jul 10 2011     May 29 2011      Jul 10 2011  RNA
## GSM733845 Public on Jul 10 2011     May 29 2011      Jul 10 2011  RNA
##           channel_count
## GSM733843             1
## GSM733844             1
## GSM733845             1
##                                                                                                                                              source_name_ch1
## GSM733843 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
## GSM733844 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
## GSM733845 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
##           organism_ch1 characteristics_ch1 characteristics_ch1.1
## GSM733843 Homo sapiens     cell type: PBMC         subject id: 1
## GSM733844 Homo sapiens     cell type: PBMC         subject id: 1
## GSM733845 Homo sapiens     cell type: PBMC         subject id: 1
##           characteristics_ch1.2               characteristics_ch1.3
## GSM733843        time point: D0 vaccine: LAIV (FluMist,  MedImmune)
## GSM733844        time point: D3 vaccine: LAIV (FluMist,  MedImmune)
## GSM733845        time point: D7 vaccine: LAIV (FluMist,  MedImmune)
##                                          characteristics_ch1.4
## GSM733843 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
## GSM733844 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
## GSM733845 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
##                                           characteristics_ch1.5
## GSM733843 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
## GSM733844 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
## GSM733845 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
##                                                    characteristics_ch1.6
## GSM733843 hai titer (day 0) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
## GSM733844 hai titer (day 0) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
## GSM733845 hai titer (day 0) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
##                                                     characteristics_ch1.7
## GSM733843 hai titer (day 28) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
## GSM733844 hai titer (day 28) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
## GSM733845 hai titer (day 28) - a/uruguay/716/2007  nymc x-175c (h3n2): 40
##                              characteristics_ch1.8
## GSM733843 hai titer (day 0) - b/florida 4/2006: 20
## GSM733844 hai titer (day 0) - b/florida 4/2006: 20
## GSM733845 hai titer (day 0) - b/florida 4/2006: 20
##                               characteristics_ch1.9 molecule_ch1
## GSM733843 hai titer (day 28) - b/florida 4/2006: 40    total RNA
## GSM733844 hai titer (day 28) - b/florida 4/2006: 40    total RNA
## GSM733845 hai titer (day 28) - b/florida 4/2006: 40    total RNA
##                                                                                                                                                                                                                                                                                               extract_protocol_ch1
## GSM733843 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
## GSM733844 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
## GSM733845 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
##           label_ch1
## GSM733843    biotin
## GSM733844    biotin
## GSM733845    biotin
##                                                                                                                                                                                                                                                                                                                                                                          label_protocol_ch1
## GSM733843 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
## GSM733844 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
## GSM733845 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
##           taxid_ch1
## GSM733843      9606
## GSM733844      9606
## GSM733845      9606
##                                                                                                                                                                                                                                                                             hyb_protocol
## GSM733843 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
## GSM733844 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
## GSM733845 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
##                                                                                                                                                                    scan_protocol
## GSM733843 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
## GSM733844 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
## GSM733845 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
##              description
## GSM733843 2008-LAIV-1-D0
## GSM733844 2008-LAIV-1-D3
## GSM733845 2008-LAIV-1-D7
##                                                                                            data_processing
## GSM733843 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
## GSM733844 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
## GSM733845 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
##           platform_id    contact_name     contact_email  contact_phone
## GSM733843    GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
## GSM733844    GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
## GSM733845    GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
##           contact_laboratory   contact_department contact_institute
## GSM733843     Bali Pulendran Emory Vaccine Center  Emory University
## GSM733844     Bali Pulendran Emory Vaccine Center  Emory University
## GSM733845     Bali Pulendran Emory Vaccine Center  Emory University
##                        contact_address contact_city contact_state
## GSM733843 954 Gatewood Road, room 2040      Atlanta            GA
## GSM733844 954 Gatewood Road, room 2040      Atlanta            GA
## GSM733845 954 Gatewood Road, room 2040      Atlanta            GA
##           contact_zip/postal_code contact_country
## GSM733843                   30329             USA
## GSM733844                   30329             USA
## GSM733845                   30329             USA
##                                                                                           supplementary_file
## GSM733843 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733843/GSM733843.CEL.gz
## GSM733844 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733844/GSM733844.CEL.gz
## GSM733845 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733845/GSM733845.CEL.gz
##                                                                                         supplementary_file.1
## GSM733843 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733843/GSM733843.chp.gz
## GSM733844 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733844/GSM733844.chp.gz
## GSM733845 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733845/GSM733845.chp.gz
##           data_row_count
## GSM733843          54715
## GSM733844          54715
## GSM733845          54715

Issues

  • The metadata is poorly formated
  • Subject ID & visit in the same column
  • HAI results mapped using “characteristic_ch1” columns
  • No demographics (age, gender, race)
  • No mapping to additional datasets (ELISA, ELISPOT, Flow, GE, etc…)

Combining data

Without any additional information the various seasons cannot be combined.

combine(gse[[1]], gse[[2]])
## Error in combine(gse[[1]], gse[[2]]): objects have different annotations: GPL13158, GPL3921

The platforms are different accross seasons.

Summary

In order to do something simple like differential gene expression analyses:

  • Parse the phenoData of each ExpressionSet for visit and patient ID
  • Use annotate to map probes to genes
  • combine into one expression matrix
  • Run DGEA (limma)

Combining this information with another dataset would require additional data cleaning (assuming the IDs are consistent accros experiment).


With ImmuneSpaceR

This study was funded by HIPC. As a result, all of its data has been curated and standardized by ImmPort and is now publicly available on ImmuneSpace. We will use ImmuneSpaceR to retrieve the gene-expression data and associated metadata.

Connect to ImmuneSpace

The study SDY269 holds the data for the two 2008 cohorts, LAIV and TIV.
A study overview page on ImmuneSpace


First, we instantiate a connection to the study of interest.

library(ImmuneSpaceR)
sdy269 <- CreateConnection("SDY269")
sdy269
## Immunespace Connection to study SDY269
## URL: https://www.immunespace.org/Studies/SDY269
## User: unknown_user at not_a_domain.com
## Available datasets
##  demographics
##  pcr
##  cohort_membership
##  elisa
##  elispot
##  hai
##  gene_expression_files
##  fcs_analyzed_result
##  fcs_sample_files
## Expression Matrices
##  TIV_2008
##  LAIV_2008

The returned ImmuneSpaceConnection object shows available datasets and expression matrices available for the selected study. ImmuneSpace uses Reference classes or R5. Functions are members of the object, thus the $ semantics to access member functions in the following samples.

Download gene expression

TIV2008 <- sdy269$getGEMatrix("TIV_2008")
TIV2008
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 54715 features, 80 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: BS586131 BS586187 ... BS586267 (80 total)
##   varLabels: biosample_accession participant_id ...
##     study_time_collected_unit (5 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1007_PM_s_at 1053_PM_at ... AFFX-r2-TagQ-5_at
##     (54715 total)
##   fvarLabels: FeatureId gene_symbol
##   fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'
## Annotation:
head(pData(TIV2008))
##          biosample_accession participant_id         cohort
## BS586131            BS586131  SUB112832.269 TIV Group 2008
## BS586187            BS586187  SUB112832.269 TIV Group 2008
## BS586243            BS586243  SUB112832.269 TIV Group 2008
## BS586128            BS586128  SUB112834.269 TIV Group 2008
## BS586240            BS586240  SUB112834.269 TIV Group 2008
## BS586132            BS586132  SUB112841.269 TIV Group 2008
##          study_time_collected study_time_collected_unit
## BS586131                    0                      Days
## BS586187                    3                      Days
## BS586243                    7                      Days
## BS586128                    0                      Days
## BS586240                    7                      Days
## BS586132                    0                      Days

Only essential information is displayed in the phenoData,

Do I have to do this for every cohort?

getGEMatrix accepts multiple IDs as input. Additionally setting summary = TRUE returns the matrix with expression values averaged by gene. Allowing an easy merge of the two ExpressionSets.

es269 <- sdy269$getGEMatrix(c("TIV_2008", "LAIV_2008"), summary = TRUE)
es269
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 16910 features, 163 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: BS586131 BS586187 ... BS586111 (163 total)
##   varLabels: biosample_accession participant_id ...
##     study_time_collected_unit (5 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1 2 ... 16910 (16910 total)
##   fvarLabels: FeatureId gene_symbol
##   fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'
## Annotation:

For convenience, cohorts can also be downloaded by names.

es269 <- sdy269$getGEMatrix(cohort = c("TIV Group 2008", "LAIV group 2008"), summary = TRUE)

Do I have to do this for every study?

SDY269 only contains the two cohorts asociated with the 2008 season but the GEO SuperSeries had three cohorts. The data has been split between multiple studies. SDY61 contains the data for the 2007 season. We have to combine the gene-expression results accross these two studies to get the data that was used in the publication.

In order to query multiple studies, we can instantiate a connection to all studies at once.

all <- CreateConnection("") #All studies
es <- all$getGEMatrix(c("TIV_2007", "TIV_2008", "LAIV_2008"), summary = TRUE)
head(pData(es))
##          biosample_accession participant_id         cohort
## BS586015            BS586015   SUB112830.61 TIV Group 2007
## BS586024            BS586024   SUB112830.61 TIV Group 2007
## BS586033            BS586033   SUB112830.61 TIV Group 2007
## BS586016            BS586016   SUB112832.61 TIV Group 2007
## BS586025            BS586025   SUB112832.61 TIV Group 2007
## BS586034            BS586034   SUB112832.61 TIV Group 2007
##          study_time_collected study_time_collected_unit
## BS586015                    0                      Days
## BS586024                    3                      Days
## BS586033                    7                      Days
## BS586016                    0                      Days
## BS586025                    3                      Days
## BS586034                    7                      Days

With a single command, we can download the three relevant normalized and transformed expression matrices, summarize them by gene and combine into a unique ExpressionSet with relevant phenoData.
It is also a lot faster than GEO as each matrix is available as a single file.

How about the other datasets?

Unlike expresssion matrices, the other datasets are stored as tables in the database.

SDY269 HAI dataset as visible on ImmuneSpace


Downloading

They can be accessed using the getDataset function.

library(data.table)
sdy269$listDatasets()
## datasets
##  demographics
##  pcr
##  cohort_membership
##  elisa
##  elispot
##  hai
##  gene_expression_files
##  fcs_analyzed_result
##  fcs_sample_files
## Expression Matrices
##  TIV_2008
##  LAIV_2008
hai <- sdy269$getDataset("hai")
hai
##      participant_id age_reported gender  race          cohort
##   1:  SUB112847.269           24   Male White  TIV Group 2008
##   2:  SUB112860.269           32 Female White LAIV group 2008
##   3:  SUB112842.269           28   Male White  TIV Group 2008
##   4:  SUB112859.269           32   Male White LAIV group 2008
##   5:  SUB112878.269           28 Female White  TIV Group 2008
##  ---                                                         
## 332:  SUB112866.269           34   Male White LAIV group 2008
## 333:  SUB112867.269           37   Male White LAIV group 2008
## 334:  SUB112883.269           23 Female Asian LAIV group 2008
## 335:  SUB112848.269           39 Female White  TIV Group 2008
## 336:  SUB112850.269           41   Male White LAIV group 2008
##      study_time_collected study_time_collected_unit
##   1:                   28                      Days
##   2:                    0                      Days
##   3:                   28                      Days
##   4:                    0                      Days
##   5:                   28                      Days
##  ---                                               
## 332:                    0                      Days
## 333:                    0                      Days
## 334:                   28                      Days
## 335:                    0                      Days
## 336:                   28                      Days
##                                       virus value_reported
##   1: A/Uruguay/716/2007  NYMC X-175C (H3N2)            320
##   2:           A/South Dakota/6/2007 (H1N1)             40
##   3:                      B/Brisbane/3/2007             80
##   4:           A/South Dakota/6/2007 (H1N1)              5
##   5: A/Uruguay/716/2007  NYMC X-175C (H3N2)            160
##  ---                                                      
## 332:           A/South Dakota/6/2007 (H1N1)             40
## 333:                       B/Florida/4/2006              5
## 334:           A/South Dakota/6/2007 (H1N1)             20
## 335:              A/Brisbane/59/2007 (H1N1)            640
## 336:                       B/Florida/4/2006              5

For performance, the function returns data.table objects. It is especially interesting when querying accross all studies.

ahai <- all$getDataset("hai")

Dataset filtering

Because getDataset is a wrapper around Rlabkey’s labkey.selectRows function, we use makeFilter to create filters that can be interpreted by LabKey.

library(Rlabkey)
virus_filter <- makeFilter(c("virus", "CONTAINS", "H1N1"))
hai_f <- sdy269$getDataset("hai", colFilter = virus_filter)
virus_filter2 <- makeFilter(c("virus", "EQUAL", "A/Brisbane/59/2007 (H1N1)"))
hai_f <- sdy269$getDataset("hai", colFilter = virus_filter2)

#multiple filters can be specified
analyte_filter <- makeFilter(c("Analyte", "EQUAL", "IFNg"), c("Study time collected", "IN", "0;7"))
elisa <- sdy269$getDataset("elisa", colFilter = analyte_filter)

Cross assay analysis

With the same information accross datasets, we can merge results from various assay types.
Here we use the elispot and flow cytometry results to reproduce Figure 1d of Nakaya et al.

# Elispot
analyte_filter2 <- makeFilter(c("Analyte", "EQUAL", "IgG"), c("Study time collected", "EQUAL", "7"))
elispot <- sdy269$getDataset("elispot", colFilter = analyte_filter2, reload = TRUE)
elispot <- elispot[, elispot_response := spot_number_reported + 1]
elispot <- elispot[, list(participant_id, elispot_response)]

# Flow
fcs <- sdy269$getDataset("fcs_analyzed_result")
fcs <- fcs[, fcs_response := (as.double(population_cell_number) + 1) / as.double(base_parent_population)][study_time_collected == 7]
res <- merge(elispot, fcs, by = "participant_id")

library(ggplot2)
ggplot(res, aes(x = as.double(fcs_response), y = elispot_response, color = cohort)) +
  geom_point() + scale_y_log10() + scale_x_log10() + geom_smooth(method = "lm") + 
  xlab("Total plasmablasts (%)") + ylab("Influenza specific cells\n (per 10^6 PBMCs)") +
  theme_IS()

Only a few lines were needed to calculate fold-change and a sinle merge operation is enough to combine data accross assays and bind metadata.


What else?

Quick plot

For the exploration of most datasets, quick_plot is a function that takes advantage of the standardized datasets to plot the data in a relevant fashion. It has a limited number of options and should be used for having a first look at the data.
More advanced plots can be achieved by downloading the data in the R session, as seen in the use case above.

sdy269$quick_plot("hai", normalize = FALSE)

sdy269$quick_plot("hai", filter = virus_filter2, normalize = FALSE, color = "Age", shape = "Gender")

This is the same function that is used by the interactive Data Explorer Module.

virus_filter3 <- makeFilter(c("cohort", "contains", "TIV"), c("study_time_collected", "IN", "0;21;28;30;180"))
all$quick_plot("hai", filter = virus_filter3, normalize = TRUE, color = "Age")

Online R/Rmd reports

The reports available on ImmuneSpace are written using ImmuneSpaceR. The code is available and will work both through LabKey’s report on the web portal and locally.

Example of live report available in SDY180


Technical details

  • Reference classes for performance and caching
  • Connections to the database are made with RCurl
  • data.table for handling downloaded datasets

Future

  • Rework the instantiation to allow connection to a subset of studies.
  • Use Data Finder filters in R.
  • Integration of Rstudio to ImmuneSpace (spawn pre-filtered R sessions).
  • Accept user made R/Rmd reports on ImmuneSpace.
  • Improve testing of both ImmuneSpaceR & ImmuneSpace.