1 CoSIAdata Introduction

The CoSIAdata package provides variance-stabilizing transformation (VST) normalized RNA-Seq expression data from Bgee for six species and more than 132 tissues. The six species are Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio, Drosophila melanogaster, and Caenorhabditis elegans. Each species' data is stored as a separate RData file on ExperimentHub and can be used with CoSIA, a visualization tool for comparing gene expression metrics across species. Alongside the VST-normalized expression values, each dataset provides the Anatomical Entity Name, Anatomical Entity Id, Ensembl Id, and Experimental Id, which allows species-, tissue-, and gene-specific analyses.

The example below demonstrates how to download these datasets from ExperimentHub.

Installation of CoSIAdata in R:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("CoSIAdata")
library(CoSIAdata)

1.1 Using CoSIAdata

CoSIAdata provides species-specific helper functions for accessing the expression data. For example, the C. elegans dataset is loaded with:

c_elegans_vst_counts <- CoSIAdata::Caenorhabditis_elegans()
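Once a dataset is loaded, it can be subset by gene and tissue with base R. The sketch below uses a small toy data frame in place of the real download so that it runs offline. The column names (`Anatomical_entity_name`, `Anatomical_entity_ID`, `Ensembl_ID`, `Experiment_ID`, `VST`) and all values are illustrative assumptions, not the package's documented schema; run `colnames()` on the real object to see the actual names.

```r
# Toy stand-in for the data frame returned by a CoSIAdata helper function.
# Column names and values are hypothetical placeholders.
toy_vst <- data.frame(
  Anatomical_entity_name = c("intestine", "intestine", "gonad", "gonad"),
  Anatomical_entity_ID   = c("TISSUE:1", "TISSUE:1", "TISSUE:2", "TISSUE:2"),
  Ensembl_ID             = c("GENE_A", "GENE_B", "GENE_A", "GENE_B"),
  Experiment_ID          = c("EXP1", "EXP1", "EXP2", "EXP2"),
  VST                    = c(8.2, 5.1, 7.4, 6.3),
  stringsAsFactors       = FALSE
)

# Keep the rows for one gene in one tissue.
gene_in_tissue <- subset(
  toy_vst,
  Ensembl_ID == "GENE_A" & Anatomical_entity_name == "intestine"
)

# Median VST value per tissue for the same gene.
gene_rows <- toy_vst[toy_vst$Ensembl_ID == "GENE_A", ]
median_by_tissue <- aggregate(
  VST ~ Anatomical_entity_name,
  data = gene_rows,
  FUN = median
)
```

The same pattern should carry over to the real data after replacing `toy_vst` with the object returned by the helper and adjusting the column names to match.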

1.2 Behind the Scenes of CoSIAdata

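Behind the scenes, CoSIAdata retrieves each dataset from ExperimentHub. The same records can be found with the hub's query function and downloaded directly by their record identifier: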

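# Namespace-qualified calls work even when only CoSIAdata is attached
eh <- ExperimentHub::ExperimentHub()
AnnotationHub::query(eh, "CoSIAdata")
# Download (or load from the local cache) one record and preview it
head(eh[["EH7863"]])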

1.3 Accessing CoSIAdata Metadata

To list the species in CoSIAdata and view other information about the datasets, query ExperimentHub as shown below:

eh <- ExperimentHub::ExperimentHub()
AnnotationHub::query(eh, "CoSIAdata")
#> ExperimentHub with 6 records
#> # snapshotDate(): 2023-10-24
#> # $dataprovider: Bgee
#> # $species: Rattus norvegicus, Mus musculus, Homo sapiens, Drosophila melano...
#> # $rdataclass: data.frame
#> # additional mcols(): taxonomyid, genome, description,
#> #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#> #   rdatapath, sourceurl, sourcetype 
#> # retrieve records with, e.g., 'object[["EH7858"]]' 
#> 
#>            title                                                              
#>   EH7858 | VST normalized RNA-Sequencing data with annotations for 362,533,...
#>   EH7859 | VST normalized RNA-Sequencing data with annotations for 31,405,6...
#>   EH7860 | VST normalized RNA-Sequencing data with annotations for 3,814,42...
#>   EH7861 | VST normalized RNA-Sequencing data with annotations for 5,235,72...
#>   EH7862 | VST normalized RNA-Sequencing data with annotations for 4,576,39...
#>   EH7863 | VST normalized RNA-Sequencing data with annotations for 1,923,06...

Session Info

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 22.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] BiocStyle_2.30.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] KEGGREST_1.42.0               xfun_0.40                    
#>  [3] bslib_0.5.1                   Biobase_2.62.0               
#>  [5] bitops_1.0-7                  vctrs_0.6.4                  
#>  [7] tools_4.3.1                   generics_0.1.3               
#>  [9] stats4_4.3.1                  curl_5.1.0                   
#> [11] tibble_3.2.1                  fansi_1.0.5                  
#> [13] AnnotationDbi_1.64.0          RSQLite_2.3.1                
#> [15] blob_1.2.4                    pkgconfig_2.0.3              
#> [17] dbplyr_2.3.4                  S4Vectors_0.40.0             
#> [19] GenomeInfoDbData_1.2.11       lifecycle_1.0.3              
#> [21] compiler_4.3.1                Biostrings_2.70.1            
#> [23] GenomeInfoDb_1.38.0           httpuv_1.6.12                
#> [25] htmltools_0.5.6.1             sass_0.4.7                   
#> [27] RCurl_1.98-1.12               yaml_2.3.7                   
#> [29] interactiveDisplayBase_1.40.0 pillar_1.9.0                 
#> [31] later_1.3.1                   crayon_1.5.2                 
#> [33] jquerylib_0.1.4               ellipsis_0.3.2               
#> [35] cachem_1.0.8                  mime_0.12                    
#> [37] ExperimentHub_2.10.0          AnnotationHub_3.10.0         
#> [39] tidyselect_1.2.0              digest_0.6.33                
#> [41] purrr_1.0.2                   dplyr_1.1.3                  
#> [43] bookdown_0.36                 BiocVersion_3.18.0           
#> [45] fastmap_1.1.1                 cli_3.6.1                    
#> [47] magrittr_2.0.3                utf8_1.2.4                   
#> [49] withr_2.5.1                   filelock_1.0.2               
#> [51] promises_1.2.1                rappdirs_0.3.3               
#> [53] bit64_4.0.5                   rmarkdown_2.25               
#> [55] XVector_0.42.0                httr_1.4.7                   
#> [57] bit_4.0.5                     png_0.1-8                    
#> [59] memoise_2.0.1                 shiny_1.7.5.1                
#> [61] evaluate_0.22                 knitr_1.44                   
#> [63] IRanges_2.36.0                BiocFileCache_2.10.0         
#> [65] rlang_1.1.1                   Rcpp_1.0.11                  
#> [67] xtable_1.8-4                  glue_1.6.2                   
#> [69] DBI_1.1.3                     BiocManager_1.30.22          
#> [71] BiocGenerics_0.48.0           jsonlite_1.8.7               
#> [73] R6_2.5.1                      zlibbioc_1.48.0