AnnotationHub is a new approach to providing annotation resources to the Bioconductor community. The initial plan is to make FASTA, GFF / GTF, BED, VCF, and similar files from sources such as Ensembl, UCSC, ENCODE, and the 1000 Genomes Project available as objects ready for use in Bioconductor, e.g., GRanges representations of BED files.
As an example, the following lines discover and retrieve an ENCODE ‘narrowPeak’ file as a GRanges object.
    library(AnnotationHub)
    hub <- AnnotationHub()

    ## data exploration
    length(names(hub))     # resources available
    md <- metadata(hub)    # DataFrame
    ## hub$goldenpath.hg19.encodeDCC<tab>    # tab completion (interactive sessions only)

    ## retrieval
    res <- hub$goldenpath.hg19.encodeDCC.wgEncodeUwTfbs.wgEncodeUwTfbsNhlfCtcfStdPkRep1.narrowPeak_0.0.1.RData
    class(res)             # GRanges representation of ENCODE bed file
AnnotationHub is currently (27 January, 2013) available from the Bioconductor subversion repository for use with the development version of R; our intention is to include AnnotationHub in Bioconductor 2.12.