Abstract
Tximeta performs numerous annotation and metadata-gathering tasks on behalf of users during the import of transcript quantifications from Salmon or alevin into R/Bioconductor. Metadata and transcript ranges are added automatically, facilitating genomic analyses and assisting in computational reproducibility.
The tximeta package (Love et al. 2020) extends the tximport package (Soneson, Love, and Robinson 2015) for import of transcript-level quantification data into R/Bioconductor. It automatically adds annotation metadata when RNA-seq data have been quantified with Salmon (Patro et al. 2017) or when scRNA-seq data have been quantified with alevin (Srivastava et al. 2019). To our knowledge, tximeta is the only package for RNA-seq data import that can automatically identify and attach transcriptome metadata based on the unique sequence of the reference transcripts. For more details on these packages (including the motivation for tximeta and a description of similar work), consult the References below.
tximeta requires that the entire output directory of Salmon or alevin be present and unmodified in order to identify the provenance of the reference transcripts. In general, it is a good idea not to modify or rearrange the output directory of bioinformatic software, because downstream software often relies on and assumes a consistent directory structure. To share multiple samples, one can use, for example, tar -czf to bundle up a set of Salmon output directories, or to bundle one alevin output directory. For tips on using tximeta with other quantifiers, see the other quantifiers section below.
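As a minimal sketch of the tar -czf bundling mentioned above, the following shell snippet archives a set of per-sample output directories and checks that the layout survives a round trip. The directory and sample names (quants, sample1, sample2) are hypothetical, and the empty files are stand-ins for real Salmon output rather than a complete listing of what Salmon writes.

```shell
#!/bin/sh
set -eu

# Scratch area holding stand-in Salmon output directories.
# Names and file contents are placeholders, not real quantifications.
work=$(mktemp -d)
for s in sample1 sample2; do
  mkdir -p "$work/quants/$s/aux_info"
  : > "$work/quants/$s/quant.sf"
  : > "$work/quants/$s/cmd_info.json"
  : > "$work/quants/$s/aux_info/meta_info.json"
done

# Bundle the whole quants/ tree without touching its contents.
# -C makes the paths inside the archive relative to $work.
tar -czf "$work/quants.tar.gz" -C "$work" quants

# Unpack somewhere else and confirm that nothing was lost or rearranged.
mkdir "$work/restored"
tar -xzf "$work/quants.tar.gz" -C "$work/restored"
diff -r "$work/quants" "$work/restored/quants"
echo "directory structure preserved"
```

Archiving the parent directory, rather than copying out individual quant.sf files, keeps each sample's auxiliary files next to its quantifications, which is what the requirement above asks for.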