1. Anatomy of an MPSE
MicrobiotaProcess introduces the MPSE S4 class. This class inherits from the SummarizedExperiment (Morgan et al. 2021) class. The assays slot stores the rectangular abundance matrices of features from a microbiome experiment. The colData slot stores the sample metadata and the sample-level results produced by downstream analysis. The rowData slot stores the feature metadata and the feature-level results produced by downstream analysis. Compared to SummarizedExperiment, MPSE introduces the following additional slots:
- taxatree: a treedata (Wang et al. 2020; Yu 2021) object, which combines a phylo object (hierarchical structure) with a tibble (associated data), used to store the taxonomy information. The tip labels of the taxonomy tree are the row names of the assays, while the internal node labels are the taxa at the different taxonomic levels above those rows. The tibble contains the taxonomic classification of the node labels.
- otutree: also a treedata object, used to store the phylogenetic tree (built from the reference sequences) and its associated data. Its tip labels are also the row names of the assays.
- refseq: an XStringSet (Pagès et al. 2021) object containing the reference sequences, whose names are identical to the row names of the assays.
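As a minimal sketch of how these slots can be inspected, the example below assumes that MicrobiotaProcess and SummarizedExperiment are installed and uses the mouse.time.mpse example dataset bundled with MicrobiotaProcess (the choice of dataset is an assumption; it is not mentioned above). The inherited slots are read with the SummarizedExperiment accessors, and the taxonomy tree with mp_extract_tree.

```r
library(MicrobiotaProcess)
library(SummarizedExperiment)

# Example MPSE object bundled with MicrobiotaProcess (assumed to be available).
data(mouse.time.mpse)
mpse <- mouse.time.mpse

# Slots inherited from SummarizedExperiment.
abundance    <- assay(mpse, 1)   # feature-by-sample abundance matrix
sample_meta  <- colData(mpse)    # sample metadata
feature_meta <- rowData(mpse)    # feature metadata

# MPSE-specific slot: the taxonomy tree, returned as a treedata object.
taxa_tree <- mp_extract_tree(mpse, type = "taxatree")

# Check whether the tip labels of the taxonomy tree are among the assay rows.
tips <- taxa_tree@phylo$tip.label
all(tips %in% rownames(abundance))
```

The final line is only a consistency check on the relationship described in the taxatree bullet; it does not modify the object.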
2. Overview of the design of the MicrobiotaProcess package
With this data structure, MicrobiotaProcess is more interoperable with the existing computing ecosystem. For example, the slots inherited from SummarizedExperiment can be extracted via the methods provided by SummarizedExperiment. The taxatree and otutree can also be extracted via mp_extract_tree, and they are compatible with the ggtree (Yu et al. 2017), ggtreeExtra (Xu et al. 2021), treeio (Wang et al. 2020) and tidytree (Yu 2021) ecosystem because they are treedata objects, a data structure used directly by these packages.
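The tree extraction described above can be sketched as follows, again assuming the bundled mouse.time.mpse example dataset and an installed ggtree. Because mp_extract_tree returns a treedata object, it can be passed to ggtree without conversion; the layout and styling choices here are purely illustrative.

```r
library(MicrobiotaProcess)
library(ggtree)

# Example MPSE object bundled with MicrobiotaProcess (assumed to be available).
data(mouse.time.mpse)

# Extract the taxonomy tree as a treedata object.
taxa_tree <- mp_extract_tree(mouse.time.mpse, type = "taxatree")

# treedata is consumed directly by ggtree, so no conversion step is needed.
p <- ggtree(taxa_tree, layout = "circular", size = 0.3) +
  geom_tiplab(size = 1.5)

# Printing the object renders the tree on the active graphics device.
p
```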
Moreover, the results of upstream microbiome analysis produced by tools such as qiime2 (Bolyen et al. 2019), dada2 (Callahan et al. 2016) and MetaPhlAn (Beghini et al. 2021), or stored in other classes used for microbiome data (SummarizedExperiment (Morgan et al. 2021), phyloseq (McMurdie and Holmes 2013) and TreeSummarizedExperiment (Huang et al. 2021)), can be loaded into or converted to the MPSE class.
MicrobiotaProcess also introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome analysis procedures under a unified, tidy-like framework. We believe MicrobiotaProcess can improve the efficiency of related research, and it also bridges microbiome data analysis with the tidyverse (Wickham et al. 2019).
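To illustrate what a tidy-like workflow can look like, here is a short sketch under a few assumptions: it uses the bundled mouse.time.mpse example dataset, the mp_rrarefy, mp_cal_alpha and mp_extract_sample functions from MicrobiotaProcess (none of which are named above), and the native R pipe, which requires R 4.1 or later. It is one possible chain of steps, not a recommended analysis.

```r
library(MicrobiotaProcess)

# Example MPSE object bundled with MicrobiotaProcess (assumed to be available).
data(mouse.time.mpse)

# Rarefy the abundance, then compute alpha-diversity indices on the
# rarefied counts. Each step takes an MPSE object and returns an MPSE object.
mpse <- mouse.time.mpse |>
  mp_rrarefy() |>
  mp_cal_alpha(.abundance = RareAbundance)

# Sample-level results are kept alongside the sample metadata and can be
# pulled out as a tibble for further work with tidyverse tools.
sample_tbl <- mp_extract_sample(mpse)
sample_tbl
```

Because every step returns the same kind of object, further analysis functions can be appended to the chain in the same way.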