PhyloProfile: dynamic visualization and exploration of multi-layered phylogenetic profiles
PhyloProfile 1.21.5
Phylogenetic profiles capture the presence-absence pattern of genes across species (Pellegrini et al., 1999). The presence of an ortholog in a given species is often taken as evidence that the corresponding function is also represented there (Lee et al., 2007). Moreover, if two genes agree in their phylogenetic profiles, this can suggest that they interact functionally (Pellegrini et al., 1999). Phylogenetic profiles are therefore commonly used for tracing functional protein clusters or metabolic networks across species and through time. However, orthology inference is not error-free (Altenhoff et al., 2016), and orthology does not guarantee that two genes are functionally equivalent (Studer and Robinson-Rechavi, 2009). For this reason, phylogenetic profiles are often integrated with accessory information layers, such as sequence similarity, domain architecture similarity, or semantic similarity of Gene Ontology term descriptions.
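To make the idea of agreeing profiles concrete, the following minimal sketch builds a toy presence-absence matrix in base R and compares the profiles pairwise. The gene and species names are invented for illustration, and the Jaccard distance used here is only one possible measure of profile agreement, not necessarily the one PhyloProfile applies.

```r
# Toy presence-absence matrix: rows are genes, columns are species.
# All gene and species names are hypothetical.
profiles <- matrix(
    c(1, 1, 0, 1, 0,
      1, 1, 0, 1, 0,
      0, 1, 1, 0, 1),
    nrow = 3, byrow = TRUE,
    dimnames = list(
        c("geneA", "geneB", "geneC"),
        c("sp1", "sp2", "sp3", "sp4", "sp5")
    )
)

# method = "binary" gives the Jaccard distance between binary vectors:
# 0 means identical presence-absence patterns, values close to 1 mean
# the two genes are rarely present in the same species.
profileDist <- dist(profiles, method = "binary")
print(profileDist)
```

In this toy data, geneA and geneB share an identical profile and would be candidates for a functional link, whereas geneC follows a largely different pattern.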
Various approaches exist to visualize such profiles. However, there is still a shortage of tools that provide a comprehensive set of functions for the display, filtering and analysis of multi-layered phylogenetic profiles comprising hundreds of genes and taxa. To close this methodological gap, we present here PhyloProfile, an R-based tool to visualize, explore and analyze multi-layered phylogenetic profiles.
To install the PhyloProfile package via Bioconductor using BiocManager:
if (!requireNamespace("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("PhyloProfile")
To install the development version from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version='devel')
BiocManager::install("PhyloProfile")
To install the development version from GitHub:
if (!requireNamespace("devtools"))
    install.packages("devtools", repos = "http://cran.us.r-project.org")
devtools::install_github(
    "BIONF/PhyloProfile",
    INSTALL_opts = c('--no-lock'),
    build_vignettes = TRUE
)
Alternatively, use the online version directly at http://applbio.biologie.uni-frankfurt.de/phyloprofile/.
PhyloProfile expects as a main input the phylogenetic distribution of orthologs, or more generally of homologs. This information can be complemented with domain architecture annotation and data for up to two additional annotation layers.
Besides tab-delimited text and sequences in FASTA format, the tool also accepts orthoXML (Schmitt et al., 2011) or a list of OMA IDs (Altenhoff et al., 2015) as input.
Here is an example of a tab-delimited input with two additional annotation layers:
geneID | ncbiID | orthoID | FAS_F | FAS_B |
---|---|---|---|---|
100136at6656 | ncbi36329 | 100136at6656\|PLAF7@36329@1\|Q8ILT8\|1 | 0.9875289 | 0.8427314 |
100136at6656 | ncbi319348 | 100136at6656\|POLVAN@319348@0\|319348_0:004132\|1 | 1.0000000 | 1.0000000 |
100136at6656 | ncbi208964 | 100136at6656\|PSEAE@208964@1\|Q9I5U5\|1 | 0.9971027 | 0.9971027 |
100136at6656 | ncbi418459 | 100136at6656\|PUCGT@418459@1\|E3KFA2\|1 | 0.9895679 | 0.8232540 |
100136at6656 | ncbi10116 | 100136at6656\|RAT@10116@1\|G3V7R8\|1 | 0.9996617 | 0.8541265 |
100136at6656 | ncbi284812 | 100136at6656\|SCHPO@284812@1\|Q9USU2\|1 | 0.9994874 | 0.9994874 |
100136at6656 | ncbi35128 | 100136at6656\|THAPS@35128@1\|B8C2N6\|1 | 0.9852370 | 0.7002961 |
100136at6656 | ncbi7070 | 100136at6656\|TRICA@7070@1\|D6X457\|1 | 1.0000000 | 1.0000000 |
100136at6656 | ncbi237631 | 100136at6656\|USTMA@237631@1\|A0A0D1C927\|1 | 0.9912998 | 0.6172244 |
100136at6656 | ncbi559292 | 100136at6656\|YEAST@559292@1\|P41819\|1 | 0.9978912 | 0.9978912 |
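Before loading such a file into PhyloProfile, it can be useful to inspect it with base R. The sketch below is only an illustration under stated assumptions: it embeds three rows of the example table as tab-delimited text instead of reading a file, the cutoff of 0.85 is arbitrary, and the variable names (`inputText`, `profile`, `filtered`, `taxonIDs`) are hypothetical. It does not use any PhyloProfile function.

```r
# Three rows taken from the example table, written as tab-delimited text.
# With a real file, read.delim("path/to/file") would be used instead.
inputText <- paste(
    "geneID\tncbiID\torthoID\tFAS_F\tFAS_B",
    "100136at6656\tncbi36329\t100136at6656|PLAF7@36329@1|Q8ILT8|1\t0.9875289\t0.8427314",
    "100136at6656\tncbi10116\t100136at6656|RAT@10116@1|G3V7R8|1\t0.9996617\t0.8541265",
    "100136at6656\tncbi7070\t100136at6656|TRICA@7070@1|D6X457|1\t1.0000000\t1.0000000",
    sep = "\n"
)

# read.delim splits on tabs and treats the first line as the header.
profile <- read.delim(text = inputText, stringsAsFactors = FALSE)
str(profile)

# Strip the "ncbi" prefix to obtain the numeric NCBI taxonomy IDs.
taxonIDs <- as.integer(sub("^ncbi", "", profile$ncbiID))

# Keep only orthologs whose second annotation layer reaches the cutoff.
# The value 0.85 is an arbitrary choice for this illustration.
filtered <- profile[profile$FAS_B >= 0.85, ]
print(filtered[, c("ncbiID", "FAS_F", "FAS_B")])
```

Each row describes one ortholog of a gene in one taxon, so a gene that has orthologs in many species appears on many rows, and the two trailing columns carry the additional annotation layers.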
The wiki accompanying PhyloProfile provides a comprehensive guide on how to format input data.
Together with several functions for exploring phylogenetic profiles, we provide an interactive visualization application implemented with Shiny (https://CRAN.R-project.org/package=shiny).