preciseTAD

DOI: 10.18129/B9.bioc.preciseTAD    

preciseTAD: A machine learning framework for precise TAD boundary prediction

Bioconductor version: Release (3.12)

preciseTAD provides functions to predict the location of boundaries of topologically associated domains (TADs) and chromatin loops at base-level resolution. As input, it takes BED-formatted genomic coordinates of domain boundaries detected from low-resolution Hi-C data, and coordinates of high-resolution genomic annotations from ENCODE or other consortia. preciseTAD employs several feature engineering strategies, uses resampling techniques to address class imbalance, and trains an optimized random forest model to predict low-resolution domain boundaries. Applied at base-level resolution, the trained model predicts the probability that each base is a boundary. Density-based clustering and scalable partitioning techniques are then used to detect precise boundary regions and summit points. Compared with low-resolution boundaries, preciseTAD boundaries are highly enriched for CTCF, RAD21, SMC3, and ZNF143 signals and are more conserved across cell lines. The pre-trained model can accurately predict boundaries in another cell line using CTCF, RAD21, SMC3, and ZNF143 annotation data from that cell line.
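
The last step of the description above (group high-probability bases into boundary regions, then pick one summit per region) can be sketched in base R. This is a toy illustration only, not preciseTAD's implementation: the probabilities are made up, `threshold` and `eps` are hypothetical parameters, and a simple gap-based 1-D grouping stands in for the density-based clustering, with the medoid of each group taken as its summit.

```r
# Toy per-base boundary probabilities (made-up values, not real predictions)
pos  <- 1:200
prob <- rep(0.05, length(pos))
prob[40:48]   <- 0.95
prob[150:155] <- 0.90

threshold <- 0.5  # hypothetical probability cutoff
eps       <- 5L   # hypothetical maximum gap (in bases) within one region

# Keep only bases whose predicted probability passes the cutoff
hits <- pos[prob >= threshold]

# Gap-based grouping: start a new region wherever consecutive hits
# are more than `eps` bases apart (a stand-in for density-based clustering)
region_id <- cumsum(c(TRUE, diff(hits) > eps))
regions   <- split(hits, region_id)

# Medoid of a 1-D group: the member with the smallest total distance
# to all other members (ties resolved in favour of the first member)
medoid <- function(x) x[which.min(rowSums(abs(outer(x, x, "-"))))]

boundaries <- data.frame(
  start  = vapply(regions, min, integer(1)),
  end    = vapply(regions, max, integer(1)),
  summit = vapply(regions, medoid, integer(1))
)

print(boundaries)
```

Each row of `boundaries` is one candidate boundary region with a single representative base. In a real analysis the probabilities would come from the trained random forest, and the clustering parameters would need tuning for the data at hand.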

Author: Spiro Stilianoudakis [aut, cre], Mikhail Dozmorov [aut]

Maintainer: Spiro Stilianoudakis <stilianoudasc at vcu.edu>

Citation (from within R, enter citation("preciseTAD")):

Installation

To install this package, start R (version "4.0") and enter:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("preciseTAD")

For older versions of R, please refer to the appropriate Bioconductor release.

Documentation

To view documentation for the version of this package installed in your system, start R and enter:

browseVignettes("preciseTAD")

 

HTML R Script preciseTAD
PDF   Reference Manual
Text   NEWS
Text   LICENSE

Details

biocViews: Classification, Clustering, FeatureExtraction, FunctionalGenomics, HiC, Sequencing, Software
Version: 1.0.0
In Bioconductor since: BioC 3.12 (R-4.0) (< 6 months)
License: MIT + file LICENSE
Depends: R (>= 4.0.0)
Imports: S4Vectors, IRanges, GenomicRanges, randomForest, ModelMetrics, e1071, PRROC, pROC, caret, DMwR, utils, cluster, dbscan, doSNOW, foreach, pbapply, stats, parallel
LinkingTo:
Suggests: knitr, rmarkdown, testthat, BiocCheck, BiocManager, BiocStyle
SystemRequirements:
Enhances:
URL: https://github.com/dozmorovlab/preciseTAD
BugReports: https://github.com/dozmorovlab/preciseTAD/issues
Depends On Me:
Imports Me:
Suggests Me:
Links To Me:
Build Report:

Package Archives

Follow Installation instructions to use this package in your R session.

Source Package: preciseTAD_1.0.0.tar.gz
Windows Binary: preciseTAD_1.0.0.zip
macOS 10.13 (High Sierra): preciseTAD_1.0.0.tgz
Source Repository: git clone https://git.bioconductor.org/packages/preciseTAD
Source Repository (Developer Access): git clone git@git.bioconductor.org:packages/preciseTAD
Package Short Url: https://bioconductor.org/packages/preciseTAD/
Package Downloads Report: Download Stats
