1 Overview

Recurrent patterns in biological networks may reflect critical roles in multiple biological processes (Bracken, Scott, and Goodall 2016); one example is the regulatory loops between transcription factors and microRNAs (Zhang et al. 2015). RTNduals searches for regulatory patterns between pairs of regulators, using regulatory networks generated by the RTN package (Castro et al. 2016; for details, please refer to the RTN documentation). In such a network, each regulator has an associated set of target genes (i.e. a regulon). Assessing the targets shared by a pair of regulators yields triplets, each formed by two regulators and one shared target, in which the target may be regulated in a positive or negative direction, with the regulators either cooperating or competing in the regulatory network.

The inference of dual regulons requires three complementary statistics: (1) Targets are assigned to regulons based on mutual information (MI) between the regulator and the target. The significance of the MI statistics is assessed by permutation and bootstrap analysis. (2) Shared targets between any two regulons are identified and the similarity in regulation (i.e. positive or negative direction) is assessed by correlation analysis. (3) A test is carried out to determine if the number of shared targets is higher than expected by chance.

The schematics in Figure 1 show triplets formed between regulators. In (a) the two regulators cooperate by influencing shared targets in the same direction (co-activation or co-repression), while in (b) they compete, influencing targets in opposite directions. For gene expression data, typical regulators might include transcription factors, miRNAs, eRNAs and lncRNAs.

Figure 1. Examples of regulators and predicted associations. This figure illustrates four triplets formed between regulators. (a) Regulators R1 and R2 co-activate or co-repress shared targets. (b) Regulators R1 and R2 compete, influencing targets in opposite directions.

2 Quick Start

The RTNduals workflow starts with a preprocessing step that generates an MBR-class (Motifs Between Regulons) object from an expression matrix and a list of regulators. The expression matrix is typically obtained from multiple samples (e.g. transcriptomes from a cancer cohort), while the list of regulators represents prior biological information indicating which genes in the expression matrix should be regarded as regulators. The input can also combine different classes of regulators, for example genes and microRNAs; in this case, the expression matrix should comprise both mRNA and miRNA expression values. Alternatively, the MBR-class object can be obtained from a TNI-class object pre-computed in the RTN package.

2.1 Load datasets

This example provides the data required to generate an MBR-class object. The dataset dt4rtn is available from the RTN package and consists of an R list with 6 objects, 3 of which will be used in the subsequent analysis: (1) gexp, a named gene expression matrix with 120 samples (genes in rows, samples in columns), (2) gexpIDs, a data.frame with Probe-to-ENTREZ annotation, and (3) tfs, a named vector listing 148 transcription factors. These datasets were extracted, pre-processed and size-reduced from Fletcher et al. (2013), and should be regarded as examples for demonstration purposes only.

##--- load package and dataset for demonstration
library(RTNduals)
data("dt4rtn", package = "RTN")
gexp <- dt4rtn$gexp
annot <- dt4rtn$gexpIDs
tfs <- dt4rtn$tfs[c("IRF8","IRF1","PRDM1","E2F3","STAT4","LMO4","ZNF552")]

2.2 Preprocessing

The gexp data matrix and the corresponding annotation are evaluated by the mbrPreprocess function, which checks the consistency of the input data. This step generates a pre-processed MBR-class object whose status is updated to ‘Preprocess [x]’.

##--- generate a pre-processed MBR-class object
rmbr <- mbrPreprocess(gexp=gexp, regulatoryElements=tfs, rowAnnotation=annot)
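
The kind of consistency check carried out at this step can be pictured with a toy example. The sketch below is not the mbrPreprocess implementation; it only builds hypothetical inputs with the same general shape as gexp, annot and tfs (the names toy_gexp, toy_annot, toy_regs and the columns PROBEID and SYMBOL are assumptions made for illustration) and verifies that they line up.

```r
## Toy inputs mimicking the shape of gexp, annot and tfs (all names hypothetical)
set.seed(1)
toy_gexp <- matrix(rnorm(5 * 4), nrow = 5,
                   dimnames = list(paste0("probe", 1:5), paste0("sample", 1:4)))
toy_annot <- data.frame(PROBEID = rownames(toy_gexp),
                        SYMBOL = c("G1", "G2", "G3", "G4", "G5"),
                        row.names = rownames(toy_gexp),
                        stringsAsFactors = FALSE)
## Assumed layout: values are row identifiers, names are gene symbols
toy_regs <- c(G1 = "probe1", G3 = "probe3")

## Every row of the expression matrix should be annotated,
## and every regulator should be present in the matrix
rows_annotated <- all(rownames(toy_gexp) %in% rownames(toy_annot))
regs_present <- all(toy_regs %in% rownames(toy_gexp))
stopifnot(rows_annotated, regs_present)
```

If either condition fails, stopifnot raises an error, which is a reasonable signal to fix the annotation before building the network.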

2.3 Run permutation analysis

The mbrPermutation method inherits the same algorithm implemented in the RTN package. This function takes the pre-processed MBR-class object and returns a regulatory network inferred by mutual information analysis (with multiple hypothesis testing corrections). The results are included in the ‘TNI’ slot, which will be used in the subsequent steps of the pipeline.

##--- compute a regulatory network
##--- (nPermutations=100 is for demonstration; set nPermutations>=1000)
rmbr <- mbrPermutation(rmbr, nPermutations=100, pValueCutoff=0.05)
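
The principle behind a permutation test on mutual information can be sketched in base R. This is a simplified illustration on simulated data, not the algorithm implemented in RTN: the binned MI estimator (mi_binned, a hypothetical helper), the number of bins and the simulated regulator and target are all assumptions, and no multiple hypothesis testing correction is applied.

```r
## Mutual information (in nats) between two vectors after equal-width binning
## (hypothetical helper; RTN uses its own MI estimator)
mi_binned <- function(x, y, bins = 5) {
  joint <- unclass(table(cut(x, bins), cut(y, bins))) / length(x)
  expected <- outer(rowSums(joint), colSums(joint))
  nz <- joint > 0
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

## Simulated regulator and target measured in 120 samples
set.seed(42)
n <- 120
regulator <- rnorm(n)
target <- 0.8 * regulator + rnorm(n, sd = 0.6)

## Observed MI versus a null distribution obtained by shuffling the target
observed <- mi_binned(regulator, target)
null_mi <- replicate(1000, mi_binned(regulator, sample(target)))

## Empirical p-value with the usual +1 correction
p_value <- (1 + sum(null_mi >= observed)) / (1 + length(null_mi))
p_value
```

Shuffling the target breaks any dependence on the regulator while preserving both marginal distributions, so the null values show how much MI arises by chance at this sample size.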

2.4 Run bootstrap analysis

In addition to the permutation analysis, the stability of the regulatory network is assessed by bootstrapping using the mbrBootstrap function, which also inherits the same algorithm from the RTN package. The ‘TNI’ slot of the MBR-class object is updated with a consensus bootstrap network.

##--- check stability of the regulatory network
##--- (nBootstrap=10 is for demonstration; set nBootstrap>=100)
rmbr <- mbrBootstrap(rmbr, nBootstrap=10)
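
The idea of a consensus bootstrap can be illustrated with a small simulation. The sketch below is an assumption-laden stand-in rather than the RTN algorithm: it uses the absolute Pearson correlation in place of mutual information, a hypothetical cutoff (threshold), and simulated data with one linked and one unlinked target.

```r
## Simulated regulator with one dependent and one independent target
set.seed(7)
n <- 120
regulator <- rnorm(n)
targets <- cbind(linked = 0.8 * regulator + rnorm(n, sd = 0.6),
                 unlinked = rnorm(n))

threshold <- 0.3   # hypothetical cutoff on |correlation| (stand-in for an MI cutoff)
n_boot <- 200

## Resample the samples with replacement and re-test each edge
hits <- replicate(n_boot, {
  idx <- sample(n, replace = TRUE)
  as.vector(abs(cor(regulator[idx], targets[idx, ])) >= threshold)
})

## Fraction of bootstrap replicates supporting each regulator-target edge
support <- rowMeans(hits)
names(support) <- colnames(targets)
support
```

An edge would then be kept in the consensus network only if its support exceeds a chosen fraction of the replicates; the exact consensus rule used by RTN is described in its documentation.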

2.5 Run DPI filter

In a given regulatory network each target can be linked to multiple regulators as a result of both direct and indirect interactions. The Data Processing Inequality (DPI) algorithm (Meyer, Lafitte, and Bontempi 2008) is used to remove the weakest interaction between two regulators and a common target. This step also inherits the algorithm that is implemented in the RTN package.

##--- apply DPI algorithm
rmbr <- mbrDpiFilter(rmbr)
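
The effect of the DPI step on a single triplet can be sketched as follows. This is a minimal base R illustration with invented MI values; dpi_filter is a hypothetical helper, and the implementation in RTN may differ (for example, by applying a tolerance when comparing edges, which is not modelled here).

```r
## Toy symmetric MI matrix for two regulators and one shared target (values invented)
mi <- matrix(c(0.0, 0.6, 0.5,
               0.6, 0.0, 0.1,
               0.5, 0.1, 0.0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("R1", "R2", "T1"), c("R1", "R2", "T1")))

## For every fully connected triplet, drop the edge with the lowest MI
dpi_filter <- function(mi) {
  out <- mi
  for (trio in combn(rownames(mi), 3, simplify = FALSE)) {
    edges <- combn(trio, 2)                    # 2 x 3 matrix of node pairs
    w <- mi[cbind(edges[1, ], edges[2, ])]     # MI of the three edges
    if (all(w > 0)) {
      weakest <- edges[, which.min(w)]
      out[weakest[1], weakest[2]] <- 0
      out[weakest[2], weakest[1]] <- 0
    }
  }
  out
}

filtered <- dpi_filter(mi)
filtered
```

In this toy case the R2 to T1 edge has the lowest MI and is removed, so T1 remains a target of R1 only. This is consistent with the drop in the number of edges between tnet.ref and tnet.dpi reported in the summary.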

2.6 Run association analysis between regulons

The mbrAssociation method takes the transcriptional network computed in the previous steps and enumerates all triplets formed by two regulators and one shared target. The method retrieves the mutual information between regulators and assesses the agreement between the predicted downstream effects using correlation analysis. A Fisher’s exact test is used to evaluate whether the number of shared targets is greater than expected by chance.

##--- test associations for dual regulons
rmbr <- mbrAssociation(rmbr)
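
One way to picture the agreement between downstream effects is to correlate, across the shared targets, the effect each regulator has on each target. The sketch below uses simulated data and is only an interpretation of the idea; it should not be read as the exact computation performed by mbrAssociation, and all object names are hypothetical.

```r
## Two co-expressed regulators measured in 120 samples (simulated)
set.seed(3)
n <- 120
r1 <- rnorm(n)
r2 <- 0.7 * r1 + rnorm(n, sd = 0.7)

## 30 shared targets: half follow the regulators, half oppose them
signs <- rep(c(1, -1), each = 15)
shared <- sapply(signs, function(s) s * (r1 + r2) / 2 + rnorm(n, sd = 0.8))

## Effect of each regulator on each shared target (Pearson correlation)
eff1 <- as.vector(cor(r1, shared))
eff2 <- as.vector(cor(r2, shared))

## Agreement between the two regulons across their shared targets
agreement <- cor(eff1, eff2)
agreement
```

A strongly positive value indicates regulators that push shared targets in the same direction (co-activation or co-repression), whereas a negative value would indicate competing regulators.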

A summary of the results can be accessed from ‘rmbr’ using the mbrGet function.

##--- check summary
mbrGet(rmbr, what="summary")
## $MBR
## $MBR$Duals
##       Tested Predicted
## Duals     21        11
## 
## 
## $TNI
## $TNI$tnet
##          Regulators Targets Edges
## tnet.ref          7    2345  6931
## tnet.dpi          7    2345  3009
## 
## $TNI$regulonSize
##          Min. 1st Qu. Median     Mean 3rd Qu. Max.
## tnet.ref  823     940    991 990.1429  1054.5 1128
## tnet.dpi  255     328    389 429.8571   530.5  648
##--- get results
mbrGet(rmbr, what="dualsOverlap")
##              Regulon1 Regulon2 Universe.Size Regulon1.Size Regulon2.Size
## IRF8~PRDM1       IRF8    PRDM1          1227           371           389
## IRF1~STAT4       IRF1    STAT4          1072           285           255
## IRF8~STAT4       IRF8    STAT4          1098           371           255
## PRDM1~STAT4     PRDM1    STAT4          1205           389           255
## IRF1~IRF8        IRF1     IRF8          1151           285           371
## IRF1~PRDM1       IRF1    PRDM1          1223           285           389
## STAT4~ZNF552    STAT4   ZNF552          1495           255           420
## IRF1~ZNF552      IRF1   ZNF552          1479           285           420
## IRF8~ZNF552      IRF8   ZNF552          1557           371           420
## E2F3~STAT4       E2F3    STAT4          1476           648           255
## PRDM1~ZNF552    PRDM1   ZNF552          1591           389           420
##              Expected.Overlap Observed.Overlap        Pvalue Adjusted.Pvalue
## IRF8~PRDM1          117.61940              281 1.836720e-104   3.857113e-103
## IRF1~STAT4           67.79384              191  2.645708e-82    5.555986e-81
## IRF8~STAT4           86.16120              211  6.425618e-78    1.349380e-76
## PRDM1~STAT4          82.31950              196  8.276578e-63    1.738081e-61
## IRF1~IRF8            91.86360              205  8.515594e-59    1.788275e-57
## IRF1~PRDM1           90.65004              169  1.731888e-28    3.636966e-27
## STAT4~ZNF552         71.63880              129  3.431281e-17    7.205689e-16
## IRF1~ZNF552          80.93306              132  4.200483e-13    8.821014e-12
## IRF8~ZNF552         100.07707              149  1.252985e-10    2.631268e-09
## E2F3~STAT4          111.95122              152  2.256028e-08    4.737660e-07
## PRDM1~ZNF552        102.69013              140  8.992028e-07    1.888326e-05
mbrGet(rmbr, what="dualsCorrelation")
##              Regulon1 Regulon2 MI.Regulators R.Regulons       Pvalue
## IRF8~STAT4       IRF8    STAT4    0.62896445  0.9192001 2.263731e-62
## IRF8~PRDM1       IRF8    PRDM1    0.31474525  0.7618021 7.173278e-46
## PRDM1~STAT4     PRDM1    STAT4    0.23222278  0.6853671 6.827543e-34
## IRF1~STAT4       IRF1    STAT4    0.31741440  0.6943596 5.813017e-33
## IRF1~IRF8        IRF1     IRF8    0.26890696  0.5302156 6.030964e-27
## IRF8~ZNF552      IRF8   ZNF552    0.05045733 -0.5835262 1.825772e-19
## IRF1~ZNF552      IRF1   ZNF552    0.04623752 -0.5833119 5.509123e-16
## STAT4~ZNF552    STAT4   ZNF552    0.07799377 -0.5362770 1.729396e-15
## E2F3~STAT4       E2F3    STAT4    0.07693758  0.4399875 8.853647e-13
## IRF1~PRDM1       IRF1    PRDM1    0.13397102  0.4010035 3.506285e-10
## PRDM1~ZNF552    PRDM1   ZNF552    0.06513500 -0.3890525 1.234646e-08
##              Adjusted.Pvalue
## IRF8~STAT4      4.753836e-61
## IRF8~PRDM1      1.506388e-44
## PRDM1~STAT4     1.433784e-32
## IRF1~STAT4      1.220734e-31
## IRF1~IRF8       1.266503e-25
## IRF8~ZNF552     3.834122e-18
## IRF1~ZNF552     1.156916e-14
## STAT4~ZNF552    3.631733e-14
## E2F3~STAT4      1.859266e-11
## IRF1~PRDM1      7.363199e-09
## PRDM1~ZNF552    2.592757e-07

Also, when prior evidence is available, this method can add a ‘supplementaryTable’ regarding the association between regulators. The ‘supplementaryTable’ is a ‘data.frame’ listing unique relationships between any two regulators (please refer to the documentation for details on the input data format).
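
The overlap statistics reported by "dualsOverlap" can be checked by hand for a single pair. The sketch below takes the sizes listed for IRF8~PRDM1 and runs a one-sided Fisher's exact test with the base R function fisher.test. The one-sided alternative is an assumption based on the description of the test (more shared targets than expected by chance), so the resulting p-value is not guaranteed to match the table exactly.

```r
## Values copied from the IRF8~PRDM1 row of the dualsOverlap table
universe <- 1227
size1 <- 371      # Regulon1.Size
size2 <- 389      # Regulon2.Size
observed <- 281   # Observed.Overlap

## Overlap expected if the two regulons were drawn independently
expected <- size1 * size2 / universe   # approx. 117.62, as in Expected.Overlap

## 2 x 2 contingency table: membership in regulon 1 vs membership in regulon 2
contingency <- matrix(c(observed, size1 - observed,
                        size2 - observed, universe - size1 - size2 + observed),
                      nrow = 2,
                      dimnames = list(regulon1 = c("in", "out"),
                                      regulon2 = c("in", "out")))

## Assumed alternative: overlap greater than expected by chance
test <- fisher.test(contingency, alternative = "greater")
test$p.value
```

The expected overlap is simply the product of the two regulon sizes divided by the universe size, which makes the Expected.Overlap column easy to verify for any row.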

3 Session information

## R version 3.5.0 Patched (2018-05-03 r74699)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
## 
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.8-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.8-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] RTNduals_1.5.1  RTN_2.5.1       BiocStyle_2.9.3
## 
## loaded via a namespace (and not attached):
##  [1] zip_1.0.0           Rcpp_0.12.17        cellranger_1.1.0   
##  [4] compiler_3.5.0      pillar_1.2.3        forcats_0.3.0      
##  [7] tools_3.5.0         minet_3.39.0        digest_0.6.15      
## [10] evaluate_0.10.1     tibble_1.4.2        pkgconfig_2.0.1    
## [13] rlang_0.2.1         openxlsx_4.1.0      igraph_1.2.1       
## [16] curl_3.2            yaml_2.1.19         parallel_3.5.0     
## [19] haven_1.1.1         xfun_0.1            rio_0.5.10         
## [22] stringr_1.3.1       knitr_1.20          S4Vectors_0.19.12  
## [25] IRanges_2.15.14     stats4_3.5.0        rprojroot_1.3-2    
## [28] data.table_1.11.4   snow_0.4-2          readxl_1.1.0       
## [31] foreign_0.8-70      rmarkdown_1.10      bookdown_0.7       
## [34] carData_3.0-1       limma_3.37.1        RedeR_1.29.0       
## [37] car_3.0-0           magrittr_1.5        backports_1.1.2    
## [40] htmltools_0.3.6     BiocGenerics_0.27.0 abind_1.4-5        
## [43] stringi_1.2.3

References

Bracken, Cameron P., Hamish S. Scott, and Gregory J. Goodall. 2016. “A Network-Biology Perspective of microRNA Function and Dysfunction in Cancer.” Nature Reviews Genetics 17 (12):719–32. http://dx.doi.org/10.1038/nrg.2016.134.

Castro, Mauro, Ines de Santiago, Thomas Campbell, Courtney Vaughn, Theresa Hickey, Edith Ross, Wayne Tilley, Florian Markowetz, Bruce Ponder, and Kerstin Meyer. 2016. “Regulators of Genetic Risk of Breast Cancer Identified by Integrative Network Analysis.” Nature Genetics 48 (1):12–21. https://doi.org/10.1038/ng.3458.

Fletcher, Michael, Mauro Castro, Suet-Feung Chin, Oscar Rueda, Xin Wang, Carlos Caldas, Bruce Ponder, Florian Markowetz, and Kerstin Meyer. 2013. “Master Regulators of FGFR2 Signalling and Breast Cancer Risk.” Nature Communications 4:2464. https://doi.org/10.1038/ncomms3464.

Meyer, Patrick, Frederic Lafitte, and Gianluca Bontempi. 2008. “Minet: A R/Bioconductor Package for Inferring Large Transcriptional Networks Using Mutual Information.” BMC Bioinformatics 9 (1):461. https://doi.org/10.1186/1471-2105-9-461.

Zhang, Hong-Mei, Shuzhen Kuang, Xushen Xiong, Tianliuyun Gao, Chenglin Liu, and An-Yuan Guo. 2015. “Transcription Factor and microRNA Co-Regulatory Loops: Important Regulatory Motifs in Biological Processes and Diseases.” Briefings in Bioinformatics 16 (1):45–58. https://doi.org/10.1093/bib/bbt085.