1 Introduction

In this vignette we assume that the reader is already familiarized with the other vignette. But want to know how can it help to answer other questions in other situations

We follow the same convention about names of the objects used genesReact and genesKegg are the list with information about the pathways the genes are involved in.

2 Merging similarities

If one calculates similarities with KEGG data and Reactome or other input for the same genes or clusters BioCor provides a couple of functions to merge them.

We can set a weight to each similarity input with weighted.sum, multiply them also using a weight for each similarity (with weighted.prod), doing the mean or just adding them up. Similarities allow us to apply a function to combine the matrices of a list. Here we use some of the genes used in the first vignette:

kegg <- mgeneSim(c("672", "675", "10"), genesKegg)
react <- mgeneSim(c("672", "675", "10"), genesReact)
## We can sum it adding a weight to each origin
weighted.sum(c(kegg["672", "675"], react["672","675"]), w = c(0.3, 0.7))
## [1] 0.7247289

## Or if we want to perform for all the matrix
## A list of matrices to merge
sim <- list("kegg" = kegg, "react" = react)
similarities(sim, weighted.sum, w = c(0.3, 0.7))
##           672        675         10
## 672 1.0000000 0.72472885 0.09903930
## 675 0.7247289 1.00000000 0.05156284
## 10  0.0990393 0.05156284 1.00000000
similarities(sim, weighted.prod, w = c(0.3, 0.7))
##           672          675           10
## 672 0.2100000 0.0173101952 0.0000000000
## 675 0.0173102 0.2100000000 0.0001213771
## 10  0.0000000 0.0001213771 0.2100000000
similarities(sim, prod)
##           672          675           10
## 672 1.0000000 0.0824295011 0.0000000000
## 675 0.0824295 1.0000000000 0.0005779864
## 10  0.0000000 0.0005779864 1.0000000000
similarities(sim, mean)
##            672        675         10
## 672 1.00000000 0.54121475 0.07074236
## 675 0.54121475 1.00000000 0.03918538
## 10  0.07074236 0.03918538 1.00000000

This functions are similar to weighted.mean, except that first the multiplication by the weights is done and then the NAs are removed:

weighted.mean(c(1, NA), w = c(0.5, 0.5), na.rm = TRUE)
## [1] 1
weighted.mean(c(1, 0.5, NA), w = c(0.5, 0.25, 0.25), na.rm = TRUE)
## [1] 0.8333333
weighted.sum(c(1, NA), w = c(0.5, 0.5))
## [1] 0.5
weighted.sum(c(1, 0.5, NA), w = c(0.5, 0.25, 0.25))
## [1] 0.625
weighted.prod(c(1, NA), w = c(0.5, 0.5))
## [1] 0.5
weighted.prod(c(1, 0.5, NA), w = c(0.5, 0.25, 0.25))
## [1] 0.0625

3 Assesing a differential study

In this example we will use the functional similarities in a classical differential study.

3.1 Obtaining data

We start using data from the RNAseq workflow and following the analysis:

suppressPackageStartupMessages(library("airway"))
data("airway")
library("DESeq2")

dds <- DESeqDataSet(airway, design = ~ cell + dex)
dds$dex <- relevel(dds$dex, "untrt")
dds <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
res <- results(dds, alpha = 0.05)
summary(res)
## 
## out of 33469 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)     : 2211, 6.6% 
## LFC < 0 (down)   : 1817, 5.4% 
## outliers [1]     : 0, 0% 
## low counts [2]   : 16676, 50% 
## (mean count < 7)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
plot(res$log2FoldChange, -log10(res$padj), pch = 16, xlab = "log2FC", 
     ylab = "-log10(p.ajd)", main = "Untreated vs treated")
logFC <- 2.5
abline(v=c(-logFC, logFC), h = -log10(0.05), col = "red")