diffuStats 1.10.0

`diffuStats` is an R package providing several scores
for diffusion in networks.
While its original purpose lies in biological networks,
its usage is not limited to that scope.
In general terms, `diffuStats` builds several propagation algorithms
on the classes and methods of the `igraph` package (Csardi and Nepusz 2006).
A more detailed analysis and documentation of the implemented
methods can be found in the protein function prediction vignette.

To get started, we will load a toy graph included in the package.

```
library(diffuStats)
data("graph_toy")
```

Let's take a look at the graph:

`graph_toy`

```
## IGRAPH 9a7b9df UN-- 48 82 -- Lattice graph
## + attr: name (g/c), dimvector (g/n), nei (g/n), mutual (g/l), circular
## | (g/l), layout (g/n), asp (g/n), input_vec (g/n), input_mat (g/n),
## | output_vec (g/n), output_mat (g/n), input_list (g/x), name (v/c),
## | class (v/c), color (v/c), shape (v/c), frame.color (v/c), label.color
## | (v/c), size (v/n)
## + edges from 9a7b9df (vertex names):
## [1] A1 --A2 A1 --A9 A2 --A3 A2 --A10 A3 --A4 A3 --A11 A4 --A5 A4 --A12
## [9] A5 --A6 A5 --A13 A6 --A7 A6 --A14 A7 --A8 A7 --A15 A8 --A16 A9 --A10
## [17] A9 --A17 A10--A11 A10--A18 A11--A12 A11--A19 A12--A13 A12--A20 A13--A14
## [25] A13--A21 A14--A15 A14--A22 A15--A16 A15--A23 A16--A24 A17--A18 A17--A25
## + ... omitted several edges
```

`plot(graph_toy)`

In the next section, we will be running diffusion algorithms on this tiny lattice graph.

The package `diffuStats` is flexible and allows
several inputs at once for a given network.
The input format is, in its most general form,
a list of matrices, where each matrix contains
measured nodes in rows and specific scores in columns.
**Different sets of scores may have different backgrounds**,
meaning that we can specifically tag sets of nodes as **unlabelled**.
If we have a single list of nodes for label propagation,
we should provide a list with a single column vector
that contains `1`s for the nodes in the list and `0`s otherwise.
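As an illustration of the input format described above, the following base R sketch builds a binary label vector from a list of nodes and then wraps scores into a list of matrices with different backgrounds. The node names, the `positives` vector and the list element names (`full_bkgd`, `small_bkgd`) are hypothetical and chosen only to mimic the toy lattice; they are not objects shipped with the package.

```r
# Hypothetical node names mimicking the toy lattice (A1 ... A48)
node_names <- paste0("A", 1:48)

# Hypothetical list of nodes of interest (the positives)
positives <- c("A13", "A20", "A27")

# Binary labels: 1 for listed nodes, 0 otherwise, named by node
labels <- setNames(as.numeric(node_names %in% positives), node_names)

# General form: a list of matrices, nodes in rows and score sets in columns.
# Each matrix may cover a different background (set of measured nodes);
# nodes absent from a matrix are treated as unlabelled for that background.
input_list <- list(
    full_bkgd = cbind(set1 = labels),
    small_bkgd = cbind(
        set1 = labels[1:24],
        set2 = as.numeric(node_names[1:24] %in% c("A2", "A3"))
    )
)

# Inspect the dimensions of each background
lapply(input_list, dim)
```

In a real analysis the node names would be taken from the graph itself, for instance with `igraph::V(graph_toy)$name`.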

In this example data, the graph contains one input already.

```
input_vec <- graph_toy$input_vec
head(input_vec, 15)
```

```
## A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15
## 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
```

Let's check how many nodes have values:

`length(input_vec)`

`## [1] 48`

We see that all 48 nodes of the graph have a value in this score set. In practice, score sets could be disease genes, pathways, et cetera.

Each column of scores in the input can be *smoothed* using the network,
and a new value will be derived for every node; unlabelled nodes are also scored.
This is the main purpose of diffusion: to derive new scores that
intend to keep the same trends as the scores in the input,
but taking into account the network structure.
Equivalently, this can be regarded as a label propagation where
positive and negative examples propagate their labels to their
neighbouring nodes.
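To make the idea of smoothing concrete, here is a minimal base R sketch on a five-node path graph. It uses a regularised Laplacian kernel, `solve(I + L)`, as one common choice of smoother; this is an assumption made for illustration, and this document does not state which kernel or parameters `diffuStats` uses internally. All object names are hypothetical.

```r
# Hypothetical five-node path graph: N1 - N2 - N3 - N4 - N5
nodes <- paste0("N", 1:5)
n <- length(nodes)
adj <- matrix(0, n, n, dimnames = list(nodes, nodes))
for (i in seq_len(n - 1)) {
    adj[i, i + 1] <- 1
    adj[i + 1, i] <- 1
}

# Graph Laplacian: degree matrix minus adjacency matrix
lap <- diag(rowSums(adj)) - adj

# Regularised Laplacian kernel (assumed here for illustration only)
kern <- solve(diag(n) + lap)

# Input: a single positive label on the first node
seed <- setNames(c(1, 0, 0, 0, 0), nodes)

# Smoothed scores: the label spreads to neighbours and decays with distance
smoothed <- drop(kern %*% seed)
round(smoothed, 3)
```

Every node receives a score, including those that started at `0`, and the scores decrease as we move away from the labelled node.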

Let's start with the simplest case of diffusion:
only a vector of values is to be smoothed.
Note that these
**values must be named and must be a subset or all of the graph nodes**.

```
output_vec <- diffuStats::diffuse(
    graph = graph_toy,
    method = "raw",
    scores = input_vec)
head(output_vec, 15)
```

```
## A1 A2 A3 A4 A5 A6 A7
## 0.03718927 0.04628679 0.04718643 0.06099494 0.09567369 0.04866964 0.02124098
## A8 A9 A10 A11 A12 A13 A14
## 0.01081382 0.06528103 0.10077145 0.08146401 0.10111963 0.27303017 0.07776389
## A15
## 0.02548044
```

The best way to visualise the scores is to overlay
them on the original lattice.
`diffuStats` also comes with basic mapping functions
for graphical purposes.
Let's see an example:

```
igraph::plot.igraph(
    graph_toy,
    vertex.color = diffuStats::scores2colours(output_vec),
    vertex.shape = diffuStats::scores2shapes(input_vec),
    main = "Diffusion scores in our lattice"
)
```