### HMM method: `hmm`

The first step in HMM method is quantifying the evidence for differential
expression at the base-resolution level. To do this, `srnadiff`

use
the common approach in comparative analysis of transcriptomics data: test
the null hypothesis that the logarithmic fold change between condition groups
for a nucleotide expression is exactly zero.

The next step in the HMM approach enforces a smoothness assumption over the
state of nucleotides: differential expression does not randomly switch along
the chromosome, rather, continuous regions of RNA are either “differentially
expressed” or “not”. This is captured with a hidden Markov model (HMM) with
binary latent state corresponding to the true state of each nucleotide:
differentially expressed or not differentially expressed.

The observations of the HMM are then the empirical p-values arising from the
differential expression analysis corresponding to each nucleotide position.
Modelling p-values directly enabled us to define the emission of each state
as follows: the differentially expressed state emits a p-value \(< t\) with
probability \(p\), and the not differentially expressed state emits a p-value
\(\geqslant t\) with probability \(1-p\), where \(t\) is a real number between 0
and 1.

The HMM approach normally needs emission, transition, and starting
probabilities values. They can be tuned by the user according to the
overall p-values from differential analysis. We then run the Viterbi
algorithm [ref] in order to finding the most likely sequence of states
from the HMM. This essentially segments the genome into regions, where
a region is defined as a set of consecutive bases showing a common expression
signature. A region of bases with differentially expressed state is referred
as an expressed region and is given as output of the method.

To run the HMM approach, `srnadiff`

first form a large matrix, with
rows corresponding to bases, columns corresponding to samples and entries
are the coverage from a nucleotide of a particular sample. This count matrix
is then analyzed as into feature-level counts using the feature-level RNA-seq
differential expression analysis from *DESeq2*. In practice, the
p-value is not computed for every nucleotide. Nucleotides for which the sum
of the coverage across all samples is less than a threshold are given a
p-value of 1, because these poorly expressed bases are unlikely to provide a
differentially expressed sRNA.

The parameters for the HMM method are:

`noDiffToDiff`

: Initial transition probability from
“no differentially expressed” state to “differentially expressed”
state.

`diffToNoDiff`

: Initial transition probability from
“differentially expressed” state to no “differentially expressed”
state.

`emission`

: Is the probability to emit a p-value \(<t\) in
the “differentially expressed” state, and a p-value \(\geq t\) in the
“not differentially expressed” state.

`emissionThreshold`

: Is the threshold \(t\) that limits each
state.

This parameters can be changed using using the assignment function
`parameters<-`

`parameters(srnaExp) <- list(noDiffToDiff=0.01, emissionThreshold=0.2)`

### IR method: `IR`

In this approach, for each base, the average from the normalized coverage is
calculated across all samples into each condition. This generates a vector of
(normalized) mean coverage expression per condition. These two vectors are
then used to compute per-nucleotide log-ratios (in absolute value) across the
genome. For the computed log-ratio expression, the method uses a sliding
threshold *h* that run across the log-ratio levels identifying bases with
log-ratio value above of *h*.
Regions of contiguous bases passing this threshold are then analyzed using an
adaptation of Aumann and Lindell algorithm for irreducibility property
(Aumann and Lindell 2003).

The minimun sliding threshold, `minLogFC`

, used in the IR method can
be changed using the assignment function `parameters<-`

`parameters(srnaExp) <- list(minLogFC=1)`

### Naive method: `naive`

This method is the simplest, gived a fixed threshold *h*, contiguous
bases with log-ratio expression (in absolute value) passing this threshold
are then considered as candidate differentially expressed regions.

The fixed threshold, `cutoff`

, used in this method can be changed using
the assignment function `parameters<-`

`parameters(srnaExp) <- list(cutoff=1.5)`