1 Introduction

As a SingleCellExperiment-derived package, MAST can easily be inserted into workflows with packages such as scran, scater, zinbwave, SCnorm and others. The main gotcha is packages that assume integer counts vs log-transformed, or log-transformed, approximately scale-normalized data. We find that MAST performs best with log-transformed, scale-normalized data that has been thresholded, such as \(\log_2(\text{transcripts per million} + 1)\).

We address this by:

  • testing for log-like data for objects constructed in MAST
  • explicitly naming the slot of the assay containing such putatively log-like data
  • by default operating on the slot with such log-like data

In objects that were constructed in other packages, we …

In what follows, we show an example of using scater to plot some QC metrics, SCnorm to normalize data, and, and conversion to a Seurat object.

2 From MAST to Scater

Scater (citation) is a package that …


knitr::opts_chunk$set(message = FALSE,error = FALSE,warning = FALSE)
data(maits, package='MAST')
unlog <- function(x) ceiling(2^x - 1)
sca_raw =  FromMatrix(t(maits$expressionmat), maits$cdat, maits$fdat)
## Assuming data assay in position 1, with name et is log-transformed.
assays(sca_raw)$counts = unlog(assay(sca_raw))

Here we make an object with assays counts and et. By default, MAST will operate on the et assay, but scran wants count-like data for some of its QC. The et data are log2 + 1 transcripts per million (TPM), as output by RSEM.

We could specify the assay name at creation with sca_raw = FromMatrix(list(logTPM = t(maits$expressionmat)), maits$cdat, maits$fdat) or rename an object that contains appropriately transformed data with assayNames(sca_raw) = c('logTPM', 'counts').

Before calling scater functionality, you might pause to consider if some features should belong in special control sets, such as mitochrondial genes, or spike-ins.

sca_raw = calculateQCMetrics(sca_raw)
plotRowData(sca_raw, x = 'log10_mean_counts', 'pct_dropout_by_counts')