Changes in version 1.28
o Added a restricted= option to quickSubCluster() to enable
subclustering on specific clusters.
Changes in version 1.20.0
o All deprecated functions from the previous release are now
defunct.
o Added a simplify= option to quickSubCluster() to get the
cluster assignments directly.
o Deprecated combinePValues() as this is replaced by
metapod::combineParallelPValues().
o getClusteredPCs() now uses bluster::clusterRows() by default.
o decideTestsPerLabel() now automatically detects pval.field= if
not supplied.
o Added the clusterCells() wrapper around bluster functionality.
o Removed the option to pass a matrix in design= from
pseudoBulkDGE().
o Migrated all normalization-related functions
(computeSumFactors(), calculateSumFactors(), cleanSizeFactors()
and computeSpikeFactors()) to a better home in scuttle.
Soft-deprecated existing functions.
o Modified getTopHVGs() to accept a SingleCellExperiment and
compute the DataFrame with modelGeneVar().
o Added fixedPCA() to compute a PCA with a fixed number of
components, a la scater::runPCA() (but without requiring
scater).
o Modified denoisePCA() so that it now complains if subset.row=
is not provided.
o Modified all pairwise* functions so that the p-value from
direction="any" is derived from the two p-values from the
one-sided tests. This is necessary for correctness with all
choices of lfc= and block=, at the cost of conservativeness
when block=NULL and lfc= is large.
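As a minimal sketch of the direction="any" change described above, the
two-sided p-value can be obtained by doubling the smaller of the two
one-sided p-values and capping the result at 1. This is a base R
illustration only, not scran's internal code; the helper name
twoSidedFromOneSided() and the example p-values are hypothetical.

```r
# Hypothetical helper (not part of scran): combine the p-values of the two
# one-sided tests into a single two-sided p-value.
twoSidedFromOneSided <- function(p.up, p.down) {
    # Doubling the smaller p-value acts as a Bonferroni correction over the
    # two one-sided tests, so it stays valid even when the two null regions
    # are not symmetric (e.g., with a non-zero lfc=).
    pmin(1, 2 * pmin(p.up, p.down))
}

# Made-up one-sided p-values for three genes.
p.up   <- c(0.001, 0.40, 0.70)
p.down <- c(0.999, 0.60, 0.30)
p.any  <- twoSidedFromOneSided(p.up, p.down)
```

When the threshold is zero and the test statistic is symmetric, the two
one-sided p-values sum to 1 and this matches the usual two-sided
p-value. Otherwise the result can be conservative, which is the
trade-off noted in the entry above.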
Changes in version 1.18.0
o Deprecated coassignProb() as this is replaced by
bluster::pairwiseRand().
o Deprecated bootstrapCluster() as this is replaced by
bluster::bootstrapStability().
o Deprecated gene.names= in the various pairwise* functions as
being out of scope.
o Added the testLinearModel() function to obtain inferences from
a linear model.
o Modified pseudoBulkDGE() to use formulas/functions in the
design= argument. Allowed contrast= to be a character vector to
be run through makeContrasts().
o Added the pseudoBulkSpecific() function to test for
semi-label-specific DEGs in pseudo-bulk analyses.
o Added the summaryMarkerStats() function to compute some basic
summary statistics for marker filtering.
o Modified row.data= in findMarkers() to support list inputs.
Added an add.summary= option to easily include summary
information.
o Modified combineVar() and combineCV2() to support list inputs.
o Deprecated doubletCells() as this is replaced by
scDblFinder::computeDoubletDensity().
o Deprecated doubletCluster() as this is replaced by
scDblFinder::findDoubletClusters().
o Deprecated doubletRecovery() as this is replaced by
scDblFinder::recoverDoublets().
o Added sparse-optimized variance calculations to modelGeneVar(),
modelGeneCV2() and related functions, which may result in
slight changes to the results due to numeric precision.
o Exported combineBlocks() to assist combining of block-wise
statistics in other packages.
o Added lowess= and density.weights= options to fitTrendVar() to
rescue overfitted curves.
o Raised an error in denoisePCA() upon mismatches in the matrix
and technical statistics.
Changes in version 1.16.0
o Added the quickSubCluster() function for convenient
subclustering.
o Added the bootstrapCluster() function for convenient
bootstrapping of cluster stability.
o Added the coassignProb() function to compute coassignment
probabilities of alternative groupings.
o combineMarkers() and findMarkers() report a summary effect size
for each cluster.
o Added the multiMarkerStats() function to combine statistics
from multiple findMarkers() calls.
o Added the clusterPurity() function to evaluate cluster purity
as a quality measure.
o Added the pseudoBulkDGE() function to easily and safely perform
pseudo-bulk DE analyses. Also added the decideTestsPerLabel()
and summarizeTestsPerLabel() utilities.
o Added the clusterSNNGraph() and clusterKNNGraph() wrapper
functions for easier graph-based clustering. Provided a k-means
pre-clustering option to handle large datasets.
Changes in version 1.14.0
o Removed deprecated approximate= and pc.approx= arguments.
o Removed deprecated batch correction functions.
o Added option to pairwiseTTests() for standardization of
log-fold changes.
o Changed default BSPARAM= to bsparam() in quickCluster(),
denoisePCA(), doubletCells() and build*NNGraph().
o Added the pairwiseBinom() function for pairwise binomial tests
of gene expression.
o Renamed output fields of pairwiseWilcox() to use AUC for less
confusion. Added the lfc= argument to test against a log-fold
change.
o Added the fitTrendVar(), fitTrendCV2(), modelGeneVar(),
modelGeneVarWithSpikes(), modelGeneCV2(), modelCV2WithSpikes(),
fitTrendPoisson() and modelGeneVarByPoisson() functions to
model variability.
o Deprecated the trendVar(), technicalCV2(), improvedCV2(),
decomposeVar(), testVar(), makeTechTrend(), multiBlockVar() and
multiBlockNorm() functions.
o Modified combineVar() to not weight by residual d.f. unless
specifically instructed.
o Added the combineCV2() function to combine separate CV2
modelling results.
o Added the test.type= argument in findMarkers() to switch
between pairwise DE tests. Added the row.data= argument to
easily include row metadata in reordered tables. Deprecated
overlapExprs(), which is replaced by test.type="wilcox" in
findMarkers().
o Added the getTopMarkers() function to easily retrieve marker
lists from pairwise DE results.
o Added the getTopHVGs() function to easily retrieve HVG sets
from variance modelling results.
o In all functions that accept a block= argument, any level of
the blocking factor that cannot yield a result (e.g., due to
insufficient degrees of freedom) will now be completely ignored
and not contribute to any statistic.
o Added the getDenoisedPCs() function for general-purpose
PCA-based denoising on non-SingleCellExperiment inputs.
Converted denoisePCA() to a normal function, removed the method
for ANY matrix. Dropped max.rank= default to 50 for greater
speed in most cases.
o Added the calculateSumFactors() function for general-purpose
calculation of deconvolution factors on
non-SingleCellExperiment inputs. Converted computeSumFactors()
to a normal function, removed the method for ANY input.
Added auto-guessing of min.mean= based on the average library size.
o Deprecated all special handling of spike-in rows, which are no
longer necessary when spike-ins are stored as alternative
experiments.
o Deprecated general.use= in computeSpikeFactors(), which is no
longer necessary when spike-ins are stored as alternative
experiments.
o Deprecated parallelPCA(), which has been moved to the PCAtools
package.
o Modified clusterModularity() to return upper-triangular
matrices, fixing a bug where the off-diagonal weights were
split into two entries across the diagonal. Added the as.ratio=
argument to return a matrix of log-ratios. Renamed the
get.values= argument to get.weights=.
o Simplified density calculation in doubletCells() for greater
robustness.
o Added a method="holm-middle" option to combinePValues(), to
test if most individual nulls are true. Added a min.prop=
option to control the definition of "most".
o Added a pval.type="some" option to combineMarkers(), as a
compromise between the two other modes. Added a min.prop=
option to tune stringency for pval.type="some" and "any".
o Added the getClusteredPCs() function to provide a cluster-based
heuristic for choosing the number of PCs.
o Added the neighborsTo*NNGraph() functions to generate (shared)
nearest neighbor graphs from pre-computed NN results.
o Switched to using only the top 10% of HVGs for the internal PCA
in quickCluster().
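The method="holm-middle" entry above can be illustrated with a short
base R sketch. It assumes that the combined p-value is the
Holm-adjusted p-value at the rank implied by min.prop=, so that a small
value requires at least that proportion of individual nulls to be
rejected. This is one plausible reading of the entry, not a copy of the
combinePValues() code; the helper name holmMiddle() and the input
p-values are hypothetical.

```r
# Hypothetical helper (not scran's implementation): take the Holm-adjusted
# p-value at the rank given by min.prop.
holmMiddle <- function(p, min.prop = 0.5) {
    # Holm-adjusted p-values, sorted from smallest to largest.
    adj <- sort(p.adjust(p, method = "holm"))
    # Number of individual nulls that must be rejected (at least one).
    k <- max(1, ceiling(length(p) * min.prop))
    adj[k]
}

p <- c(0.01, 0.02, 0.5, 0.9)   # made-up p-values for one gene
combined <- holmMiddle(p)      # requires rejection in at least 2 of 4 tests
```

Raising min.prop makes the combined test more stringent, since more of
the individual tests must be significant after the Holm correction.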
Changes in version 1.12.0
o Added option in quickCluster() to cluster on log-expression
profiles. Modified defaults to use graph-based clustering on
log-expression-derived PCs.
o Modified default choice of ref.clust= in computeSumFactors().
Now degrades quietly to library size factors when a cluster is too
small for all pool sizes.
o Minor change to cyclone() random number generation for
consistency upon parallelization.
o Added BPPARAM= to correlateNull() for parallelization. Minor
change in random number generation for consistency upon
parallelization.
o Minor change to parallelPCA() random number generation for
consistency upon parallelization.
o Created correlateGenes() function to compute per-gene
correlation statistics.
o Modified correlatePairs() to compute expected rho after all
possible tie-breaking permutations. Deprecated cache.size= as
all ranks are now returned as in-memory representations.
Deprecated per.gene= in favour of an external call to
correlateGenes(). Deprecated tol= as ties are now directly
determined by rowRanks().
o Switched to BiocSingular for PCA calculations across all
functions. Deprecated approximate= and pc.approx= arguments in
favour of BSPARAM=.
o Deprecated all batch correction functions to prepare for the
migration to batchelor.
Changes in version 1.10.0
o Removed selectorPlot(), exploreData() functions in favour of
iSEE.
o Fixed underflow problem in mnnCorrect() when dealing with the
Gaussian kernel. Dropped the default sigma= in mnnCorrect() for
better default performance.
o Supported parallelized block-wise processing in quickCluster().
Deprecated max.size= in favour of max.cluster.size= in
computeSumFactors(). Deprecated get.ranks= in favour of
scaledColRanks().
o Added max.cluster.size= argument to computeSumFactors().
Supported parallelized cluster-wise processing. Increased all
pool sizes to avoid rare failures if number of cells is a
multiple of 5. Minor improvement to how mean filtering is done
for rescaling across clusters in computeSumFactors(). Now throws
an error upon min.mean=NULL, which used to be valid. Switched
positive=TRUE behaviour to use cleanSizeFactors().
o Added simpleSumFactors() as a simplified alternative to
quickCluster() and computeSumFactors().
o Added the scaledColRanks() function for computing scaled and
centred column ranks.
o Supported parallelized gene-wise processing in trendVar() and
decomposeVar(). Supported direct use of a factor in design= for
efficiency.
o Added doubletCluster() to detect clusters that consist of
doublets of other clusters.
o Added doubletCells() to detect cells that are doublets of other
cells via simulations.
o Deprecated rand.seed= in buildSNNGraph() in favour of explicit
set.seed() call. Added type= argument for weighting edges based
on the number of shared neighbors.
o Deprecated rand.seed= in buildKNNGraph().
o Added multiBlockNorm() function for spike-abundance-preserving
normalization prior to multi-block variance modelling.
o Added multiBatchNorm() function for consistent downscaling
across batches prior to batch correction.
o Added cleanSizeFactors() to coerce non-positive size factors to
positive values based on number of detected genes.
o Added the fastMNN() function to provide a faster, more stable
alternative for MNN correction.
o Added BPPARAM= option for parallelized execution in
makeTechTrend(). Added approx.npts= option for
interpolation-based approximation for many cells.
o Added pairwiseTTests() for direct calculation of pairwise
t-statistics between groups.
o Added pairwiseWilcox() for direct calculation of pairwise
Wilcoxon rank sum tests between groups.
o Added combineMarkers() to consolidate arbitrary pairwise
comparisons into a marker list.
o Bugfixes to uses of block=, lfc= and design= arguments in
findMarkers(). Refactored to use pairwiseTTests() and
combineMarkers() internally. Added BPPARAM= option for
parallelized execution.
o Refactored overlapExprs() to sort by p-value based on
pairwiseWilcox() and combineMarkers(). Removed design= argument
as it is not compatible with p-value calculations.
o Bugfixes to the use of Stouffer's Z method in combineVar().
o Added combinePValues() as a centralized internal function to
combine p-values.
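The scaledColRanks() entry above refers to scaled and centred column
ranks. One reason such ranks are convenient is that, once each column
is centred and scaled to unit length, ordinary cross-products reproduce
Spearman's rank correlation between cells. The base R sketch below
demonstrates this property on simulated data. The unit-length scaling
is an assumption chosen for illustration, and the constant used by
scaledColRanks() itself may differ.

```r
set.seed(1)
x <- matrix(rnorm(50), nrow = 10)   # simulated data: 10 genes by 5 cells

# Rank genes within each cell (column), then centre each column.
r <- apply(x, 2, rank)
r <- sweep(r, 2, colMeans(r))

# Scale each column to unit Euclidean length (assumed scaling).
r <- sweep(r, 2, sqrt(colSums(r^2)), "/")

# Cross-products of the scaled ranks equal Spearman's rho between cells.
rho <- crossprod(r)
```

With this scaling, the squared Euclidean distance between two columns
equals 2 * (1 - rho), so standard distance-based clustering operates on
a rank-correlation measure.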
Changes in version 1.8.0
o Modified decomposeVar() to return statistics (but not p-values)
for spike-ins when get.spikes=NA. Added block= argument for
mean/variance calculations within each level of a blocking
factor, followed by reporting of weighted averages (using
Fisher's method for p-values). Automatically record global
statistics in the metadata of the output for use in
combineVar(). Switched output to a DataFrame object for
consistency with other functions.
o Fixed testVar() to report a p-value of 1 when both the observed
and null variances are zero.
o Allowed passing of arguments to irlba() in denoisePCA() to
assist convergence. Reported low-rank approximations for all
genes, regardless of whether they were used in the SVD.
Deprecated design= argument in favour of manual external
correction of confounding effects. Supported use of a vector or
DataFrame in technical= instead of a function.
o Allowed passing of arguments to prcomp_irlba() in
buildSNNGraph() to assist convergence. Allowed passing of
arguments to get.knn(), switched default algorithm back to a
kd-tree.
o Added the buildKNNGraph() function to construct a simple
k-nearest-neighbours graph.
o Fixed a number of bugs in mnnCorrect(), migrated code to C++
and parallelized functions. Added variance shift adjustment,
calculation of angles with the biological subspace.
o Modified trend specification arguments in trendVar() for
greater flexibility. Switched from ns() to robustSmoothSpline()
to avoid bugs with unloaded predict.ns(). Added block= argument
for mean/variance calculations within each level of a blocking
factor.
o Added option to avoid normalization in the SingleCellExperiment
method for improvedCV2(). Switched from ns() to smooth.spline()
or robustSmoothSpline() to avoid bugs.
o Replaced zoo functions with runmed() for calculating the median
trend in DM().
o Added block= argument to correlatePairs() to calculate
correlations within each level of a blocking factor. Deprecated
the use of residuals=FALSE for one-way layouts in design=.
Preserved input order of paired genes in the gene1/gene2 output
when pairings!=NULL.
o Added block= argument to overlapExprs() to calculate overlaps
within each level of a blocking factor. Deprecated the use of
residuals=FALSE for one-way layouts in design=. Switched to
automatic ranking of genes based on ability to discriminate
between groups. Added rank.type= and direction= arguments to
control ranking of genes.
o Modified combineVar() so that it is aware of the global stats
recorded in decomposeVar(). Absence of global statistics in the
input DataFrames now results in an error. Added option to
method= to use Stouffer's method with residual d.f.-weighted
Z-scores. Added weighted= argument to allow weighting to be
turned off for equal batch representation.
o Modified the behaviour of min.mean= in computeSumFactors() when
clusters!=NULL. Abundance filtering is now performed within
each cluster and for pairs of clusters, rather than globally.
o Switched to pairwise t-tests in findMarkers(), rather than
fitting a global linear model. Added block= argument for
within-block t-tests, the results of which are combined across
blocks via Stouffer's method. Added lfc= argument for testing
against a log-fold change threshold. Added log.p= argument to
return log-transformed p-values/FDRs. Removed empirical Bayes
shrinkage as well as the min.mean= argument.
o Added the makeTechTrend() function for generating a
mean-variance trend under Poisson technical noise.
o Added the multiBlockVar() function for convenient fitting of
multiple mean-variance trends per level of a blocking factor.
o Added the clusterModularity() function for assessing the
cluster-wise modularity after graph-based clustering.
o Added the parallelPCA() function for performing parallel
analysis to choose the number of PCs.
o Modified convertTo() to return raw counts and size factors for
CellDataSet output.
o Deprecated exploreData(), selectorPlot() in favour of iSEE().
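Several entries above combine per-block p-values with Stouffer's method
(findMarkers(), combineVar()) or Fisher's method (decomposeVar()). The
following base R sketch shows the textbook form of both methods for
independent one-sided p-values. The helper names, the example p-values
and the choice of block sizes as weights are assumptions made for
illustration and are not taken from scran's code.

```r
# Hypothetical helpers (not scran's internal code).

# Stouffer's method: convert one-sided p-values to Z-scores, take a
# weighted sum, and convert the result back to a p-value.
stouffer <- function(p, weights = rep(1, length(p))) {
    z <- qnorm(p, lower.tail = FALSE)
    pnorm(sum(weights * z) / sqrt(sum(weights^2)), lower.tail = FALSE)
}

# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution
# with 2 * length(p) degrees of freedom under the joint null.
fisher <- function(p) {
    pchisq(-2 * sum(log(p)), df = 2 * length(p), lower.tail = FALSE)
}

p.block <- c(0.04, 0.20, 0.01)   # made-up per-block p-values for one gene
n.block <- c(50, 100, 200)       # made-up block sizes used as weights
p.stouffer <- stouffer(p.block, weights = n.block)
p.fisher   <- fisher(p.block)
```

Stouffer's method allows larger blocks to contribute more through the
weights, whereas Fisher's method treats all blocks equally and is more
sensitive to a single very small p-value.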
Changes in version 1.6.0
o Supported parallelization in buildSNNGraph(), overlapExprs()
with BPPARAM options.
o Forced zero-derived residuals to a constant value in
correlatePairs(), overlapExprs().
o Allowed findMarkers() to return IUT p-values, to identify
uniquely expressed genes in each cluster. Added options to
specify the direction of the log-fold changes, to focus on
upregulated genes in each cluster.
o Fixed bug in correlatePairs() when per.gene=TRUE and no
spike-ins are available. Added block.size argument to control
caching.
o Switched all C++ code to use the beachmat API. Modified several
functions to accept ANY matrix-like object, rather than only
base matrix objects.
o quickCluster() with method="igraph" will now merge based on
modularity to satisfy min.size requirements. Added max.size
option to restrict the size of the output clusters.
o Updated the trendVar() interface with parametric, method
arguments. Deprecated the trend="semiloess" option in favour of
parametric=TRUE and method="loess". Modified the NLS equation
to guarantee non-negative coefficients of the parametric trend.
Slightly modified the estimation of NLS starting parameters.
Second d.f. of the fitted F-distribution is now reported as df2
in the output.
o Modified decomposeVar() to automatically use the second d.f.
when test="f".
o Added option in denoisePCA() to return the number of components
or the low-rank approximation. The proportion of variance
explained is also stored as an attribute in all return values.
o Fixed a variety of bugs in mnnCorrect().
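The IUT p-values mentioned in the findMarkers() entry above follow the
intersection-union test principle: a gene is only significant overall if
it is significant in every pairwise comparison, so the combined p-value
is the largest of the pairwise p-values. The base R sketch below uses a
made-up matrix of p-values and is an illustration of the principle, not
the findMarkers() implementation.

```r
# Hypothetical pairwise p-values: one row per gene, one column per
# comparison of the cluster of interest against another cluster.
p <- rbind(geneA = c(0.001, 0.002, 0.004),
           geneB = c(0.001, 0.900, 0.002))

# Intersection-union test: take the maximum p-value across comparisons.
iut <- apply(p, 1, max)
```

Here geneA is differentially expressed against all other clusters and
keeps a small combined p-value, while geneB fails in one comparison and
is therefore not reported as uniquely expressed.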
Changes in version 1.4.0
o Switched default BPPARAM to SerialParam() in all functions.
o Added run argument to selectorPlot(). Bug fix to avoid adding
an empty list.
o Added exploreData() function for visualization of scRNA-seq
data.
o Minor bug fix to DM() when extrapolation is required.
o Added check for centred size factors in trendVar(),
decomposeVar() methods. Refactored trendVar() to include
automatic start point estimation, location rescaling and df2
estimation.
o Moved spike-in specification to the scater package.
o Deprecated isSpike<- to avoid confusion over input/output
types.
o Generalized sandbag(), cyclone() to work for other
classification problems.
o Added test="f" option in testVar() to account for additional
scatter.
o Added per.gene=FALSE option in correlatePairs(), expanded
accepted value types for subset.row. Fixed an integer overflow
in correlatePairs(). Also added information on whether the
permutation p-value reaches its lower bound.
o Added the combineVar() function to combine results from
separate decomposeVar() calls.
o Added protection against all-zero rows in technicalCV2().
o Added the improvedCV2() function as a more stable alternative
to technicalCV2().
o Added the denoisePCA() function to remove technical noise via
selection of early principal components.
o Removed warning requiring at least twice the max size in
computeSumFactors(). Elaborated on the circumstances
surrounding negative size factors. Increased the default number
of window sizes to be examined. Refactored C++ code for
increased speed.
o Allowed quickCluster() to return a matrix of ranks for use in
other clustering methods. Added method="igraph" option to
perform graph-based clustering for large numbers of cells.
o Added the findMarkers() function to automatically identify
potential markers for cell clusters.
o Added the overlapExprs() function to compute the overlap in
expression distributions between groups.
o Added the buildSNNGraph() function to build a SNN graph for
cells from their expression profiles.
o Added the mnnCorrect() function to perform batch correction
based on mutual nearest neighbors.
o Streamlined examples when mocking up data sets.
Changes in version 1.2.0
o Transformed correlations to a metric distance in
quickCluster().
o Removed normalize() in favour of scater's normalize().
o Switched isSpike()<- to accept a character vector rather than a
logical vector, to enforce naming of spike-in sets. Also added
warning code when the specified spike-in sets overlap.
o Allowed compute*Factors() functions to directly return the size
factors.
o Added selectorPlot() function for interactive plotting.
o Switched to a group-based weighted correlation for one-way
layouts in correlatePairs() and correlateNull(), and to a
correlation of residuals for more complex design matrices.
o Added phase assignments to the cyclone() output.
o Implemented Brennecke et al.'s method in the technicalCV2()
function.
o Updated convertTo() to store spike-in-specific size factors as
offsets.
o Moved code and subsetting into C++ to improve memory
efficiency.
o Switched to loess-based trend fitting as the default in
trendVar(), replaced polynomial with semi-loess fitting.
o Added significance statistics to output of decomposeVar(), with
only the p-values replaced by NAs for spike-ins.
o Updated documentation and tests.
Changes in version 1.0.0
o New package scran, for low-level analyses of single-cell RNA
sequencing data.