Changes in version 4.3.5 (2024-07-14)
o
The new (edgeR v4) QL pipeline produces gene-specific residual
degrees of freedom (df.residual), which may be greater or less
than the nominal residual degrees of freedom, which are equal
to the number of observations minus the number of estimated
coefficients, i.e., `nrow(design) - ncol(design)`. glmQLFit()
sets very small values of df.residual to zero for the empirical
Bayes dispersion moderation. Previously any df.residual < 0.99
was floored in this way. Now the flooring cutoff is df.residual
< 0.5 if the nominal df.residual is 1 or df.residual < 0.99 if
the nominal df.residual is 2 or more.
o
New function catchRSEM() analogous to catchKallisto() and
catchSalmon().
o
Add checks for negative or NA counts to cpm(), cpmByGroup() and
normLibSizes().
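The df.residual flooring rule described in the first item of version
4.3.5 can be sketched in base R as follows. This is only an
illustration of the stated cutoffs, not the code used inside
glmQLFit(); the helper name floor_df_residual() is hypothetical, and
the treatment of a nominal df.residual of zero is an assumption
because the item above does not cover that case.

```r
# Hypothetical helper (not part of edgeR): apply the flooring cutoffs
# described above to a vector of gene-specific residual df.
floor_df_residual <- function(df.residual, nominal.df) {
  # Cutoff is 0.5 when the nominal residual df is 1, otherwise 0.99.
  # (Assumption: a nominal df of 0 is treated like 2 or more.)
  cutoff <- if (nominal.df == 1) 0.5 else 0.99
  df.residual[df.residual < cutoff] <- 0
  df.residual
}

# Nominal residual df: number of observations minus number of coefficients.
group <- factor(c("A", "A", "B"))
design <- model.matrix(~group)
nominal.df <- nrow(design) - ncol(design)

# With nominal df of 1, only values below 0.5 are set to zero.
floored <- floor_df_residual(c(0.3, 0.7, 1.2), nominal.df)
```

With a nominal df.residual of 2 or more, the same input would also
have 0.7 set to zero, because the cutoff rises to 0.99.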
Changes in version 4.2.0 (2024-04-28)
o
The new QL pipeline becomes the default for glmQLFit() by
setting `legacy=FALSE`.
o
Add cameraPR method for DGELRT objects.
o
New arguments `prior.n` and `adaptive.span` for voomLmFit().
o
New argument `robust` for diffSpliceDGE().
o
The NEWS.Rd file has been revised to include the date of each
version release and to include earlier versions of edgeR.
o
catchSalmon() now detects whether resamples are Gibbs or
bootstrap.
o
The catchSalmon help page now explains the columns of the
`annotation` output data.frame.
o
decideTestsDGE() deprecated in favor of decideTests().
Changes in version 4.0.0 (2023-10-25)
o
New statistical methods implemented in glmQLFit() to ensure
accurate estimation of the quasi-dispersion for data with small
counts. The new method computes adjusted residual deviances
with adjusted degrees of freedom to improve the chisquare
approximation to the residual deviance. The new methodology
includes the new argument 'top.proportion' for glmQLFit() to
specify the proportion of highly expressed genes used to
estimate the common NB dispersion used in the new method. The
output DGEGLM object contains new components `leverage`,
`unit.deviance.adj`, `unit.df.adj`, `deviance.adj`,
`df.residual.adj` and `working.dispersion`. The new method can
be turned on by setting `legacy=FALSE`. By default, glmQLFit()
will give the same results as in previous releases of edgeR.
o
New argument 'covariate.trend' for glmQLFit() to allow a
user-specified covariate for the trended prior used to estimate
the quasi-dispersions.
o
The gene set testing functions roast(), mroast(), fry(),
camera() and romer() now have S3 methods for DGEGLM objects.
o
The edgeR Introductory vignette is converted from Sweave and
pdf to Rmd and html.
o
Revised help pages for filterByExpr() and catchSalmon().
Changes in version 3.42.0 (2023-04-25)
o
New function Seurat2PB() for creating a pseudo-bulk DGEList
object from a Seurat object. New case study in User's Guide
illustrating its use.
o
New function normLibSizes() is now a synonym for
calcNormFactors().
o
Rename effectiveLibSizes() to getNormLibSizes().
o
DGEList() is now an S3 generic function with a method for
data.frames. The data.frame method allows users to specify
which columns contain gene annotation and which contain counts.
If the annotation columns are not specified, the function will
check for non-numeric columns and will attempt to set the
leading columns up to the last non-numeric column as
annotation. 'y' is now a compulsory argument for DGEList().
Previously it defaulted to a matrix with zero rows and zero
columns.
o
New case study in User's Guide on a transcript-level
differential expression analysis.
o
The case study on alternative splicing in the User's Guide has
been replaced with a new data example.
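The annotation-column rule described for the data.frame method of
DGEList() in version 3.42.0 can be illustrated with a small base R
sketch. The helper guess_annotation_columns() is hypothetical and only
mirrors the rule as worded above (leading columns up to the last
non-numeric column are treated as annotation); the actual method may
apply further checks.

```r
# Hypothetical helper (not part of edgeR): guess which leading columns
# of a data.frame hold annotation rather than counts.
guess_annotation_columns <- function(df) {
  non.numeric <- which(!vapply(df, is.numeric, logical(1)))
  if (length(non.numeric) == 0) return(integer(0))
  # Everything up to and including the last non-numeric column.
  seq_len(max(non.numeric))
}

df <- data.frame(
  GeneID  = c("g1", "g2"),
  Length  = c(1200, 800),
  Symbol  = c("Abc", "Xyz"),
  Sample1 = c(10, 0),
  Sample2 = c(7, 3)
)

annot.cols <- guess_annotation_columns(df)
count.cols <- setdiff(seq_along(df), annot.cols)
counts <- as.matrix(df[, count.cols, drop = FALSE])
```

Note that the numeric Length column is classed as annotation here
because it precedes the non-numeric Symbol column.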
Changes in version 3.40.0 (2022-11-02)
o
New argument 'hairpinBeforeBarcode' for processAmplicons(). The
revised function can process reads where the
hairpins/sgRNAs/sample index sequences are in variable
positions within each read. When 'plotPositions=TRUE' a density
plot of the match positions is created to allow the user to
assess whether they occur in the expected positions.
o
Update C++ BLAS calls to account for USE_FC_LEN_T setting in R
4.3.0.
o
Bug fix to R_compute_apl.cpp to make sure GLM working weights
are zero when fitted mu=0.
Changes in version 3.38.0 (2022-04-27)
o
New argument 'keep.EList' for voomLmFit() to store the
normalized log2-CPM values and voom weights.
Changes in version 3.36.0 (2021-10-27)
o
diffSpliceDGE() now returns p-value=1 instead of NA when an
exon has all zero counts.
o
Improve error message from readDGE() when there are repeated
gene/tag names.
Changes in version 3.34.0 (2021-05-20)
o
New function featureCounts2DGEList() that converts results from
Rsubread::featureCounts() to DGELists.
o
plotMDS.DGEList (the DGEList method of plotMDS) now displays
the percentage variance explained by each dimension, and a new
argument 'var.explained' is provided to make that optional. It
no longer calls stats::cmdscale() internally and the 'ndim'
argument is removed. The "bcv" method is scheduled to be
deprecated in a future release of edgeR.
o
read10X() now counts the number of comment lines in mtx files
and skips those lines when reading in the data.
o
Fix a bug in voomLmFit() whereby zeros were sometimes
incorrectly identified due to floating point errors.
Changes in version 3.32.0 (2020-10-28)
o
cpm.default() and rpkm.default() now accept offset.
o
scaleOffset() now accepts CompressedMatrix offset and accounts
for norm.factors.
o
Revise the lowess trend fitting in voomLmFit() to downweight
genes with exact zeros and hence fewer df to estimate the
variance.
o
Add as.data.frame method for DGEList class.
o
Change default choice for refColumn in calcNormFactors() with
method="TMMwsp". The new method chooses the column with the
largest sum of sqrt-counts.
o
processAmplicons() can now accommodate data from newer screens
that use a staggered primer design.
o
Fixed a bug whereby diffSpliceDGE() accepted more than one coef.
It now gives a warning if more than one coef or contrast is
supplied and uses only the first.
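The reference-column choice described for method="TMMwsp" in version
3.32.0 (the column with the largest sum of sqrt-counts) amounts to a
one-line computation. The sketch below uses a made-up count matrix and
is not taken from the calcNormFactors() source.

```r
# Toy count matrix: genes in rows, samples in columns.
counts <- cbind(
  S1 = c(0, 0, 400),
  S2 = c(64, 64, 64),
  S3 = c(1, 1, 100)
)

# Sum of square-root counts for each column.
sqrt.sums <- colSums(sqrt(counts))

# The reference column is the one with the largest sum.
ref.column <- which.max(sqrt.sums)
```

In this toy example S2 is chosen even though S1 has the larger total
count, because the square root damps the influence of a single very
large count.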
Changes in version 3.30.2 (2020-04-28)
o
New function voomLmFit() that combines the limma voom-lmFit
pipeline with loss of residual df due to zero counts as for
glmQLFit(). The new function is more robust to zero counts than
running voom() and lmFit() separately. The new function allows
sample quality weights and intra-block correlations to be
estimated; it incorporates the functionality of
duplicateCorrelation() and voomWithQualityWeights() as well.
o
New function SE2DGEList() to convert a SummarizedExperiment
object into a DGEList object.
o
S3 methods for SummarizedExperiment objects are added to the
following functions: aveLogCPM(), calcNormFactors(), cpm(),
cpmByGroup(), estimateDisp(), filterByExpr(), glmFit(),
glmQLFit(), plotMD(), plotMDS(), predFC(), rowsum(), rpkm(),
rpkmByGroup() and sumTechReps().
o
New cpm and rpkm methods for DGEGLM and DGELRT objects.
o
New function effectiveLibSizes() to extract normalized library
sizes from an edgeR data object or fitted model object.
o
Add as.data.frame methods for DGEExact and DGELRT objects and
remove the 'optional' argument from as.data.frame.TopTags().
o
readBismark2DGE() now forces 'files' to be character vector.
o
Add warning messages when filterByExpr() is used without
specifying group or design.
o
Add warning message when calcNormFactors() is applied to
DGEList object containing an offset matrix.
o
Rewrite User's Guide Section 3.5 on Multilevel Experiments so
that the code is valid regardless of the number of subjects in
each disease group.
Changes in version 3.28.0 (2019-10-30)
o
Add head() and tail() methods for edgeR classes.
o
Remove the 'mixed.df' argument and add a 'locfit.mixed' option
to 'trend.method' in estimateDisp() and WLEB().
o
Add two new arguments 'large.n' and 'min.prop' to
filterByExpr() to allow users to change parameters previously
hard-wired.
o
Remove 'values' and 'col' arguments to plotMD.DGELRT() and
plotMD.ExactTest() as no longer needed because of changes to
plotWithHighlights().
o
roast.DGEList() and mroast.DGEList() now pass the 'nrot'
argument to roast.default().
o
Rename dglmStdResid() to plotMeanVar2().
o
getDispersions() is no longer exported.
o
Estimated dispersions are now numeric even if NA.
o
Bug fix to goana.DGELRT() and kegga.DGELRT() when the LRT was
on more than 1 df.
Changes in version 3.26.0 (2019-05-03)
o
read10X() now automatically detects file names from latest
CellRanger version.
o
glmTreat() now checks whether 'contrast' is a matrix with
multiple columns and uses first column.
o
The TMMwzp method has been renamed to TMMwsp, but calls to
method="TMMwzp" will still be respected.
calcNormFactors(object) now returns a named vector when
'object' is a matrix, with colnames(object) as the names.
o
New 'random' method for zscoreNBinom().
o
Add arguments 'log' and 'prior.count' to cpmByGroup() and
rpkmByGroup().
o
Bug fix to filterByExpr().
Changes in version 3.24.0 (2018-10-31)
o
New functions catchKallisto() and catchSalmon() to read outputs
from kallisto and Salmon and to compute overdispersion factors
for each transcript from bootstrap samples.
o
New function readBismark2DGE() to read coverage files created
by Bismark for BS-seq methylation data.
o
New method 'TMMwzp' for calcNormFactors() to better handle
samples with large proportions of zero counts.
o
The default value for prior.count increased from 0.25 to 2 in
cpm() and rpkm(). The new value is more generally useful and
agrees with the default values in aveLogCPM() and with the
DGEList method for plotMDS().
o
zscoreNBinom() now supports non-integer q values.
o
The scaleOffset() S3 methods for DGEList and default objects
are now registered in the NAMESPACE. Previously the functions
were exported but not registered as S3 methods.
o
The rowsum() method for DGEList objects (rowsum.DGEList) now
automatically removes gene annotation columns that are not
group-level.
o
More specific error messages from DGEList() when invalid (NA,
negative or infinite) count values are detected.
o
Bug fix to glmFit.default() when lib.size is specified.
o
Bug fix to column name returned by decideTestsDGE().
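The effect of the prior.count default mentioned in version 3.24.0 can
be seen in a simplified log-CPM calculation. The sketch below assumes
that the prior count is scaled in proportion to each library size
before being added, and that library sizes are plain column sums with
no normalization factors; log_cpm_sketch() is a hypothetical name and
the internal edgeR computation may differ in detail.

```r
# Hypothetical simplified log2-CPM with a library-size-scaled prior count.
log_cpm_sketch <- function(counts, prior.count = 2) {
  lib.size <- colSums(counts)
  # Scale the prior count in proportion to each library size.
  prior.scaled <- prior.count * lib.size / mean(lib.size)
  # Library sizes are increased by twice the scaled prior.
  adj.lib.size <- lib.size + 2 * prior.scaled
  # Transpose so that per-sample vectors recycle along samples.
  log2(t((t(counts) + prior.scaled) / adj.lib.size) * 1e6)
}

counts <- matrix(c(0, 10, 90,
                   0, 20, 180), ncol = 2)

logcpm.new <- log_cpm_sketch(counts, prior.count = 2)
logcpm.old <- log_cpm_sketch(counts, prior.count = 0.25)
```

Under these assumptions a zero count maps to the same log-CPM value in
every library, and the larger prior count pulls that value upwards,
which shrinks the spread of log-CPM values for low counts.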
Changes in version 3.22.0 (2018-04-27)
o
New function read10X() to read 10X Genomics files.
o
New function nearestTSS() to find the nearest transcriptional
start site (TSS) for given genomic loci.
o
New function nearestReftoX() to find the element of a reference
table that is closest to each element of an incoming vector.
o
New function modelMatrixMeth() to construct design matrices for
analysis of methylation data.
o
New function filterByExpr() to filter low expression genes or
features.
o
New rowsum method for DGEList objects.
o
nbinomUnitDeviance() now respects vectors.
o
DGEList() takes 'group' from 'samples' only if samples has a
column called group.
o
decideTestsDGE() now includes a 'label' attribute, which allows
more informative row.names for the summary results table from
decideTestsDGE() or decideTests().
o
Design now defaults to y$design for all the gene set tests.
o
More intuitive error messages from glmFit() when the arguments
are not conformal.
o
Update User's Guide to cite the Chen et al (2017) methylation
workflow.
o
Change glmTreat() default to lfc=log2(1.2).
o
Fix incorrect implementation of weights in
adjustedProfileLik().
o
Bug fix to glmLRT() when there is just one gene but multiple
contrasts.
o
Bug fix to cpmByGroup().
Changes in version 3.20.0 (2017-10-31)
o
DGEList() sets genes and counts to have same row.names.
o
topTags() preserves row.names.
o
estimateDisp() uses 'y$design' if it exists.
o
estimateDisp() doesn't use average log-CPM in the prior.df
calculation if 'trend.method' is 'none'.
o
estimateDisp() doesn't return trended.dispersion if
'trend.method' is 'none'.
o
Design matrix defaults to 'y$design' before 'y$samples$group'
in all the gene set testing functions.
o
New arg 'group' for mglmOneWay(). Results in slight speed
improvement for glmFit().
o
'design' arg for predFC() is now compulsory.
o
Switched 'coef.start' back to a vector in mglmOneGroup().
o
New functions cpmByGroup() and rpkmByGroup().
o
Renamed arg 'x' to 'y' in cpm() and rpkm().
o
Restored null dispersion check in glmFit().
o
Removed 'offset' arg from glmQLFit() to be consistent with
glmFit().
o
Exported CompressedMatrix subset operator.
o
Refactored C++ code with greater C++11 support to use Rcpp.
o
Streamlined input dimension checks in C++ code.
o
Supported zero-row input to addPriorCounts() C++ code.
o
Added cbind and rbind S3 methods for DGEList objects.
o
Added 'Dims' as part of the compressedMatrix class.
o
Added common methods for the compressedMatrix class.
o
Register S3 methods for compressedMatrix.
o
Added a case study of differential methylation analysis to the
user's guide.
Changes in version 3.18.0 (2017-04-25)
o
roast.DGEList(), mroast.DGEList(), fry.DGEList() and
camera.DGEList() now have explicit arguments instead of passing
arguments with ... to the default method.
o
New function scaleOffset() to ensure that the scale of offsets is
consistent with library sizes.
o
Added decideTests() S3 methods for DGEExact and DGELRT objects.
It now works for F-tests with multiple contrasts.
o
Report log-fold changes for redundant contrasts in F-tests with
multiple contrasts.
o
Modified plotMD() S3 method for DGELRT and DGEExact objects. It
now automatically uses decideTests() and highlights the DE
genes on the MD plot.
o
New argument 'plot' in plotMDS.DGEList().
o
Removed S3 length methods for data objects.
o
gini() now supports NA values and avoids integer overflow.
Changes in version 3.16.0 (2016-10-18)
o
estimateDisp() now respects weights in calculating the APLs.
o
Added design matrix to the output of estimateDisp().
o
glmFit() constructs design matrix, if design=NULL, from
y$samples$group.
o
New argument 'null' in glmTreat(), and a change in how p-values
are calculated by default.
o
Modified the default 'main' in plotMD().
o
Created a new S3 class, compressedMatrix, to store offsets and
weights efficiently.
o
Added the makeCompressedMatrix() function to make a
compressedMatrix object.
o
Switched storage of offsets in DGEGLM objects to use the
compressedMatrix class.
o
Added the addPriorCount() function for adding prior counts.
o
Modified spliceVariants() calculation of the average log-CPM.
o
Migrated some internal calculations and checks to C++ for
greater efficiency.
Changes in version 3.14.0 (2016-05-04)
o
estimateDisp(), estimateCommonDisp(), estimateTrendedDisp(),
estimateTagwiseDisp(), splitIntoGroups() and equalizeLibSizes()
are now S3 generic functions.
o
The default methods of estimateGLMTrendedDisp() and
estimateGLMTagwiseDisp() now return only dispersion estimates
instead of a list.
o
The DGEList methods of estimateDisp(), estimateCommonDisp() and
estimateGLMCommonDisp() now use the common dispersion estimate
to compute AveLogCPM and store it in the output.
o
Add fry method for DGEList objects.
o
Import R core packages explicitly.
o
New function gini() to compute Gini coefficients.
o
New argument poisson.bound for glmQLFTest(). If TRUE (default),
the p-value returned by glmQLFTest() will never be less than
what would be obtained for a likelihood ratio test with NB
dispersion equal to zero.
o
New argument samples for DGEList(). It takes a data frame
containing information for each sample.
o
glmFit() now protects against zero library sizes and infinite
offset values.
o
glmQLFit.default() now avoids passing a NULL design to
.residDF().
o
cpm.default() now outputs a matrix of the same dimensions as
the input even when the input has 0 row or 0 column.
o
DGEList() pops up a warning message when zero lib.size is
detected.
o
Bug fix to calcNormFactors(method="TMM") when two libraries
have identical counts but the lib.sizes have been set unequal.
o
Add a CRISPR-Cas9 screen case study to the users' guide and
rename Nigerian case study to Yoruba.
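The poisson.bound behaviour described for glmQLFTest() in version
3.14.0 can be pictured as taking the larger of two p-values for each
gene. The sketch below uses made-up p-values and a hypothetical helper
name; how edgeR obtains the zero-dispersion likelihood ratio p-value is
not shown here.

```r
# Hypothetical helper (not part of edgeR): enforce the lower bound on
# quasi-likelihood p-values described above.
apply_poisson_bound <- function(p.ql, p.poisson) {
  # Elementwise maximum: the QL p-value is never allowed to fall below
  # the p-value from the LRT with NB dispersion equal to zero.
  pmax(p.ql, p.poisson)
}

p.ql      <- c(1e-8, 0.02, 0.5)   # illustrative QL F-test p-values
p.poisson <- c(1e-5, 0.001, 0.5)  # illustrative zero-dispersion LRT p-values

p.bounded <- apply_poisson_bound(p.ql, p.poisson)
```

Only the first gene is affected in this example, because its
quasi-likelihood p-value is smaller than the bound.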
Changes in version 3.12.0 (2015-10-14)
o
New argument tagwise for estimateDisp(), allowing users to
optionally skip estimation of tagwise dispersions, estimating
common and trended dispersions only.
o
estimateTrendedDisp() has more stable performance and does not
return negative trended dispersion estimates.
o
New plotMD() methods for DGEList, DGEGLM, DGEExact and DGELRT
objects to make a mean-difference plot (aka MA plot).
o
readDGE() now recognizes HTSeq-style meta genes.
o
Remove the F-test option from glmLRT().
o
New argument contrast for diffSpliceDGE(), allowing users to
specify the testing contrast.
o
glmTreat() returns both logFC and unshrunk.logFC in the output
table.
o
New method implemented in glmTreat() to increase the power of
the test.
o
New kegga() methods for DGEExact and DGELRT objects to perform
KEGG pathway analysis of differentially expressed genes using
Entrez Gene IDs.
o
New dimnames<- methods for DGEExact and DGELRT objects.
o
glmFit() and glmQLFit() will now accept a matrix of dispersion
values, i.e., a potentially different dispersion for each
observation.
o
Bug fix to dimnames<- method for DGEGLM objects.
o
User's Guide updated. Three old case studies are replaced by
two new comprehensive case studies.
Changes in version 3.10.0 (2015-04-17)
o
A DGEList method for romer() has been added, allowing access
to rotation gene set enrichment analysis.
o
New function dropEmptyLevels() to remove unused levels from a
factor.
o
New argument p.value for topTags(), allowing users to apply a
p-value or FDR cutoff for the results.
o
New argument prior.count for aveLogCPM().
o
New argument pch for the plotMDS method for DGEList objects.
Old argument col is now removed, but can be passed using ....
Various other improvements to the plotMDS method for DGEList
objects, better labelling of the axes and protection against
degenerate dimensions.
o
treatDGE() renamed to glmTreat() and now works with either
likelihood ratio tests or with quasi-likelihood F-tests.
o
glmQLFit() is now an S3 generic function.
o
glmQLFit() now breaks the output component s2.fit into three
separate components: df.prior, var.post and var.prior.
o
estimateDisp() now protects against fitted values of zeros,
giving more accurate dispersion estimates.
o
DGEList() now gives a message rather than an error when the
count matrix has non-unique column names.
o
Minor corrections to User's Guide.
o
requireNamespace() is now used internally instead of require()
to access functions in suggested packages.
Changes in version 3.8.0 (2014-10-14)
o
New goana() methods for DGEExact and DGELRT objects to perform
Gene Ontology analysis of differentially expressed genes using
Entrez Gene IDs.
o
New functions diffSpliceDGE(), topSpliceDGE() and
plotSpliceDGE() for detecting differential exon usage and
displaying results.
o
New function treatDGE() that tests for DE relative to a
specified log2-FC threshold.
o
glmQLFTest() is split into three functions: glmQLFit() for
fitting quasi-likelihood GLMs, glmQLFTest() for performing
quasi-likelihood F-tests and plotQLDisp() for plotting
quasi-likelihood dispersions.
o
processHairpinReads() renamed to processAmplicons() and allows
for paired end data.
o
glmFit() now stores unshrunk.coefficients from prior.count=0 as
well as shrunk coefficients.
o
estimateDisp() now has a min.row.sum argument to protect
against all zero counts.
o
APL calculations in estimateDisp() are hot-started using fitted
values from previous dispersions, to avoid discontinuous APL
landscapes.
o
adjustedProfileLik() is modified to accept starting
coefficients. glmFit() now passes starting coefficients to
mglmOneGroup().
o
calcNormFactors() is now an S3 generic function.
o
The SAGE datasets from Zhang et al (1997) are no longer
included with the edgeR package.
Changes in version 3.6.0 (2014-04-12)
o
Improved treatment of fractional counts. Previously the classic
edgeR pipeline permitted fractional counts but the glm pipeline
did not. edgeR now permits fractional counts throughout.
o
All glm-based functions in edgeR now accept quantitative
observation-level weights. The glm fitting functions mglmLS()
and mglmSimple() are retired, and all glm fitting is now done
by either mglmLevenberg() or mglmOneWay().
o
New capabilities for robust estimation allowing for
observation-level outliers. In particular, the new function
estimateGLMRobustDisp() computes a robust dispersion estimate
for each gene.
o
More careful calculation of residual df in the presence of
exactly zero fitted values for glmQLFTest() and estimateDisp().
The new code allows for deflation of residual df for more
complex experimental designs.
o
New function processHairpinReads() for analyzing data from
shRNA-seq screens.
o
New function sumTechReps() to collapse counts over technical
replicate libraries.
o
New functions nbinomDeviance() and nbinomUnitDeviance(). Old
function deviances.function() removed.
o
New function validDGEList().
o
rpkm() is now a generic function, and it now tries to find the
gene lengths automatically if available from the annotation
information in a DGEList object.
o
Subsetting a DGEList object now has the option of resetting
the library sizes to the new column sums. Internally, the
subsetting code for DGEList, DGEExact, DGEGLM, DGELRT and
TopTags data objects has been simplified using the new utility
function subsetListOfArrays in the limma package.
o
To strengthen the interface and to strengthen the
object-orientated nature of the functions, the DGEList methods
for estimateDisp(), estimateGLMCommonDisp(),
estimateGLMTrendedDisp() and estimateGLMTagwiseDisp() no longer
accept offset, weights or AveLogCPM as arguments. These
quantities are now always taken from the DGEList object.
o
The User's Guide has new sections on read alignment, producing
a table of counts, and on how to translate scientific questions
into contrasts when using a glm.
o
camera.DGEList(), roast.DGEList() and mroast.DGEList() now
include ... argument.
o
The main computation of exactTestByDeviance() now implemented
in C++ code.
o
The big.count argument has been removed from functions
exactTestByDeviance() and exactTestBySmallP().
o
New default value for offset in dispCoxReid.
o
More tolerant error checking for dispersion value when
computing aveLogCPM().
o
aveLogCPM() now returns a value even when all the counts are
zero.
o
The functions is.fullrank and nonEstimable are now imported
from limma.
Changes in version 3.4.0 (2013-10-15)
o
estimateDisp() now creates the design matrix correctly when the
design matrix is not given as an argument and there is only one
group. Previously this case gave an error.
o
plotMDS.DGEList now gives a friendly error message when there
are fewer than 3 data columns.
o
Updates to DGEList() so that arguments lib.size, group and
norm.factors are now set to their defaults in the function
definition rather than set to NULL. However NULL is still
accepted as a possible value for these arguments in the
function call, in which case the default value is used as if
the argument was missing.
o
Refinement to cutWithMinN() to make the bin numbers more equal
in the worst case. Also a bug fix so that cutWithMinN() does
not fail even when there are many repeated x values.
o
Refinement to computation for nbins in dispBinTrend. Now
changes more smoothly with the number of genes. trace argument
is retired.
o
Updates to help pages for the data classes.
o
Fixes to calcNormFactors with method="TMM" so that it takes
account of lib.size and refCol if these are preset.
o
Bug fix to glmQLFTest when plot=TRUE but abundance.trend=FALSE.
o
predFC() with design=NULL now uses normalization factors
correctly. However this use of predFC() to compute counts per
million is being phased out in favour of cpm().
Changes in version 3.2.0 (2013-04-04)
o
The User's Guide has a new section on between and within
subject designs and a new case study on RNA-seq profiling of
unrelated Nigerian individuals. Section 2.9 (item 2) now gives
a code example of how to pre-specify the dispersion value.
o
New functions estimateDisp() and WLEB() to automate the
estimation of common, trended and tagwise dispersions. The
function estimateDisp() provides a simpler alternative pipeline
and in principle replaces all the other dispersion estimation
functions, for both glms and for classic edgeR. It can also
incorporate automatic estimation of the prior degrees of
freedom, and can do this in a robust fashion.
o
glmLRT() now permits the contrast argument to be a matrix with
multiple columns, making the treatment of this argument
analogous to that of the coef argument.
o
glmLRT() now has a new F-test option. This option takes into
account the uncertainty with which the dispersion is estimated
and is more conservative than the default chi-square test.
o
glmQLFTest() has a number of important improvements. It now has
a simpler alternative calling sequence: it can take either a
fitted model object as before, or it can take a DGEList object
and design matrix and do the model fit itself. If provided with
a fitted model object, it now checks whether the dispersion is
of a suitable type (common or trended). It now optionally
produces a plot of the raw and shrunk residual mean deviances
versus AveLogCPM. It now has the option of robustifying the
empirical Bayes step. It now has a more careful calculation of
residual df that takes special account of cases where all
replicates in a group are identically zero.
o
The gene set test functions roast(), mroast() and camera() now
have methods defined for DGEList data objects. This facilitates
gene set testing and pathway analysis of expression profiles
within edgeR.
o
The default method of plotMDS() for DGEList objects has
changed. The new default forms log-counts-per-million and
computes Euclidean distances. The old method based on
BCV-distances is available by setting method="BCV". The
annotation of the plot axes has been improved so that the
distance method used is apparent from the plot.
o
The argument prior.count.total used for shrinking
log-fold-changes has been changed to prior.count in various
functions throughout the package, and now refers to the average
prior.count per observation rather than the total prior count
across a transcript. The treatment of prior.counts has also
been changed very slightly in cpm() when log=TRUE.
o
New function aveLogCPM() to compute the average log count per
million for each transcript across all libraries. This is now
used by all functions in the package to set AveLogCPM, which is
now the standard measure of abundance. The value for AveLogCPM
is now computed just once, and not updated when the dispersion
is estimated or when a linear model is fitted. glmFit() now
preserves the AveLogCPM vector found in the DGEList object
rather than recomputing it. The use of the old abundance
measure is being phased out.
o
The glm dispersion estimation functions are now much faster.
o
New function rpkm() to compute reads per kilobase per million
(RPKM).
o
New option method="none" for calcNormFactors().
o
The default span used by dispBinTrend() has been reduced.
o
Various improvements to internal C++ code.
o
Functions binCMLDispersion() and bin.dispersion() have been
removed as obsolete.
o
Bug fix to subsetting for DGEGLM objects.
o
Bug fix to plotMDS.DGEList to make consistent use of
norm.factors.
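The quantity computed by the rpkm() function introduced in version
3.2.0 (reads per kilobase per million) can be written out in a few
lines of base R. The sketch below is a simplified illustration with a
hypothetical function name; it assumes library sizes are plain column
sums and ignores normalization factors, log transformation and prior
counts.

```r
# Hypothetical simplified RPKM: counts scaled by library size (in
# millions) and by gene length (in kilobases).
rpkm_sketch <- function(counts, gene.length, lib.size = colSums(counts)) {
  # Counts per million: divide each column by its library size.
  cpm <- t(t(counts) / lib.size) * 1e6
  # Divide each row by its gene length in kilobases.
  cpm / (gene.length / 1000)
}

counts <- matrix(c(100, 900,
                   200, 1800), ncol = 2)
gene.length <- c(2000, 500)   # gene lengths in bases

rpkm.values <- rpkm_sketch(counts, gene.length)
```

Both samples give the same RPKM values here because the second library
is exactly twice the size of the first.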
Changes in version 3.0.0 (2012-10-02)
o
New chapter in the User's Guide covering a number of common
types of experimental designs, including multiple groups,
multiple factors and additive models. New sections in the
User's Guide on clustering and on making tables of read counts.
Many other updates to the User's Guide and to the help pages.
o
New function edgeRUsersGuide() to open the User's Guide in a
pdf viewer.
o
Many functions have been made faster by rewriting the core
computations in C++. This includes adjustedProfileLik(),
mglmLevenberg(), maximizeInterpolant() and goodTuring().
o
New argument verbose for estimateCommonDisp() and
estimateGLMCommonDisp().
o
The trended dispersion methods based on binning and
interpolation have been rewritten to give more stable results
when the number of genes is not large.
o
The amount by which the tagwise dispersion estimates are
squeezed towards the global value is now specified in
estimateTagwiseDisp(), estimateGLMTagwiseDisp() and
dispCoxReidInterpolateTagwise() by specifying the prior degrees
of freedom prior.df instead of the prior number of samples
prior.n.
o
The weighted likelihood empirical Bayes code has been
simplified or developed in a number of ways. The old functions
weightedComLik() and weightedComLikMA() are now removed as no
longer required.
o
The functions estimateSmoothing() and approx.expected.info()
have been removed as no longer recommended.
o
The span used by estimateGLMTagwiseDisp() is now chosen by
default as a decreasing function of the number of tags in the
dataset.
o
New method "loess" for the trend argument of
estimateTagwiseDisp, with "tricube" now treated as a synonym.
o
New functions loessByCol() and locfitByCol() for smoothing
columns of matrix by non-robust loess curves. These functions
are used in the weighted likelihood empirical Bayes procedures
to compute local common likelihood.
o
glmFit now shrinks the estimated fold-changes towards zero. The
default shrinkage is as for exactTest().
o
predFC output is now on the natural log scale instead of log2.
o
mglmLevenberg() is now the default glm fitting algorithm,
avoiding the occasional errors that occurred previously with
mglmLS().
o
The arguments of glmLRT() and glmQLFTest() have been simplified
so that the argument y, previously the first argument of
glmLRT, is no longer required.
o
glmQLFTest() now ensures that no p-value is smaller than what
would be obtained by treating the likelihood ratio test
statistic as chisquare.
o
glmQLFTest() now treats tags with all zero counts in replicate
arrays as having zero residual df.
o
gof() now optionally produces a qq-plot of the genewise
goodness of fit statistics.
o
Argument null.hypothesis removed from equalizeLibSizes().
o
DGEList no longer outputs a component called all.zeros.
o
goodTuring() no longer produces a plot. Instead there is a new
function goodTuringPlot() for plotting log-probability versus
log-frequency. goodTuring() has a new argument 'conf' giving
the confidence factor for the linear regression approximation.
o
Added plot.it argument to maPlot().
Changes in version 2.6.0 (2012-03-31)
o
edgeR now depends on limma.
o
Considerable work on the User's Guide. New case study added on
pathogen-inoculated Arabidopsis illustrating a two-group
comparison with batch effects. All the other case studies have
been updated and streamlined. New section explaining why
adjustments for GC content and mappability are not necessary in
a differential expression context.
o
New and more intuitive column headings for topTags() output.
'logFC' is now the first column. Log-concentration is now
replaced by log-counts-per-million ('logCPM'). 'PValue'
replaces 'P.Value'. These column headings are now inserted in
the table of results by exactTest() and glmLRT() instead of
being modified by the show method for the TopTags object
generated by topTags(). This means that the column names will
be correct even when users access the fitted model objects
directly instead of using the show method.
o
plotSmear() and plotMeanVar() now use logCPM instead of
logConc.
o
New function glmQLFTest() provides quasi-likelihood hypothesis
testing using F-tests, as an alternative to likelihood ratio
tests using the chisquare distribution.
o
New functions normalizeChIPtoInput() and
calcNormOffsetsforChIP() for normalization of ChIP-Seq counts
relative to input control.
o
New capabilities for formal shrinkage of the logFC. exactTest()
now incorporates formal shrinkage of the logFC, controlled by
argument 'prior.count.total'. predFC() provides similar
shrinkage capability for glms.
o
estimateCommonDisp() and estimateGLMCommonDisp() now set the
dispersion to NA when there is no replication, instead of
setting the dispersion to zero. This means that users will need
to set a dispersion value explicitly to use functions further
down the analysis pipeline.
o
New function estimateTrendedDisp() analogous to
estimateGLMTrendedDisp() but for classic edgeR.
o
The algorithm implemented in estimateTagwiseDisp() now uses
fewer grid points but interpolates, similar to
estimateGLMTagwiseDisp().
o
The power trend fitted by dispCoxReidPowerTrend() now includes
a positive asymptote. This greatly improves the fit on real
data sets. This now becomes the default method for
estimateGLMTrendedDisp() when the number of genes is less than
200.
o
New user-friendly function plotBCV() displays estimated
dispersions.
o
New argument target.size for thinCounts().
o
New utility functions getDispersion() and zscoreNBinom().
o
dimnames() methods for DGEExact, DGELRT and TopTags classes.
o
Function pooledVar() removed as no longer necessary.
o
Minor fixes to various functions to ensure correct results in
special cases.
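As a rough illustration of the logCPM scale mentioned above, the
following Python sketch converts raw counts to counts per million
and to log2 counts per million. It is not edgeR code: the function
names and the `prior_count` offset are assumptions made for
illustration, and edgeR's own computation of logCPM may differ in
detail.

```python
import math

def cpm(counts, lib_size):
    """Counts per million for one library (illustrative helper)."""
    return [c * 1e6 / lib_size for c in counts]

def log_cpm(counts, lib_size, prior_count=0.5):
    """log2 counts per million.

    prior_count is a hypothetical offset added to each count so that
    zero counts do not produce log(0).
    """
    return [math.log2((c + prior_count) * 1e6 / lib_size) for c in counts]

counts = [10, 0, 990]          # made-up counts for three genes
lib_size = sum(counts)         # library size taken as the column total
print(cpm(counts, lib_size))
print(log_cpm(counts, lib_size))
```

Unlike a raw concentration, the CPM scale does not depend on the
sequencing depth of the library, which is what makes it convenient
as a column heading in result tables.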
Changes in version 2.4.0 (2011-11-01)
o New function spliceVariants() for detecting alternative exon
usage from exon-level count data.
o A choice of rejection regions is now implemented for
exactTest(), and the default is changed from one based on small
probabilities to one based on doubling the smaller of the tail
probabilities. This gives better results than the original
conditional test when the dispersion is large (especially > 1).
A Beta distribution approximation to the tail probability is
also implemented when the counts are large, making exactTest()
much faster and less memory hungry.
o estimateTagwiseDisp() now includes an abundance trend on the
dispersions by default.
o exactTest() now uses tagwise.dispersion by default if found in
the object.
o estimateCRDisp() is removed. It is now replaced by
estimateGLMCommonDisp(), estimateGLMTrendedDisp() and
estimateGLMTagwiseDisp().
o glmFit() now automatically detects dispersion estimates if
present in the data object. It uses tagwise dispersions if
available, then trended, then common.
o New function getPriorN() calculates the weight given to the
common parameter likelihood in order to smooth (or stabilize)
the dispersion estimates. Used as default for
estimateTagwiseDisp() and estimateGLMTagwiseDisp().
o New function cutWithMinN() used in binning methods.
o glmFit() is now an S3 generic function and has a new 'method'
argument specifying the fitting algorithm.
o DGEGLM objects now subsettable.
o plotMDS.dge() is retired, instead a DGEList method is now
defined for plotMDS() in the limma package. One advantage is
that the plot can be repeated with different graphical
parameters without recomputing the distances. The MDS method
is also now much faster.
o Add as.data.frame method for TopTags objects.
o New function cpm() to calculate counts per million
conveniently.
o New arguments for dispCoxReidInterpolateTagwise() to give more
access to tuning parameters.
o estimateGLMTagwiseDisp() now uses trended.dispersion by default
if trended.dispersion is found.
o Change to glmLRT() to ensure character coefficient argument
will work.
o Change to maPlot() so that any really extreme logFCs are
brought back to a more reasonable scale.
o estimateGLMCommonDisp() now returns NA when there are no
residual df rather than returning dispersion of zero.
o The trend computation of the local common likelihood in
dispCoxReidInterpolateTagwise() is now based on moving averages
rather than lowess.
o Changes to binGLMDispersion() to allow trended dispersion for
data sets with small numbers of genes, but with extra warnings.
o dispDeviance() and dispPearson() now give graceful estimates
and messages when the dispersion is outside the specified
interval.
o Bug fix to mglmOneWay(), which was confusing parametrizations
when the design matrix included negative values.
o mglmOneWay() (and hence glmFit) no longer produces NA
coefficients when some of the fitted values were exactly zero.
o Bug fix to offset behaviour in estimateGLMCommonDisp(),
estimateGLMTrendedDisp() and estimateGLMTagwiseDisp(). Several
other functions were also changed to fix bugs when computing
dispersions in data sets with genes that have all zero counts.
o Bug fix to mglmSimple() with matrix offset.
o Bug fix to adjustedProfLik() when there are fitted values
exactly at zero for one or more groups.
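The rejection region based on doubling the smaller of the tail
probabilities, described above for exactTest(), can be sketched in
Python as follows. This is a minimal illustration under a
simplifying assumption: a binomial null distribution stands in for
the negative binomial conditional distribution that exactTest()
works with, and all function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def doubletail_pvalue(x, n, p=0.5):
    """Two-sided p-value: twice the smaller tail probability, capped at 1."""
    lower = sum(binom_pmf(k, n, p) for k in range(0, x + 1))   # P(X <= x)
    upper = sum(binom_pmf(k, n, p) for k in range(x, n + 1))   # P(X >= x)
    return min(1.0, 2.0 * min(lower, upper))

def smallp_pvalue(x, n, p=0.5):
    """Two-sided p-value: total probability of outcomes no more likely than x."""
    px = binom_pmf(x, n, p)
    # Small relative tolerance so ties are not lost to rounding error.
    total = sum(q for q in (binom_pmf(k, n, p) for k in range(n + 1))
                if q <= px * (1 + 1e-9))
    return min(1.0, total)

# Compare the two rejection regions on an asymmetric null distribution.
print(doubletail_pvalue(0, 10, 0.3), smallp_pvalue(0, 10, 0.3))
```

For a symmetric null distribution the two definitions agree. They
differ when the distribution is skewed, which is the situation the
changed default is meant to address.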
Changes in version 2.2.0 (2011-04-14)
o
Release of generalized linear model pipeline.
o
Documented topics and functions: adjustedProfileLik,
approx.expected.info, as.matrix.DGEList, betaApproxNBTest,
binCMLDispersion, binGLMDispersion, binomTest, calcNormFactors,
commonCondLogLikDerDelta, condLogLikDerDelta,
condLogLikDerSize, decideTestsDGE, "DGEExact-class",
"show,DGEExact-method", "DGEGLM-class", "show,DGEGLM-method",
"DGEList-class", DGEList, "DGELRT-class", "show,DGELRT-method",
dglmStdResid, getDispersions, dim.DGEList, dim.DGEExact,
dim.TopTags, dim.DGEGLM, dim.DGELRT, length.DGEList,
length.DGEExact, length.TopTags, length.DGEGLM, length.DGELRT,
dimnames.DGEList, "dimnames<-.DGEList", dispBinTrend,
dispCoxReid, dispDeviance, dispPearson,
dispCoxReidInterpolateTagwise, dispCoxReidSplineTrend,
dispCoxReidPowerTrend, edgeR, "edgeR-package",
equalizeLibSizes, estimateCommonDisp, estimateCRDisp,
estimateGLMCommonDisp, estimateGLMCommonDisp.DGEList,
estimateGLMCommonDisp.default, estimateGLMTagwiseDisp,
estimateGLMTagwiseDisp.DGEList, estimateGLMTagwiseDisp.default,
estimateGLMTrendedDisp, estimateGLMTrendedDisp.DGEList,
estimateGLMTrendedDisp.default, estimatePs, estimateSmoothing,
estimateTagwiseDisp, exactTest, exactTest.matrix,
expandAsMatrix, getCounts, getOffsets, glmFit, glmLRT, gof,
goodTuring, goodTuringProportions, logLikDerP, maPlot,
maximizeInterpolant, binMeanVar, pooledVar, plotMeanVar, mglm,
mglmSimple, mglmLS, mglmOneGroup, mglmOneWay, mglmLevenberg,
deviances.function, designAsFactor, movingAverageByCol,
plotMDS.dge, plotSmear, q2qpois, q2qnbinom,
readDGE, splitIntoGroups, splitIntoGroupsPseudo, subsetting,
"[.DGEList", "[.DGEExact", "[.DGELRT", systematicSubset,
thinCounts, topTags, "TopTags-class", "show,TopTags-method",
"[.TopTags", Tu102, Tu98, NC1, NC2, weightedComLik,
weightedComLikMA, weightedCondLogLikDerDelta.
Changes in version 1.8.0 (2010-10-18)
o
Improvements to classic pipeline.
o
Introduction of generalized linear model pipeline.
Changes in version 1.6.0 (2010-03-12)
o
Improvements to classic pipeline.
Changes in version 1.4.0 (2009-10-28)
o
Improvements to classic pipeline.
Changes in version 1.2.0 (2009-04-21)
o
Improvements to classic pipeline.
Changes in version 1.0.0 (2008-10-29)
o
Initial release of classic pipeline.
o
Documented topics and functions: alpha.approxeb,
approx.expected.info, condLogLikDerDelta, condLogLikDerSize,
deDGE, "deDGEList-class", "show,deDGEList-method",
"DGEList-class", DGEList, "show,DGEList-method", "EBList-class",
"show,EBList-method", estimatePs, exactTestNB, findMaxD2,
getData, interpolateHelper, logLikDerP, plotMA,
"plotMA,deDGEList-method", quantileAdjust, readDGE,
tau2.0.objective, topTags.