Changes to version 1.4.1 (2024-01-05)
+ SpliceWiz is now published in Briefings in Bioinformatics! Citation updated.
+ buildRef now works on custom GTFs where there are incomplete transcript annotation columns.
+ STAR >= 2.7.8a should no longer throw error about trimming adaptors
+ Added missing include for cstdint
+ edgeR useQL = TRUE now uses legacy method (as per Lun & Smyth 2017)
+ Minor bugfixes

Changes to version 1.3.5 (2023-10-22)
+ Alignment coverage output (COV) files are not identical between linux arm64 and other platforms. Unit tests pertaining to COV file reproducibility are disabled until a fix for linux arm64 is implemented. All other platforms (including Mac arm64) are still supported.

Changes to version 1.3.4 (2023-10-21)
+ Fix & Improvement in algorithm for removal of overlapping IR events
+ Bugfix: scientific notation generated when writing event coordinates
+ Bugfix: fix various errors due to low or zero non-IR splicing events (e.g. S. cerevisiae)
+ Bugfix: throws informative error if gene_id are found for different chromosomes (as may occur for Gencode gff3 files v43 and prior)
+ Bugfix: DNA translation into peptides is skipped when sequence contains all N's (which can occur on chromosome Y paralogous genes in Gencode references)
+ Bugfix: gene ontology functions now compatible for Gencode references

Changes to version 1.3.3 (2023-07-27)
+ Bugfix: Fixed compatibility for GTF references with CDS entries but no ccds_id column

Changes in version 1.3.2 (2023-06-05)
+ Bugfix: error when running featureCounts wrapper due to non-numeric assignment of single / paired end reads

Changes in version 1.3.1 (2023-05-20)
+ Bugfix: fixed - static plot coverages did not show when reverseGenomeCoords = TRUE
+ Bugfix: error when running featureCounts with overwrite = TRUE

Changes in version 1.1.8 (2023-04-17)
+ Users will no longer need to specify separate folders for processBAM and collateData output.
  Instead, using the GUI, processBAM will output to the `pbOutput` subdirectory inside the specified NxtSE folder
+ Buttons in the Experiment creation and loading interfaces have been streamlined
+ Users can select and de-select events using lasso / box / click select tools in volcano and scatter plots (previously, the lasso and box tools could only select; de-selection was possible only by clicking)
+ A unified event filtering interface has been implemented for all visualizations
+ A new system for creating coverage plots has been implemented. Coverage plots are now created in a 3-step process: getCoverageData (fetches the coverage data), getPlotObject (customizes data for ASE event normalization and per condition), and plotView (generates the plot). This system makes it easier to refine plots without having to fetch data from disk every time.
+ All GUI visualizations can now be exported directly as pdf files
+ Added internal functions to NxtSE object - row_gr() fetches EventRegion GRanges for each ASE

Changes in version 1.1.7 (2023-03-27)
+ Bugfix: the t-test track now plots any non-finite p-values as zero (these arise when all normalized coverages are the same value)
+ Bugfix: BAM2COV uses correct number of threads now
+ Bugfix: fixed duplicate junc_* elements on rbind of NxtSE objects
+ Feature: Add abs_deltaPSI as a column in differential analysis result (absolute value of delta-PSI)
+ Feature: GO interactive plot now displays more information
+ Feature: faster retrieval of makeMatrix and makeMeanPSI functions
+ Feature: makeSE() now gives more verbose loading information
+ Feature: improved performance in getCoverageBins()
+ Feature: faster retrieval from Ensembl FTP (using rvest instead of XML)
+ Feature: slight performance optimization of plotCoverage

Changes in version 1.1.6 (2023-02-24)
+ Gene ontology analysis is available!
  This is implemented via a wrapper to fgsea's fora() function (over-representation analysis)
+ plotCoverage improved - now exons are plotted at higher resolution, and can be plotted in isolation (i.e., by removing intronic regions) using static plots (via as_ggplot_cov()). Further plotCoverage improvements:
  + better hover-info for plotly-based events
  + unstranded coverage now displays unstranded junction counts
  + fixed display of novel transcripts to only display those that are supported by junction counts in the display data. All annotated transcripts are still shown
  + other miscellaneous bugfixes
+ collateData's output is improved. Temporary output files are removed and the reference is compressed, allowing for a lower storage footprint. This facilitates file transfer among collaborators. Additionally, COV files can be copied into the NxtSE folder for file-transfer purposes
+ Novel splicing - (optionally) tandem junctions can now be extrapolated from the data. Given known exons and observed junctions, "putative tandem junctions" are included among observed tandem junctions during novel splicing event generation. This allows for better identification of novel events, especially novel cassette exon skipping.
+ Introduced StrictAltSS filter - this removes A5SS/A3SS events for which the two alternate splice sites are separated by an observed intron.
+ Integrated GO analysis into GUI. Heatmaps and event lists in COV can now be subsetted by top gene ontology categories. GO analysis must be performed before this option becomes available.
+ Incompatibilities with prior versions:
  + buildRef in 1.1.6 now generates gene ontology annotation.
  + collateData output is incompatible with prior versions
  + processBAM output remains largely unchanged compared with 1.1.5
  + NxtSE objects are incompatible with those of prior versions

Changes in version 1.1.5 (2022-12-20)
+ Fix vignettes not building

Changes in version 1.1.3 (2022-12-18)
+ Improved performance of SpliceWiz processBAM() in multi-sample setting
+ Fixed memory leak in processBAM and BAM2COV functions
+ Added edgeR-based differential ASE wrappers, including ability to construct custom model matrices to model complex experimental designs.
+ Overhauled STAR wrappers, added functions to allow STAR genome reference to be generated (without GTF). A temporary STAR genome can be subsequently derived by supplying a SpliceWiz reference containing the requisite GTF file.
+ Included more tandem junctions into novel splicing reference (will find more novel splicing events compared with versions <=1.1.2)
+ collateData's lowMemoryMode will now cap usage to 4 threads (instead of 1), which is expected to limit RAM usage to ~ 16-20 GB, depending on genome size and whether novel splicing mode is on/off.
  To use even less memory, consider
+ Slightly improved runtimes of buildRef and collateData functions
+ collateData is now single-threaded on Windows (as MulticoreParam is not available)

Changes in version 1.1.2 (2022-11-08)
+ Implemented time series analysis in limma using splines
+ Reduced loading time of makeSE's overlapping intron removal
+ Optimised H5 database chunking to speed up data loading times
+ Added installation instructions for SpliceWiz using conda environment
+ Bugfix: resolved mismatched chromosome issues in collateData
+ Bugfix: fixed novel splice counts filtering
+ Bugfix: Depth calculation in collateData fixed to properly reflect maximum splicing across junction
+ Bugfix: Fixed plotCoverage / plotGenome by coordinates

Changes in version 0.99.6 (2022-10-27)
+ Ribbons in group-normalized coverage plots can now be customized to show standard deviation (default), 95% confidence interval, standard error of the mean, or none.
+ ASE_satuRn analysis now issues a warning if rows with zero counts are detected

Changes in version 0.99.5 (2022-10-26)
+ Major feature: Junctions (split read counts and PSIs) are now visualized in coverage plots! Individual tracks are now marked by sashimi-plot style arcs labeled with their corresponding split-read counts. Group-wise coverage tracks are similarly annotated, but each junction (splice donor-acceptor) is labeled with a provisional PSI. These are calculated based on the proportion of total split reads that utilise either exon group from which the split read arises. This is enabled by setting plotJunctions = TRUE in the plotCoverage() function.
+ New Feature: Users can use the satuRn package to perform quasi-binomial modelling of ASE counts.
+ New Feature: Reduced false positive novel ASEs. This was achieved by filtering novel junctions by their abundance, allowing removal of low abundance novel junctions that may arise from mis-alignment or read errors.
  Also, the previously hard-coded requirement that one splice site of a novel junction be annotated is now optional (and in our view now unnecessary). Filter parameters are set within the collateData() function.
+ New Feature: Intron retention analysis now provides EITHER IR-ratio OR PSI. Users can select this using the "IRmode" parameter from within the ASE-methods class of functions. Choices are "all" (all introns, IR-ratio), "annotated" (annotated introns, IR-ratio), or "annotated_binary" (annotated introns, PSI).
+ Performance improvement: Row and column names have been removed from all assays stored in H5 database. This substantially improves loading and performance of NxtSE objects utilising on-disk (un-realized) memory.
+ NxtSE object now has 3 getter functions junc_PSI(), junc_counts() and junc_gr() for getting (prelim) PSIs, counts and GRanges of junction counts. The first two are DelayedMatrices linked using the makeSE() function. These matrices are never subsetted by row but only by column. Furthermore, they are only realized as in-memory matrices upon calling realize_NxtSE() and setting `includeJunctions = TRUE`
+ Minor improvement: plotCoverage() now plots the correct strand. Note that datasets compiled from a mix of different strandedness protocols are not supported, as samples are assumed to be sequenced using the same protocol.
+ Bugfix: broken annotation of ALEs for novel splice reference now fixed.
+ Bugfix: ASE_DESeq2: previously, inclusion / exclusion counts were normalised prior to differential analysis using DESeq2.
  This is now fixed by specifying sizeFactors() <- 1
+ Bugfix: NMD data for introns - some 3'-UTR introns were mislabeled as CDS introns - now fixed
+ Bugfix: fixed errors in ALE reference generation for novel reference

Changes in version 0.99.3 (2022-08-24)
+ Added support for novel splicing
  + This includes modifying the buildRef and (C++-internal) processBAM functions to annotate and count tandem junctions, and collateData function to compile SpliceWiz novel-ASE reference on-the-fly
+ Added getter functions to view the SpliceWiz reference
+ Other misc bug fixes

Changes in version 0.99.2 (2022-07-23)
+ Added wrapper functions to use the STAR aligner to process fastq files
+ Added "SpliceWiz cookbook" vignette to illustrate functions involving "real world" genomes / transcriptomes
+ Fixed bug that triggers error in ASE_limma() when two genes with different gene_id have the same gene_name in the annotation. References built from prior versions of SpliceWiz are not supported!
+ Bugfix for translating protein codons when source genome contains N's

Changes in version 0.99.0 (2022-06-02)
+ Bioconductor Submission