Interactive Differential Expression Analysis Tool
Differential expression (DE) analysis has become an increasingly popular way to identify and view up- and/or down-regulated genes between two sets of samples. The goal of differential gene expression analysis is to find genes or transcripts whose difference in expression, when accounting for the variance within each condition, is higher than expected by chance. DESeq2 is an R package available via Bioconductor that is designed to normalize count data from high-throughput sequencing assays such as RNA-Seq and to test for differential expression (Love et al., 2014). With multiple parameters such as adjusted p-values (padj), log fold changes, and plot styles, altering plots created from your DE data can be a hassle as well as time consuming. The Differential Expression Browser (DEBrowser) uses DESeq2 (Love et al., 2014), edgeR (Robinson et al., 2010), and limma (Ritchie et al., 2015) coupled with Shiny (Chang et al., 2016) to update your plot queries in real time and to allow interactive browsing of your DE results. In addition to DE analysis, DEBrowser offers a variety of other plots and analysis tools to help you visualize your data further.
DEBrowser utilizes Shiny, an R-based application development tool that combines an interactive user interface (UI) with the computing power of R. After the user has selected the data to analyze and has used the Shiny UI to run the DE analysis, the results are passed to DEBrowser. DEBrowser prepares the results for interactive plotting, so that changing the padj or fold change limits also changes the displayed graph(s). For more details about these plots and tables, please visit our quick start guide for some helpful tutorials.
For comparisons against other popular data visualization tools, see the comparison table below (Figure 40).
Before you start, you will need to install R and, optionally, RStudio.
# Installation instructions:
# 1. Install DEBrowser and its dependencies by running the lines below
# in R or RStudio.
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("debrowser")
# 2. Load the library
library(debrowser)
# 3. Start DEBrowser
startDEBrowser()
Please check the Operating System Dependencies section in case your operating system requires additional packages to be installed.
Once you’ve made your way to the website, or you have a local instance of DEBrowser running, you will be greeted with the data loading section:
To begin the analysis, upload the count data file you want to analyze (comma- or semicolon-separated (CSV) or tab-separated (TSV) format) and choose the appropriate separator for the file (comma, semicolon, or tab).
A gene quantification table can be obtained by running standard software such as HTSeq (Anders et al., 2014) or RSEM (Li and Dewey, 2011). The file must contain the gene, the transcript(s), and the raw count values of the samples you wish to enter into DEBrowser.
If you do not have a dataset to upload, you can use the built-in demo data file by clicking on the ‘Load Demo (Vernia et al.)!’ button. To view the entire demo data file, you can download this demo set: https://bioinfo.umassmed.edu/pub/debrowser/simple_demo.tsv . For another example, try our full dataset (Vernia et al.): https://bioinfo.umassmed.edu/pub/debrowser/advanced_demo.tsv .
The structure of the count data files is shown below:
Please also note that DEBrowser reads the gene names from the first column, skips any other non-numerical columns, and starts reading the quantification values from the 3rd column in this case.
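The column handling described above can be illustrated with a minimal base R sketch. This is not DEBrowser's own loading code: the helper name read_counts, the sample names, and the demo values are hypothetical and only show the idea of taking gene names from the first column and keeping the numeric columns as counts.

```r
# Hypothetical helper, not part of the DEBrowser API: read a count table,
# use the first column as gene names and keep only the numeric columns.
read_counts <- function(path, sep = "\t") {
  raw <- read.table(path, header = TRUE, sep = sep,
                    stringsAsFactors = FALSE, check.names = FALSE,
                    quote = "", comment.char = "")
  gene_ids <- raw[[1]]
  # Flag numeric columns; the first column is always treated as the identifier.
  is_num <- vapply(raw, is.numeric, logical(1))
  is_num[1] <- FALSE
  counts <- as.matrix(raw[, is_num, drop = FALSE])
  rownames(counts) <- gene_ids
  counts
}

# Small made-up example file with a non-numeric "transcript" column.
demo_file <- tempfile(fileext = ".tsv")
writeLines(c(
  "gene\ttranscript\tcontrol_rep1\tcontrol_rep2\ttreat_rep1",
  "Gene1\ttx1,tx2\t10\t12\t40",
  "Gene2\ttx3\t0\t1\t0"
), demo_file)

counts <- read_counts(demo_file)
print(counts)
```

In this sketch the transcript column is dropped because it is not numeric, so the count matrix starts at the 3rd column of the file.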
In addition to the count data file, you need to upload a metadata file in order to correct for batch effects or any other normalizing conditions in your results that you might want to address. To handle these conditions, simply create a metadata file using the example table below, or download the sample file from the following link: https://bioinfo.umassmed.edu/pub/debrowser/simple_demo_meta.txt
The metadata file can be formatted with comma, semicolon, or tab separators, similar to the count data files. These files are used to establish different batch effects for multiple conditions. You can have as many conditions as you require, as long as all of the samples are present.
The example above would result in the first set of conditions as A and the second set of conditions as B, as they correspond to those conditions in the corresponding metadata column. In the same way, ‘batch’ would have the first set as 1 and the second set as 2, as they correspond to those values in the ‘batch’ column.
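The requirement that all samples be present in the metadata can be checked before uploading. The following base R sketch is an illustration only: the function check_metadata, the column name "sample", and the sample names s1 to s4 are assumptions made for this example and are not taken from DEBrowser.

```r
# Hypothetical helper, not part of the DEBrowser API: verify that every
# sample in the count table has a metadata row, then align the row order.
check_metadata <- function(meta, sample_names, sample_col = "sample") {
  missing <- setdiff(sample_names, meta[[sample_col]])
  if (length(missing) > 0) {
    stop("Samples missing from metadata: ", paste(missing, collapse = ", "))
  }
  # Reorder metadata rows to match the column order of the count table.
  meta[match(sample_names, meta[[sample_col]]), , drop = FALSE]
}

# Made-up metadata with two conditions (A, B) and two batches (1, 2).
meta <- data.frame(
  sample    = c("s3", "s1", "s4", "s2"),
  condition = c("B", "A", "B", "A"),
  batch     = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

aligned <- check_metadata(meta, c("s1", "s2", "s3", "s4"))
# Group sample names by condition, as used when defining comparison sets.
groups <- split(aligned$sample, aligned$condition)
print(groups)
```

A sample that appears in the count table but not in the metadata makes check_metadata stop with an error that lists the missing names.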
Once the count data and metadata files have been loaded in DEBrowser, you can click the upload button to visualize your data as shown below:
After loading the gene quantification file and, if specified, the metadata file containing your batch correction fields, you have the option to filter low counts and conduct batch effect correction prior to your analysis. Alternatively, you may skip these steps and continue directly with differential expression analysis or view quality control (QC) information for your dataset.
In this section, you can visualize the changes to your dataset while filtering out the low count genes. Choose your filtering criteria from the Filtering Methods box, which is located at the center of the screen. Three methods are available:
After selecting a filtering method and entering a threshold value, you can proceed by clicking the Filter button, which is located at the bottom of the Filtering Methods box. On the right side of the screen, your filtered dataset will be visualized for comparison, as shown in the figure below.
You can easily compare the following features before and after filtering:
Number of genes/regions.
Read counts for each sample.
Overall histogram of the dataset.
Gene/region vs samples data.
To investigate the gene/region vs samples data in detail, as shown below, you may click the Show Data button, located at the bottom of the data tables. Alternatively, you may download all filtered data by clicking the Download button, which is located next to the Show Data button.
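As a rough illustration of low count filtering and the before/after comparison, the base R sketch below applies one possible rule: keep a gene only if its maximum count across samples reaches a threshold. This rule, the function name filter_low_counts, and the example values are assumptions made for illustration; they are not necessarily one of the filtering methods DEBrowser offers.

```r
# Hypothetical filter, not DEBrowser's implementation: keep genes whose
# maximum count across all samples is at least `threshold`.
filter_low_counts <- function(counts, threshold = 10) {
  keep <- apply(counts, 1, max) >= threshold
  counts[keep, , drop = FALSE]
}

# Made-up count matrix: 3 genes (rows) by 3 samples (columns).
counts <- matrix(c( 0,  1,  2,
                   50, 60, 55,
                    3, 12,  4),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3")))

filtered <- filter_low_counts(counts, threshold = 10)

# Compare the number of genes and the read counts per sample
# before and after filtering.
summary_table <- rbind(
  before = c(genes = nrow(counts),   colSums(counts)),
  after  = c(genes = nrow(filtered), colSums(filtered))
)
print(summary_table)
```

A per-gene maximum keeps genes that are expressed in at least one sample, which a mean-based rule with the same threshold could remove.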