Package: peakPantheR
Authors: Arnaud Wolfer, Goncalo Correia
Introduction
The peakPantheR package is designed for the detection, integration and
reporting of pre-defined features in MS files (e.g. compounds, fragments,
adducts, …).
The Parallel Annotation workflow detects and integrates multiple
compounds in multiple files in parallel and stores the results in a
single object. It is suited to integrating a large number of expected
features across a dataset.
Using the faahKO raw MS dataset as an example, this vignette will:
- Detail the Parallel Annotation concept
- Apply the Parallel Annotation to a subset of pre-defined features in the
faahKO dataset
Abbreviations
- ROI: Regions Of Interest
    - reference RT / m/z windows in which to search for a feature
- uROI: updated Regions Of Interest
    - modified ROI, adapted to the current dataset, which override the
      reference ROI
- FIR: Fallback Integration Regions
    - RT / m/z window to integrate if no peak is found
- TIC: Total Ion Chromatogram
    - the intensities summed across all masses, for each scan
- EIC: Extracted Ion Chromatogram
    - the intensities summed over a mass range, for each scan
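The difference between a TIC and an EIC can be made concrete with a small sketch. The Python below is illustrative only and is not part of peakPantheR: the scan layout (a retention time paired with a list of m/z and intensity values) and the function names `tic` and `eic` are assumptions chosen for this example.

```python
from typing import List, Tuple

# Assumed layout: one scan = (retention time, [(mz, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def tic(scans: List[Scan]) -> List[Tuple[float, float]]:
    """Total Ion Chromatogram: for each scan, sum intensities over all masses."""
    return [(rt, sum((i for _, i in peaks), 0.0)) for rt, peaks in scans]


def eic(scans: List[Scan], mz_min: float, mz_max: float) -> List[Tuple[float, float]]:
    """Extracted Ion Chromatogram: for each scan, sum intensities whose m/z
    falls inside [mz_min, mz_max]."""
    return [
        (rt, sum((i for mz, i in peaks if mz_min <= mz <= mz_max), 0.0))
        for rt, peaks in scans
    ]


# Three toy scans with made-up values
scans = [
    (10.0, [(100.0, 5.0), (200.0, 1.0)]),
    (11.0, [(100.1, 7.0), (200.0, 2.0)]),
    (12.0, [(150.0, 3.0)]),
]

print(tic(scans))               # [(10.0, 6.0), (11.0, 9.0), (12.0, 3.0)]
print(eic(scans, 99.5, 100.5))  # [(10.0, 5.0), (11.0, 7.0), (12.0, 0.0)]
```

An ROI combines such an m/z range with an RT range: the m/z range selects the EIC, and the RT range limits where along that EIC a feature is searched for.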
Parallel Annotation Concept
Parallel compound integration processes multiple compounds in
multiple files in parallel and stores the results in a single object.
To achieve this, peakPantheR will:
- load a list of expected RT / m/z ROI and a list of files to process
- initialise an output object with expected ROI and file paths
- first pass (without peak filling) on a subset of representative samples
  (e.g. QC samples):
    - for each file, detect features in each ROI and keep highest intensity
    - determine peak statistics for each feature
    - store results + EIC for each ROI
- visual inspection of first pass results, update ROI:
    - diagnostic plots: all EICs, peak apex RT / m/z & peak width evolution
    - correct ROI (remove interfering feature, correct RT shift)
    - define fallback integration regions (FIR) if no feature is detected
      (median RT / m/z start and end of found features)
- initialise a new output object, with updated regions of interest (uROI) and
  fallback integration regions (FIR), with all samples
- second pass (with peak filling) on all samples:
    - for each file, detect features in each uROI and keep highest intensity
    - determine peak statistics for each feature
    - integrate FIR when no peaks are found
    - store results + EIC for each uROI
- summary statistics:
    - plot EICs, apex and peak width evolution
    - compare first and second pass
- return the resulting object and/or table (row: file, col: compound)
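The two-pass logic above can be sketched for a single compound along the RT dimension. This is a minimal Python illustration under stated assumptions, not peakPantheR's implementation: the toy detector (apex = highest point in the ROI, bounds = contiguous points at or above half the apex height), the `min_height` threshold, the function names and the dictionary layout are all hypothetical, and the manual ROI correction step is skipped (the uROI is taken equal to the ROI).

```python
from statistics import median


def detect(eic, roi, min_height):
    """Toy detector (assumption): keep the highest-intensity point in the ROI
    as apex; peak bounds are the contiguous points >= half the apex height."""
    pts = [(rt, i) for rt, i in eic if roi[0] <= rt <= roi[1]]
    if not pts:
        return None
    apex = max(range(len(pts)), key=lambda k: pts[k][1])
    height = pts[apex][1]
    if height < min_height:
        return None  # nothing that qualifies as a peak in this ROI
    left = right = apex
    while left > 0 and pts[left - 1][1] >= height / 2:
        left -= 1
    while right < len(pts) - 1 and pts[right + 1][1] >= height / 2:
        right += 1
    area = sum(i for _, i in pts[left:right + 1])
    return {"rtMin": pts[left][0], "rt": pts[apex][0],
            "rtMax": pts[right][0], "area": area, "filled": False}


def fallback_region(peaks):
    """FIR: median RT start and end of the peaks that were found."""
    found = [p for p in peaks if p is not None]
    return (median(p["rtMin"] for p in found),
            median(p["rtMax"] for p in found))


def annotate(eics, roi, min_height, fir=None):
    """One pass over all files; integrate the FIR only when one is supplied
    and no peak is detected (peak filling)."""
    results = {}
    for name, eic in eics.items():
        peak = detect(eic, roi, min_height)
        if peak is None and fir is not None:
            area = sum(i for rt, i in eic if fir[0] <= rt <= fir[1])
            peak = {"rtMin": fir[0], "rt": None, "rtMax": fir[1],
                    "area": area, "filled": True}
        results[name] = peak
    return results


# Made-up EICs (rt, intensity) for one compound in three files
eics = {
    "QC1": [(1.0, 0.0), (2.0, 30.0), (3.0, 50.0), (4.0, 28.0), (5.0, 0.0)],
    "QC2": [(1.0, 0.0), (2.0, 5.0), (3.0, 40.0), (4.0, 60.0), (5.0, 35.0)],
    "S1":  [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 2.0), (5.0, 1.0)],
}
roi = (1.0, 5.0)

# First pass: representative (QC) samples only, no peak filling
qc = {name: eics[name] for name in ("QC1", "QC2")}
first = annotate(qc, roi, min_height=10.0)

# FIR from the first-pass results; uROI is left equal to the ROI here
fir = fallback_region(first.values())
print(fir)  # (2.5, 4.5)

# Second pass: all samples, with peak filling
second = annotate(eics, roi, min_height=10.0, fir=fir)

# Result table (row: file, col: compound), here with a single compound
table = {name: {"cpd1": peak["area"]} for name, peak in second.items()}
print(table)
```

In this toy dataset, sample S1 has no point above the threshold, so its value in the final table comes from integrating the FIR rather than a detected peak; the `filled` flag keeps that distinction visible when comparing the first and second pass.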