Package: peakPantheR
Authors: Arnaud Wolfer, Goncalo Correia

1 Introduction

The peakPantheR package is designed for the detection, integration and reporting of pre-defined features in MS files (e.g. compounds, fragments, adducts).

The Parallel Annotation detects and integrates multiple compounds in multiple files in parallel and stores the results in a single object. It can be used to integrate a large number of expected features across a dataset.

Using the faahKO raw MS dataset as an example, this vignette will:

  • Detail the Parallel Annotation concept
  • Apply the Parallel Annotation to a subset of pre-defined features in the faahKO dataset

1.1 Abbreviations

  • ROI: Regions Of Interest
    • reference RT / m/z windows in which to search for a feature
  • uROI: updated Regions Of Interest
    • modified ROI adapted to the current dataset, which override the reference ROI
  • FIR: Fallback Integration Regions
    • RT / m/z window to integrate if no peak is found
  • TIC: Total Ion Chromatogram
    • the intensities summed across all masses for each scan
  • EIC: Extracted Ion Chromatogram
    • the intensities summed over a mass range, for each scan
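
The difference between a TIC and an EIC can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not peakPantheR code: the `scans` data, the `tic` and `eic` helper names and the m/z window are all assumptions made up for this example.

```python
# Each scan is (retention time in seconds, list of (m/z, intensity) pairs).
# The values are invented for illustration only.
scans = [
    (10.0, [(100.0, 5.0), (200.0, 1.0), (300.0, 2.0)]),
    (11.0, [(100.1, 7.0), (200.0, 3.0)]),
    (12.0, [(250.0, 4.0)]),
]


def tic(scans):
    """Total Ion Chromatogram: intensities summed across all masses, per scan."""
    return [(rt, sum((i for _, i in peaks), 0.0)) for rt, peaks in scans]


def eic(scans, mz_min, mz_max):
    """Extracted Ion Chromatogram: intensities summed over [mz_min, mz_max], per scan."""
    return [
        (rt, sum((i for mz, i in peaks if mz_min <= mz <= mz_max), 0.0))
        for rt, peaks in scans
    ]


tic_values = tic(scans)               # one summed intensity per scan
eic_values = eic(scans, 99.5, 100.5)  # only the signal around m/z 100
```

A scan with no signal inside the m/z window contributes an intensity of zero to the EIC, so both chromatograms keep one point per scan.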

2 Parallel Annotation Concept

Parallel compound integration processes multiple compounds in multiple files in parallel and stores the results in a single object.

To achieve this, peakPantheR will:

  1. load a list of expected RT / m/z ROI and a list of files to process
  2. initialise an output object with expected ROI and file paths
  3. first pass (without peak filling) on a subset of representative samples (e.g. QC samples):
    • for each file, detect features in each ROI and keep the one with the highest intensity
    • determine peak statistics for each feature
    • store results + EIC for each ROI
  4. visual inspection of first pass results, update ROI:
    • diagnostic plots: all EICs, peak apex RT / m/z & peak width evolution
    • correct ROI (remove interfering feature, correct RT shift)
    • define fallback integration regions (FIR) to integrate when no feature is detected (median RT / m/z start and end of the found features)
  5. initialise a new output object, with updated regions of interest (uROI) and fallback integration regions (FIR), with all samples
  6. second pass (with peak filling) on all samples:
    • for each file, detect features in each uROI and keep the one with the highest intensity
    • determine peak statistics for each feature
    • integrate FIR when no peaks are found
    • store results + EIC for each uROI
  7. summary statistics:
    • plot EICs, apex and peak width evolution
    • compare first and second pass
  8. return the resulting object and/or table (row: file, col: compound)
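
The two-pass logic above can be sketched for a single compound, in the RT dimension only. This is a simplified Python illustration and not peakPantheR's implementation: the toy `detect_peak` function (apex above a fixed height, boundaries at 10% of the apex) merely stands in for real peak detection, and every name, threshold and data value is an assumption introduced for the example.

```python
from statistics import median


def integrate(eic, rt_min, rt_max):
    """Trapezoidal area of the EIC points falling inside [rt_min, rt_max]."""
    pts = [(rt, i) for rt, i in eic if rt_min <= rt <= rt_max]
    return sum((r2 - r1) * (i1 + i2) / 2 for (r1, i1), (r2, i2) in zip(pts, pts[1:]))


def detect_peak(eic, roi, min_height=10.0):
    """Toy detector: the apex is the most intense point in the ROI; None if too weak."""
    idx = [k for k, (rt, _) in enumerate(eic) if roi[0] <= rt <= roi[1]]
    if not idx:
        return None
    apex = max(idx, key=lambda k: eic[k][1])
    if eic[apex][1] < min_height:
        return None
    # Walk outwards from the apex while intensity stays above 10% of the apex.
    cutoff = 0.1 * eic[apex][1]
    lo = hi = apex
    while lo > idx[0] and eic[lo - 1][1] >= cutoff:
        lo -= 1
    while hi < idx[-1] and eic[hi + 1][1] >= cutoff:
        hi += 1
    start, end = eic[lo][0], eic[hi][0]
    return {"rtMin": start, "rtMax": end, "rt": eic[apex][0],
            "area": integrate(eic, start, end), "filled": False}


def annotate(eics, roi, fir=None):
    """Run detection on every file; integrate the FIR when no peak is found."""
    results = {}
    for name, eic in eics.items():
        peak = detect_peak(eic, roi)
        if peak is None and fir is not None:
            # Peak filling: integrate the fallback region instead.
            peak = {"rtMin": fir[0], "rtMax": fir[1], "rt": None,
                    "area": integrate(eic, *fir), "filled": True}
        results[name] = peak
    return results


def fallback_region(results):
    """FIR = median start and median end of the peaks that were found."""
    found = [p for p in results.values() if p is not None]
    return (median(p["rtMin"] for p in found), median(p["rtMax"] for p in found))


# Invented EICs (RT in seconds, intensity) for two QC files and one weak sample.
rts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
eics = {
    "qc1": list(zip(rts, [0.0, 0.0, 50.0, 100.0, 50.0, 0.0, 0.0])),
    "qc2": list(zip(rts, [0.0, 0.0, 0.0, 40.0, 80.0, 40.0, 0.0])),
    "sample": list(zip(rts, [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0])),
}

# First pass: representative samples only, reference ROI, no peak filling.
first_pass = annotate({k: eics[k] for k in ("qc1", "qc2")}, roi=(1.0, 6.0))
fir = fallback_region(first_pass)

# Second pass: all samples, updated ROI (chosen by hand here), with peak filling.
second_pass = annotate(eics, roi=(1.5, 5.5), fir=fir)

# Result table (row: file, col: compound).
table = {name: {"cpd1": peak["area"]} for name, peak in second_pass.items()}
```

In this sketch the weak sample yields no peak, so its area comes from the FIR and the entry is flagged as filled. The real workflow applies the same idea per compound and also in the m/z dimension, with the uROI set after visual inspection rather than hard-coded.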