Introduction

In this article we demonstrate how StreamFind can be used to evaluate ozonation of secondary wastewater effluent (i.e., effluent of the aerated biological treatment) using mass spectrometry (MS). A set of 18 mzML files is used, representing blank, influent and effluent measurements in triplicate for both positive and negative ionization modes.

basename(files)
 [1] "01_tof_ww_is_neg_blank-r001.mzML"        
 [2] "01_tof_ww_is_neg_blank-r002.mzML"        
 [3] "01_tof_ww_is_neg_blank-r003.mzML"        
 [4] "01_tof_ww_is_pos_blank-r001.mzML"        
 [5] "01_tof_ww_is_pos_blank-r002.mzML"        
 [6] "01_tof_ww_is_pos_blank-r003.mzML"        
 [7] "02_tof_ww_is_neg_influent-r001.mzML"     
 [8] "02_tof_ww_is_neg_influent-r002.mzML"     
 [9] "02_tof_ww_is_neg_influent-r003.mzML"     
[10] "02_tof_ww_is_pos_influent-r001.mzML"     
[11] "02_tof_ww_is_pos_influent-r002.mzML"     
[12] "02_tof_ww_is_pos_influent-r003.mzML"     
[13] "03_tof_ww_is_neg_o3sw_effluent-r001.mzML"
[14] "03_tof_ww_is_neg_o3sw_effluent-r002.mzML"
[15] "03_tof_ww_is_neg_o3sw_effluent-r003.mzML"
[16] "03_tof_ww_is_pos_o3sw_effluent-r001.mzML"
[17] "03_tof_ww_is_pos_o3sw_effluent-r002.mzML"
[18] "03_tof_ww_is_pos_o3sw_effluent-r003.mzML"

The showcase uses the StreamFind MassSpecEngine, which encapsulates all tools required for parsing, storing, processing and visualizing MS data. Note that not all methods/functions are shown, as the demonstration focuses on the workflow to assess wastewater treatment. Other processing methods for MS data are available in the StreamFind package and can be found in the StreamFind reference documentation.

MassSpecEngine

The R6 MassSpecEngine class object is created using MassSpecEngine$new(), as shown below. The analyses argument can be used to add the set of MS files directly. In this demonstration, we use mzML files. The original MS vendor files can also be used, in which case they are converted to mzML format in the background using msConvert from ProteoWizard.

# Creates a MassSpecEngine from mzML files
ms <- MassSpecEngine$new(
  metadata = list(name = "Wastewater NTA"),
  analyses = files
)
# MassSpecEngine class hierarchy
class(ms)
[1] "MassSpecEngine" "CoreEngine"     "R6"            
# Number of analyses
length(ms$Analyses)
[1] 18

Metadata

Project metadata (e.g., name, author and description) can be added to the MassSpecEngine$Metadata as a named list object as shown below. The elements of the list can be anything but must have length one.

# Adds metadata to the MassSpecEngine
ms$Metadata <- list(
  name = "Wastewater Ozonation Showcase",
  author = "Ricardo Cunha",
  description = "Demonstration project"
)

# Gets and shows the metadata
show(ms$Metadata)
name: Wastewater Ozonation Showcase
author: Ricardo Cunha
description: Demonstration project
date: 2025-06-13 12:14:59.534834
file: NA
# Gets the MassSpecEngine date
ms$Metadata@entries$date
[1] "2025-06-13 12:14:59 CEST"

Replicates and blanks

The analysis replicate names and the associated blank replicate names can be amended in the MassSpecEngine, as shown below. Alternatively, a data.frame with the columns file, replicate and blank can be passed as the analyses argument in MassSpecEngine$new(analyses = files) to assign the replicate and blank replicate names directly (more details here).

# Character vector with analysis replicate names
rpls <- c(
  rep("blank_neg", 3),
  rep("blank_pos", 3),
  rep("influent_neg", 3),
  rep("influent_pos", 3),
  rep("effluent_neg", 3),
  rep("effluent_pos", 3)
)

# Character vector with associated blank replicate names
# Note that the order should match the respective replicate
blks <- c(
  rep("blank_neg", 3),
  rep("blank_pos", 3),
  rep("blank_neg", 3),
  rep("blank_pos", 3),
  rep("blank_neg", 3),
  rep("blank_pos", 3)
)

# Amends replicate and blank names
ms$add_replicate_names(rpls)
ms$add_blank_names(blks)

# Replicates and blanks were amended
ms$Analyses$info[, c(1:3, 5)]
                               analysis    replicate     blank polarity
                                 <char>       <char>    <char>   <char>
 1:         01_tof_ww_is_neg_blank-r001    blank_neg blank_neg negative
 2:         01_tof_ww_is_neg_blank-r002    blank_neg blank_neg negative
 3:         01_tof_ww_is_neg_blank-r003    blank_neg blank_neg negative
 4:         01_tof_ww_is_pos_blank-r001    blank_pos blank_pos positive
 5:         01_tof_ww_is_pos_blank-r002    blank_pos blank_pos positive
 6:         01_tof_ww_is_pos_blank-r003    blank_pos blank_pos positive
 7:      02_tof_ww_is_neg_influent-r001 influent_neg blank_neg negative
 8:      02_tof_ww_is_neg_influent-r002 influent_neg blank_neg negative
 9:      02_tof_ww_is_neg_influent-r003 influent_neg blank_neg negative
10:      02_tof_ww_is_pos_influent-r001 influent_pos blank_pos positive
11:      02_tof_ww_is_pos_influent-r002 influent_pos blank_pos positive
12:      02_tof_ww_is_pos_influent-r003 influent_pos blank_pos positive
13: 03_tof_ww_is_neg_o3sw_effluent-r001 effluent_neg blank_neg negative
14: 03_tof_ww_is_neg_o3sw_effluent-r002 effluent_neg blank_neg negative
15: 03_tof_ww_is_neg_o3sw_effluent-r003 effluent_neg blank_neg negative
16: 03_tof_ww_is_pos_o3sw_effluent-r001 effluent_pos blank_pos positive
17: 03_tof_ww_is_pos_o3sw_effluent-r002 effluent_pos blank_pos positive
18: 03_tof_ww_is_pos_o3sw_effluent-r003 effluent_pos blank_pos positive
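
The data.frame alternative mentioned above can be sketched in base R. The regular expressions are an assumption that relies on the file naming pattern of this showcase (polarity and sample type encoded in the file name) and are not part of StreamFind; only three of the 18 files are used to keep the example short.

```r
# Three example file names following the naming pattern of this showcase
files_example <- c(
  "01_tof_ww_is_neg_blank-r001.mzML",
  "02_tof_ww_is_pos_influent-r001.mzML",
  "03_tof_ww_is_neg_o3sw_effluent-r001.mzML"
)

# Assumption: polarity appears as "_neg_" or "_pos_" in the file name
polarity <- sub(".*_(neg|pos)_.*", "\\1", files_example)

# Assumption: the sample type directly precedes the "-rNNN.mzML" suffix
type <- sub(".*_(blank|influent|effluent)-r[0-9]+\\.mzML$", "\\1", files_example)

# data.frame with the column names file, replicate and blank
analyses_df <- data.frame(
  file = files_example,
  replicate = paste(type, polarity, sep = "_"),
  blank = paste("blank", polarity, sep = "_")
)

analyses_df

# The data.frame would then replace the character vector of files, e.g.:
# ms <- MassSpecEngine$new(analyses = analyses_df)
```

In practice the file column would hold the full paths to the mzML files rather than only the base names.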

ProcessingStep

Data processing is performed by steps according to ProcessingStep objects. S7 ProcessingStep class objects are obtained via the respective [Engine type]Method_[method name]_[algorithm name] constructor functions, attributing the respective subclass. Below we obtain the ProcessingStep for the method FindFeatures using the algorithm openms. The parameters can be changed via the constructor arguments. Documentation for each ProcessingStep subclass can be found in the StreamFind reference documentation.

# Gets ProcessingStep for finding features using the openms algorithm
ffs <- MassSpecMethod_FindFeatures_openms(
  noiseThrInt = 1000,
  chromSNR = 3,
  chromFWHM = 7,
  mzPPM = 15,
  reEstimateMTSD = TRUE,
  traceTermCriterion = "sample_rate",
  traceTermOutliers = 5,
  minSampleRate = 1,
  minTraceLength = 4,
  maxTraceLength = 70,
  widthFiltering = "fixed",
  minFWHM = 4,
  maxFWHM = 35,
  traceSNRFiltering = TRUE,
  localRTRange = 0,
  localMZRange = 0,
  isotopeFilteringModel = "none",
  MZScoring13C = FALSE,
  useSmoothedInts = FALSE,
  intSearchRTWindow = 3,
  useFFMIntensities = FALSE,
  verbose = FALSE
)

# Prints in console the details of the ProcessingStep
show(ffs)

 StreamFind::MassSpecMethod_FindFeatures_openms 
 data_type    MassSpec
 method       FindFeatures
 required     NA
 algorithm    openms
 version      0.2.0
 software     openms
 developer    Oliver Kohlbacher
 contact      oliver.kohlbacher@uni-tuebingen.de
 link         https://openms.de/
 doi          https://doi.org/10.1038/nmeth.3959

 parameters: 
  -  noiseThrInt 1000 
  -  chromSNR 3 
  -  chromFWHM 7 
  -  mzPPM 15 
  -  reEstimateMTSD TRUE 
  -  traceTermCriterion sample_rate 
  -  traceTermOutliers 5 
  -  minSampleRate 1 
  -  minTraceLength 4 
  -  maxTraceLength 70 
  -  widthFiltering fixed 
  -  minFWHM 4 
  -  maxFWHM 35 
  -  traceSNRFiltering TRUE 
  -  localRTRange 0 
  -  localMZRange 0 
  -  isotopeFilteringModel none 
  -  MZScoring13C FALSE 
  -  useSmoothedInts FALSE 
  -  intSearchRTWindow 3 
  -  useFFMIntensities FALSE 
  -  verbose FALSE 
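
The naming convention of the constructor functions can be illustrated with a small base R helper. The helper step_constructor_name() is hypothetical and only assembles the name as a string; it does not look up or call any StreamFind function.

```r
# Hypothetical helper: assembles a constructor name following the
# [Engine type]Method_[method name]_[algorithm name] convention
step_constructor_name <- function(engine, method, algorithm) {
  paste0(engine, "Method_", method, "_", algorithm)
}

step_constructor_name("MassSpec", "FindFeatures", "openms")
step_constructor_name("MassSpec", "GroupFeatures", "openms")
```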
# Creates an ordered list with all processing steps for the MS data
workflow <- list(

  # Find features using the openms algorithm, created above
  ffs,

  # Annotation of natural isotopes and adducts
  MassSpecMethod_AnnotateFeatures_StreamFind(
    rtWindowAlignment = 0.3,
    maxIsotopes = 8,
    maxCharge = 2,
    maxGaps = 1
  ),

  # Excludes annotated isotopes and adducts
  MassSpecMethod_FilterFeatures_StreamFind(
    excludeIsotopes = TRUE,
    excludeAdducts = TRUE
  ),

  # Grouping features across analyses
  MassSpecMethod_GroupFeatures_openms(
    rtalign = FALSE,
    QT = FALSE,
    maxAlignRT = 5,
    maxAlignMZ = 0.008,
    maxGroupRT = 5,
    maxGroupMZ = 0.008,
    verbose = FALSE
  ),

  # Filter feature groups with maximum intensity below 3000 counts
  MassSpecMethod_FilterFeatures_StreamFind(
    minIntensity = 3000
  ),

  # Fill features with missing data
  # Reduces false negatives
  MassSpecMethod_FillFeatures_StreamFind(
    withinReplicate = FALSE,
    rtExpand = 2,
    mzExpand = 0.0005,
    minTracesIntensity = 1000,
    minNumberTraces = 6,
    baseCut = 0.3,
    minSignalToNoiseRatio = 3,
    minGaussianFit = 0.2
  ),

  # Calculate quality metrics for each feature
  MassSpecMethod_CalculateFeaturesQuality_StreamFind(
    filtered = FALSE,
    rtExpand = 2,
    mzExpand = 0.0005,
    minTracesIntensity = 1000,
    minNumberTraces = 6,
    baseCut = 0
  ),

  # Filter features based on minimum signal-to-noise ratio (s/n)
  # The s/n is calculated using the CalculateFeaturesQuality method
  MassSpecMethod_FilterFeatures_StreamFind(
    minSnRatio = 5
  ),

  # Filter features using other parameters via the patRoon package
  MassSpecMethod_FilterFeatures_patRoon(
    maxReplicateIntRSD = 40,
    blankThreshold = 5,
    absMinReplicateAbundance = 3
  ),

  # Finds internal standards in the MS data
  # db_is is a data.table with the
  # name, mass and expected retention time of
  # spiked internal standards, as shown below
  MassSpecMethod_FindInternalStandards_StreamFind(
    database = db_is,
    ppm = 8,
    sec = 10
  ),
  
  # Corrects matrix suppression using the TiChri method from
  # 10.1021/acs.analchem.1c00357 to better compare influent and effluent
  MassSpecMethod_CorrectMatrixSuppression_TiChri(
    mpRtWindow = 10,
    istdAssignment = "range",
    istdRtWindow = 50,
    istdN = 2
  ),

  # Loads MS1 for features not filtered
  MassSpecMethod_LoadFeaturesMS1_StreamFind(
    filtered = FALSE
  ),

  # Loads MS2 for features not filtered
  MassSpecMethod_LoadFeaturesMS2_StreamFind(
    filtered = FALSE
  ),

  # Loads feature extracted ion chromatograms (EIC)
  MassSpecMethod_LoadFeaturesEIC_StreamFind(
    filtered = FALSE
  ),

  # Performs suspect screening using the StreamFind algorithm
  # db_with_ms2 is a database with suspect chemical standards
  # includes MS2 data (i.e., fragmentation pattern) from standards
  MassSpecMethod_SuspectScreening_StreamFind(
    database = db_with_ms2,
    ppm = 10,
    sec = 15,
    ppmMS2 = 10,
    minFragments = 3
  )
)

Then, the list can be added to the MassSpecEngine. Note that the order will matter when the workflow is applied!

# Conversion of list to Workflow object with validation
workflow <- Workflow(workflow)

# Adds the workflow to the engine. The order matters!
ms$Workflow <- workflow

# Printing the data processing workflow
show(ms$Workflow)
1: FindFeatures (openms)
2: AnnotateFeatures (StreamFind)
3: FilterFeatures (StreamFind)
4: GroupFeatures (openms)
5: FilterFeatures (StreamFind)
6: FillFeatures (StreamFind)
7: CalculateFeaturesQuality (StreamFind)
8: FilterFeatures (StreamFind)
9: FilterFeatures (patRoon)
10: FindInternalStandards (StreamFind)
11: CorrectMatrixSuppression (TiChri)
12: LoadFeaturesMS1 (StreamFind)
13: LoadFeaturesMS2 (StreamFind)
14: LoadFeaturesEIC (StreamFind)
15: SuspectScreening (StreamFind)

run_workflow()

The Workflow can be applied with run_workflow(), as demonstrated below. Note that with run_workflow(), the processing steps are applied in the same order as they were added.

# Runs all ProcessingStep added
ms$run_workflow()

Results

The created features and feature groups can be inspected as data.table objects or plotted by dedicated methods in the MassSpecEngine or by methods of the created Results class. Internally, the MassSpecEngine stores the results in the NonTargetAnalysisResults object, which can be accessed with MassSpecEngine$NonTargetAnalysisResults. The engine interface is the recommended way to access the results, but the NonTargetAnalysisResults object can be used for more advanced operations, such as exporting the results to a database or other formats, or using the native objects from other packages (e.g., patRoon), as we demonstrate further in this article.

data.table objects

The features and feature groups can be obtained as data.table objects with the MassSpecEngine$get_features() and MassSpecEngine$get_groups() methods. The methods also allow searching for specific features/feature groups using mass, mass-to-charge ratio, retention time and drift time targets, as shown below for a small set of compound targets for which the expected mass and retention time are known. Note that drift time is only applicable to MS data with ion mobility separation.

db
                         name       formula     mass    rt    tag
                       <char>        <char>    <num> <int> <char>
 1:     4N-Acetylsulfadiazine   C12H12N4O3S 292.0630   905      S
 2:                Metoprolol     C15H25NO3 267.1834   915      S
 3:          Sulfamethoxazole   C10H11N3O3S 253.0521  1015      S
 4:                Bisoprolol     C18H31NO4 325.2253   955      S
 5: 4N-Acetylsulfamethoxazole   C12H13N3O4S 295.0627  1011      S
 6:             Carbamazepine     C15H12N2O 236.0950  1079      S
 7:                 Terbutryn     C10H19N5S 241.1361  1126      S
 8:                  Losartan   C22H23ClN6O 422.1622  1095      S
 9:               Candesartan    C24H20N6O3 440.1597  1097      S
10:               Isoproturon     C12H18N2O 206.1419  1152      S
11:                    Diuron   C9H10Cl2N2O 232.0170  1160      S
12:                Bezafibrat   C19H20ClNO4 361.1081  1164      S
13:                 Valsartan    C24H29N5O3 435.2270  1177      S
14:              Tebuconazole   C16H22ClN3O 307.1451  1267      S
15:                Diclofenac  C14H11Cl2NO2 295.0167  1255      S
16:             Propiconazole C15H17Cl2N3O2 341.0698  1308      S
17:                Flufenacet C14H13F4N3O2S 363.0665  1296      S
18:                 Ibuprofen      C13H18O2 206.1307  1152      S
19:                      CBZD    C15H14N2O3 270.1004   936      S
# Compounds are searched by monoisotopic mass and retention time
# ppm and sec set the allowed deviation in mass (in ppm) and time (in seconds), respectively
# average applies a mean to the intensities in each analysis replicate group
ms$get_groups(mass = db, ppm = 10, sec = 15, average = TRUE)
             group             name effluent_neg effluent_pos influent_neg
            <char>           <char>        <num>        <num>        <num>
1:  M236_R1079_292    Carbamazepine         0.00         0.00        0.000
2:  M253_R1015_479 Sulfamethoxazole         0.00         0.00        0.000
3:   M267_R916_635       Metoprolol         0.00     15412.60        0.000
4: M295_R1256_1105       Diclofenac         0.00         0.00     8531.089
5:  M325_R957_1538       Bisoprolol         0.00     13762.82        0.000
6: M440_R1097_2696      Candesartan      7415.23     22512.84    34920.953
   influent_pos
          <num>
1:    69645.159
2:     9809.779
3:    51331.574
4:    21566.104
5:    48375.419
6:   112540.651

Already by inspection of the data.table, it is possible to see compounds detected in the influent but not in the effluent (e.g., Carbamazepine) and compounds that appear to be reduced during ozonation (e.g., Metoprolol). Since positive and negative ionization modes were combined, compounds that appear in both polarities are grouped by neutral monoisotopic mass (e.g., Candesartan and Diclofenac).
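
The role of the ppm and sec arguments can be illustrated with a minimal base R sketch. It uses the conventional definition of the relative mass error in parts per million; the function names are hypothetical and the exact matching logic inside StreamFind is not shown in this article, so the code is an assumption rather than the package implementation. The values are taken from the internal standards tables of this article (Carbamazepin-D10, expected mass 246.1577 at 1075 s, measured 246.1586 at 1074 s).

```r
# Relative mass error in parts per million (conventional definition)
ppm_error <- function(measured, expected) {
  (measured - expected) / expected * 1e6
}

# Hypothetical helper: TRUE when both the mass and the retention time
# deviations are within the allowed tolerances
within_tolerance <- function(measured_mass, measured_rt,
                             expected_mass, expected_rt,
                             ppm, sec) {
  abs(ppm_error(measured_mass, expected_mass)) <= ppm &
    abs(measured_rt - expected_rt) <= sec
}

# Mass error of the measured Carbamazepin-D10 feature, in ppm
ppm_error(246.1586, 246.1577)

# Match check with the tolerances used for the internal standards
within_tolerance(246.1586, 1074, 246.1577, 1075, ppm = 8, sec = 10)

# Absolute mass window (in Da) implied by 10 ppm at the mass of Carbamazepine
236.0950 * 10 / 1e6
```

Because the tolerance is relative, the absolute mass window widens with increasing mass, whereas the retention time window in seconds is constant.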

plot_groups methods

For a better overview of the results, the method MassSpecEngine$plot_groups() or, for even more detail, the method MassSpecEngine$plot_groups_overview() can be used.

# set legendNames to TRUE for using the names in db as legend
ms$plot_groups(mass = db, ppm = 10, sec = 15, legendNames = TRUE)
ms$plot_groups_overview(mass = db, ppm = 5, sec = 10, legendNames = TRUE)

Filtered not removed

The FilterFeatures method was applied to filter features according to defined conditions/thresholds. The filtered features were not removed, only tagged as filtered. For instance, when the method MassSpecEngine$get_groups() is run with the filtered argument set to TRUE, the filtered features are also shown. Below, we search for the same compounds as above but with the filtered argument set to TRUE. Potential features from Valsartan are now returned; they had been filtered due to low intensity. Note that extracting features based on basic parameters, i.e., mass and time, does not mean that the features are identified. Identification is a more complex process and requires additional information, such as the MS/MS data used in the suspect screening processing method.

# Set filtered to TRUE for showing filtered features/feature groups
ms$get_groups(mass = db, ppm = 5, sec = 10, average = TRUE, filtered = TRUE)
             group             name effluent_neg effluent_pos influent_neg
            <char>           <char>        <num>        <num>        <num>
1:  M236_R1079_292    Carbamazepine        0.000         0.00        0.000
2:  M253_R1015_479 Sulfamethoxazole        0.000         0.00        0.000
3:   M267_R916_635       Metoprolol        0.000     15412.60        0.000
4: M295_R1256_1105       Diclofenac        0.000         0.00     8531.089
5:  M325_R957_1538       Bisoprolol        0.000     13762.82        0.000
6: M435_R1176_2663        Valsartan     3598.701         0.00     4133.250
7: M440_R1097_2696      Candesartan     7415.230     22512.84    34920.953
   influent_pos
          <num>
1:    69645.159
2:     9809.779
3:    51331.574
4:    21566.104
5:    48375.419
6:        0.000
7:   112540.651

Internal Standards

The method FindInternalStandards was applied for tagging spiked internal standards and the results can be obtained with the dedicated method MassSpecEngine$get_internal_standards() or plotted as a quality overview using the method MassSpecEngine$plot_internal_standards(), as shown below. The plot gives an overview of the mass, retention time and intensity variance of the internal standards across the analyses in the project.

# List of spiked internal standards
db_is
                  name           formula     mass    rt    tag
                <char>            <char>    <num> <int> <char>
1: Cyclophosphamide-D6 C7[2]H6H9Cl2N2O2P 266.0625  1007     IS
2:        Ibuprofen-D3     C13[2]H3H15O2 209.1495  1150     IS
3:      Diclophenac-D4  C14[2]H4H7Cl2NO2 299.0418  1253     IS
4:       Metoprolol-D7    C15[2]H7H18NO3 274.2274   915     IS
5:  Sulfamethoxazol-D4   C10[2]H4H7N3O3S 257.0772  1015     IS
6:      Isoproturon-D6    C12[2]H6H12N2O 212.1796  1149     IS
7:           Diuron-D6   C9[2]H6H4Cl2N2O 238.0547  1157     IS
8:    Carbamazepin-D10    C15[2]H10H2N2O 246.1577  1075     IS
9:         Naproxen-D3     C14[2]H3H11O3 233.1131  1169     IS
# Gets the internal standards evaluation data.table
ms$get_internal_standards()[, 1:7]
                   name    rt     mass intensity    area   rtr    mzr
                 <char> <num>    <num>     <num>   <num> <num>  <num>
 1:    Carbamazepin-D10  1074 246.1586    462051 2880482  14.4 0.0017
 2:    Carbamazepin-D10  1074 246.1585    539574 3525752  16.6 0.0021
 3:    Carbamazepin-D10  1075 246.1586    510457 3297997  15.5 0.0012
 4: Cyclophosphamide-D6  1007 266.0627     51652  289270  13.5 0.0018
 5: Cyclophosphamide-D6  1006 266.0627     56586  315716  10.2 0.0020
 6: Cyclophosphamide-D6  1007 266.0628     60856  304027   9.4 0.0016
 7:      Diclophenac-D4  1254 299.0415     24569  149453  13.7 0.0018
 8:      Diclophenac-D4  1253 299.0423     52024  306126  12.5 0.0018
 9:      Diclophenac-D4  1255 299.0414     18342  120359  13.8 0.0027
10:      Diclophenac-D4  1254 299.0424     48453  304167  13.4 0.0018
11:      Diclophenac-D4  1254 299.0412     20070  127973  12.0 0.0027
12:      Diclophenac-D4  1254 299.0423     51945  315610  12.0 0.0016
13:           Diuron-D6  1157 238.0544     14884   87344  10.9 0.0010
14:           Diuron-D6  1157 238.0553    120508  741890  12.7 0.0015
15:           Diuron-D6  1157 238.0542     14522   85375  10.7 0.0018
16:           Diuron-D6  1157 238.0553    130401  845728  13.4 0.0015
17:           Diuron-D6  1157 238.0545     17604  102393  10.7 0.0017
18:           Diuron-D6  1157 238.0554    141741  889026  14.8 0.0012
19:      Isoproturon-D6  1149 212.1806   1198379 7589018  13.9 0.0012
20:      Isoproturon-D6  1149 212.1808   1269561 8542166  16.4 0.0014
21:      Isoproturon-D6  1149 212.1807   1322756 8587536  16.1 0.0016
22:       Metoprolol-D7   915 274.2291   1581149 9737735  16.2 0.0017
23:       Metoprolol-D7   914 274.2290   1497131 8861199  15.6 0.0026
24:       Metoprolol-D7   915 274.2289   1571305 9472216  15.5 0.0013
25:  Sulfamethoxazol-D4  1014 257.0766     10802   61534   9.6 0.0015
26:  Sulfamethoxazol-D4  1014 257.0776    193814 1114649  14.5 0.0015
27:  Sulfamethoxazol-D4  1014 257.0770      6566   29069   6.8 0.0020
28:  Sulfamethoxazol-D4  1014 257.0777    189113 1099917  13.5 0.0015
29:  Sulfamethoxazol-D4  1014 257.0770      7969   45062  10.0 0.0033
30:  Sulfamethoxazol-D4  1014 257.0780    204834 1178733  14.1 0.0012
                   name    rt     mass intensity    area   rtr    mzr
ms$plot_internal_standards()

Quality control of spiked internal standards

ms$plot_groups_profile(mass = db_is, ppm = 8, sec = 10, filtered = TRUE, legendNames = TRUE)

Internal standards profile across analyses
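
The intensity variation of the internal standards can also be summarized numerically. The sketch below computes the relative standard deviation (RSD) in base R for three standards, with intensities copied from the table above; it pools the rows listed for each standard without regard to the analysis each row belongs to, so it is only an illustration and not the calculation performed by StreamFind or patRoon.

```r
# Intensities copied from the internal standards table above
is_tbl <- data.frame(
  name = rep(c("Carbamazepin-D10", "Isoproturon-D6", "Metoprolol-D7"), each = 3),
  intensity = c(
    462051, 539574, 510457,
    1198379, 1269561, 1322756,
    1581149, 1497131, 1571305
  )
)

# Relative standard deviation in percent
rsd <- function(x) sd(x) / mean(x) * 100

# RSD of the intensity per internal standard
is_rsd <- aggregate(intensity ~ name, data = is_tbl, FUN = rsd)
names(is_rsd)[2] <- "intensity_rsd_percent"

is_rsd
```

The same kind of metric underlies the maxReplicateIntRSD = 40 threshold set in the workflow, which is there applied within replicate groups.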

Components

The method AnnotateFeatures was applied to annotate the natural isotopes and adducts into components. Annotation of in-source fragments is planned but not yet available with the StreamFind algorithm. The method MassSpecEngine$get_components() can be used to search for components, as shown below for analysis number 11. Because the filters excludeIsotopes and minIntensity were applied, the isotopic features are likely tagged as filtered, which is why the filtered argument is set to TRUE.

# Components of Diclofenac and Candesartan in analysis 11
components_example <- ms$get_components(
  analyses = 11,
  mass = db[db$name %in% c("Diclofenac", "Candesartan"), ],
  ppm = 5, sec = 10,
  filtered = TRUE
)

# Subset of the components data.table
components_example[, c(2:6, 8, 30:34)]
      replicate         component           feature           group       rt
         <char>            <char>            <char>          <char>    <num>
1: influent_pos F388_MZ296_RT1256 F388_MZ296_RT1256 M295_R1256_1105 1256.670
2: influent_pos F388_MZ296_RT1256 F403_MZ298_RT1256                 1255.600
3: influent_pos F937_MZ441_RT1097 F937_MZ441_RT1097 M440_R1097_2696 1096.581
4: influent_pos F937_MZ441_RT1097 F940_MZ442_RT1097                 1096.580
5: influent_pos F937_MZ441_RT1097 F946_MZ443_RT1097 M442_R1098_2714 1096.581
    intensity i.component_feature iso_size iso_charge iso_step iso_cat
        <num>              <char>    <int>      <int>    <int>  <char>
1:  19179.480   F388_MZ296_RT1256        2          1        0     M+0
2:  14199.290   F388_MZ296_RT1256        2          1        2     M+2
3: 109887.297   F937_MZ441_RT1097        3          1        0     M+0
4:  28418.029   F937_MZ441_RT1097        3          1        1     M+1
5:   4544.722   F937_MZ441_RT1097        3          1        2     M+2
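
The feature identifiers above carry the measured m/z (e.g., MZ296), whereas the group identifiers carry the neutral mass (e.g., M295). The relation between both can be sketched in base R under the assumption of singly protonated ions in positive mode and singly deprotonated ions in negative mode; the helper neutral_to_mz() is hypothetical and does not cover other adducts.

```r
# Mass of a proton in Da
proton_mass <- 1.007276

# Hypothetical helper: m/z of [M+H]+ or [M-H]- from the neutral mass
neutral_to_mz <- function(mass, polarity = c("positive", "negative")) {
  polarity <- match.arg(polarity)
  if (polarity == "positive") mass + proton_mass else mass - proton_mass
}

# Neutral monoisotopic masses taken from the db table
targets <- c(Diclofenac = 295.0167, Candesartan = 440.1597)

# Expected m/z in positive mode; the rounded values match the
# MZ296 and MZ441 feature identifiers in the table above
mz_pos <- neutral_to_mz(targets, "positive")
round(mz_pos)

# Expected m/z in negative mode for the same compounds
neutral_to_mz(targets, "negative")
```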

The components (i.e., isotopes and adducts) can also be visualized with the method MassSpecEngine$map_components(), as shown below for the internal standards added to analysis 11. Note that again the filtered argument was set to TRUE to return also filtered features, as the internal standards were likely excluded by blank subtraction.

ms$map_components(
  analyses = 11,
  mass = db_is,
  ppm = 8, sec = 10,
  filtered = TRUE,
  legendNames = TRUE
)

Components (i.e., isotopes and adducts) of internal standards in analysis 11

Suspects

The methods MassSpecEngine$get_suspects() and MassSpecEngine$plot_suspects() can be used to inspect the suspect screening results. In the plot function, a second plot is added to compare the experimental fragmentation pattern (top) with the fragmentation pattern of the respective reference standard (bottom) included in the database. The colorBy argument can be set to targets+replicates to label the plot legend with combined keys of suspect target names and analysis replicate names.

ms$get_suspects()[, c(1, 5, 14)]
                               analysis             name id_level
                                 <char>           <char>   <char>
 1:      02_tof_ww_is_pos_influent-r001    Carbamazepine        1
 2:      02_tof_ww_is_pos_influent-r001 Sulfamethoxazole        1
 3:      02_tof_ww_is_pos_influent-r001       Metoprolol        1
 4:      02_tof_ww_is_pos_influent-r001       Diclofenac        1
 5:      02_tof_ww_is_pos_influent-r001       Bisoprolol        1
 6:      02_tof_ww_is_pos_influent-r001      Candesartan        1
 7:      02_tof_ww_is_pos_influent-r002    Carbamazepine        1
 8:      02_tof_ww_is_pos_influent-r002 Sulfamethoxazole       3b
 9:      02_tof_ww_is_pos_influent-r002       Metoprolol        1
10:      02_tof_ww_is_pos_influent-r002       Diclofenac        1
11:      02_tof_ww_is_pos_influent-r002       Bisoprolol        1
12:      02_tof_ww_is_pos_influent-r002      Candesartan        1
13:      02_tof_ww_is_pos_influent-r003    Carbamazepine        1
14:      02_tof_ww_is_pos_influent-r003 Sulfamethoxazole        1
15:      02_tof_ww_is_pos_influent-r003       Metoprolol        1
16:      02_tof_ww_is_pos_influent-r003       Diclofenac       3b
17:      02_tof_ww_is_pos_influent-r003       Bisoprolol        1
18:      02_tof_ww_is_pos_influent-r003      Candesartan        1
19: 03_tof_ww_is_pos_o3sw_effluent-r001       Metoprolol        1
20: 03_tof_ww_is_pos_o3sw_effluent-r001       Bisoprolol        1
21: 03_tof_ww_is_pos_o3sw_effluent-r001      Candesartan        1
22: 03_tof_ww_is_pos_o3sw_effluent-r002       Metoprolol        1
23: 03_tof_ww_is_pos_o3sw_effluent-r002       Bisoprolol        1
24: 03_tof_ww_is_pos_o3sw_effluent-r002      Candesartan        1
25: 03_tof_ww_is_pos_o3sw_effluent-r003       Metoprolol        1
26: 03_tof_ww_is_pos_o3sw_effluent-r003       Bisoprolol        1
27: 03_tof_ww_is_pos_o3sw_effluent-r003      Candesartan        1
                               analysis             name id_level
ms$plot_suspects(colorBy = "targets+replicates")

Methods from patRoon

The NonTargetAnalysisResults object holds methods to obtain original objects from the patRoon package. For instance, the S4 class features or featureGroups objects can be obtained via the get_patRoon_features method of the NonTargetAnalysisResults object. The patRoon package provides a comprehensive set of functions that can then be applied, as shown below. See more information in the patRoon reference documentation.

# Native patRoon object
fGroups <- get_patRoon_features(ms$NonTargetAnalysisResults, filtered = FALSE, featureGroups = TRUE)

fGroups
A featureGroupsSet object
Hierarchy:
featureGroups
    |-- featureGroupsSet
      |-- featureGroupsScreeningSet
---
Object size (indication): 26.5 MB
Algorithm: openms-set
Feature groups: M236_R1236_301, M236_R1222_300, M236_R1139_302, M240_R945_325, M240_R966_329, M242_R916_347, ... (374 total)
Features: 1662 (4.4 per group)
Has normalized intensities: FALSE
Internal standards used for normalization: no
Predicted concentrations: none
Predicted toxicities: none
Analyses: 02_tof_ww_is_neg_influent-r001, 02_tof_ww_is_neg_influent-r002, 02_tof_ww_is_neg_influent-r003, 03_tof_ww_is_neg_o3sw_effluent-r001, 03_tof_ww_is_neg_o3sw_effluent-r002, 03_tof_ww_is_neg_o3sw_effluent-r003, ... (12 total)
Replicate groups: influent_neg, effluent_neg, influent_pos, effluent_pos (4 total)
Replicate groups used as blank: blank_neg, blank_pos (2 total)
Sets: negative, positive
# Using the native patRoon's plotUpSet method
patRoon::plotUpSet(fGroups)

UpSet plot of features

Fold-change analysis

The method get_fold_change() and the corresponding plot_fold_change() can be used to calculate and plot the fold-change between influent and effluent samples. The fold-change is calculated for each feature group and replicate, as shown below, and the plot shows the fold-change for each feature group across the replicates. The calculation follows Bader et al. (2017), leveraging the variance between replicates to increase the significance of the fold-change. Formation does not appear in the plot because no new features were detected in the effluent of the wastewater ozonation treatment step.

ms$plot_fold_change(
  replicatesIn = c("influent_neg", "influent_pos"),
  replicatesOut = c("effluent_neg", "effluent_pos"),
  filtered = FALSE,
  constantThreshold = 0.5,
  eliminationThreshold = 0.2,
  correctIntensity = TRUE,
  fillZerosWithLowerLimit = FALSE, # set to TRUE for filling zeros with lower limit argument
  lowerLimit = 200
)
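
As a rough intuition for what a fold-change expresses, the sketch below computes a plain effluent-to-influent ratio in base R from the averaged positive mode intensities returned by get_groups() earlier in this article. This is a simplification and not the procedure of Bader et al. (2017): it ignores the variance between replicates and any intensity correction, and the column names ratio_out_in and reduction_percent are only illustrative.

```r
# Averaged intensities copied from the get_groups() output above
grp <- data.frame(
  name = c("Metoprolol", "Bisoprolol", "Candesartan"),
  influent_pos = c(51331.574, 48375.419, 112540.651),
  effluent_pos = c(15412.60, 13762.82, 22512.84)
)

# Plain ratio between effluent (out) and influent (in)
grp$ratio_out_in <- grp$effluent_pos / grp$influent_pos

# Apparent reduction of the signal in percent
grp$reduction_percent <- (1 - grp$ratio_out_in) * 100

grp[, c("name", "ratio_out_in", "reduction_percent")]
```

A ratio well below 1 indicates a decrease in signal across the ozonation step, whereas a ratio close to 1 indicates little change.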

Alternatively, the plotVolcano function from patRoon can be used to plot the fold-change. For more information, check the patRoon reference documentation. Note that the function patRoon::getFCParams is used to define the parameters for the fold-change calculation. Below we use the default argument values and set only the in and out replicate names.

patRoon::plotVolcano(
  fGroups, # obtained above with get_patRoon_features
  patRoon::getFCParams(c("influent_pos", "effluent_pos"))
)

Volcano plot of fold-change

More to come

Future assets/functionalities:

  • Annotation of in-source fragments
  • Screening of transformation products using the biotransformer tool via patRoon