7.3 Example workflows

The next subsections demonstrate several approaches to perform a TP screening workflow with patRoon. In all examples it is assumed that feature groups were already obtained (with the findFeatures and groupFeatures functions) and stored in the fGroups variable.

The workflows with patRoon are designed to be flexible, and the examples here are primarily meant as a starting point for implementing your own workflow. Furthermore, some of the techniques used in the examples can also be combined. For instance, the fold change classification and MS/MS similarity filters applied in the fourth example could also be applied to any of the other examples.

7.3.1 Screen predicted TPs for targets

The first example is a simple workflow where TPs are predicted for a set of given parents with BioTransformer and subsequently screened. A MetFrag compound database is generated and used for annotation.

# predict TPs for a fixed list of parents
TPs <- generateTPs("biotransformer", parents = patRoonData::suspectsPos)

# screen for the TPs
suspectsTPs <- convertToSuspects(TPs, includeParents = FALSE)
fGroupsTPs <- screenSuspects(fGroups, suspectsTPs, adduct = "[M+H]+", onlyHits = TRUE)

# perform annotation of TPs
mslistsTPs <- generateMSPeakLists(fGroupsTPs, "mzr")
convertToMFDB(TPs, "TP-database.csv", includeParents = FALSE) # generate MetFrag database
compoundsTPs <- generateCompounds(fGroupsTPs, mslistsTPs, "metfrag", adduct = "[M+H]+", database = "csv",
                                  extraOpts = list(LocalDatabasePath = "TP-database.csv"))
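
To make the screening step more concrete, the following base R sketch shows the general idea of matching suspects to features by their expected [M+H]+ m/z. It is not patRoon's implementation: the suspect names, masses, feature group names, the `matchSuspects` helper and the 0.005 Da tolerance are all hypothetical values chosen for illustration.

```r
# Illustrative sketch only, not patRoon's implementation.
protonMass <- 1.007276 # mass added to a neutral molecule to form [M+H]+ (Da)

# hypothetical suspect list with neutral monoisotopic masses
suspects <- data.frame(
    name = c("TP-A", "TP-B", "TP-C"),
    neutralMass = c(151.0633, 194.0804, 230.1056),
    stringsAsFactors = FALSE
)

# hypothetical feature groups with measured m/z values
features <- data.frame(
    group = c("M152_R120", "M195_R310", "M301_R450"),
    featMz = c(152.0707, 195.0880, 301.1410),
    stringsAsFactors = FALSE
)

# expected m/z of the [M+H]+ ion for each suspect
suspects$suspMz <- suspects$neutralMass + protonMass

# compare every suspect with every feature and keep pairs within the tolerance
matchSuspects <- function(suspects, features, mzWindow = 0.005) {
    pairs <- merge(suspects, features, by = NULL) # Cartesian product
    pairs <- pairs[abs(pairs$suspMz - pairs$featMz) <= mzWindow, ]
    pairs[, c("name", "group", "suspMz", "featMz")]
}

hits <- matchSuspects(suspects, features)
print(hits)
```

In this toy data set only the first two suspects have a feature within the tolerance, which mirrors the effect of `onlyHits = TRUE`: feature groups without a matching suspect are dropped.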

7.3.2 Screening TPs from a library for suspects

In this example, TPs of interest are obtained for the parents that surfaced from a suspect screening. The steps of this workflow are:

  1. A suspect screening is performed for the parents.
  2. Obtain TPs for the suspect hits from a library.
  3. A second suspect screening is performed for TPs and the original parent screening results are amended. Note that the parent data is needed for componentization.
  4. Both parents and TPs are annotated using a database generated from their chemical structures.
  5. Some prioritization is performed by:
    1. Only keeping candidate structures for which in-silico fragmentation resulted in at least one annotated MS/MS peak.
    2. Only keeping suspect hits with an estimated identification level of 3 or better.
  6. The TP components are made and only feature groups with parent/TP assignments are kept.
  7. All results are reported.

# step 1
fGroupsScr <- screenSuspects(fGroups, patRoonData::suspectsPos, adduct = "[M+H]+")

# step 2
TPs <- generateTPs("library", parents = fGroupsScr)

# step 3
suspects <- convertToSuspects(TPs)
fGroupsScr <- screenSuspects(fGroupsScr, suspects, adduct = "[M+H]+", onlyHits = TRUE, amend = TRUE)

# step 4
mslistsScr <- generateMSPeakLists(fGroupsScr, "mzr")
convertToMFDB(TPs, "TP-database.csv", includeParents = TRUE)
compoundsScr <- generateCompounds(fGroupsScr, mslistsScr, "metfrag", adduct = "[M+H]+", database = "csv",
                                  extraOpts = list(LocalDatabasePath = "TP-database.csv"))

# step 5a
compoundsScr <- filter(compoundsScr, minExplainedPeaks = 1)

# step 5b
fGroupsScrAnn <- annotateSuspects(fGroupsScr, MSPeakLists = mslistsScr,
                                  compounds = compoundsScr)
fGroupsScrAnn <- filter(fGroupsScrAnn, maxLevel = 3, onlyHits = TRUE)

# step 6
componTP <- generateComponents(fGroupsScrAnn, "tp", TPs = TPs, MSPeakLists = mslistsScr,
                               compounds = compoundsScr)
fGroupsScrAnn <- fGroupsScrAnn[results = componTP]

# step 7
report(fGroupsScrAnn, MSPeakLists = mslistsScr, compounds = compoundsScr,
       components = componTP, TPs = TPs)
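
The library lookup in steps 2 and 3 can be pictured as a simple table filter. The base R sketch below is only an illustration of that idea: the layout of the library used by patRoon is not shown in this section, so the `TPLibrary` table, its column names and all entries are assumptions.

```r
# Illustrative sketch only: the library layout and all entries are hypothetical.
TPLibrary <- data.frame(
    parent = c("parentA", "parentA", "parentB", "parentC"),
    TP = c("TP-A1", "TP-A2", "TP-B1", "TP-C1"),
    TPNeutralMass = c(266.0949, 236.0843, 326.1949, 180.0423),
    stringsAsFactors = FALSE
)

# hypothetical parents that were found in the first suspect screening
parentHits <- c("parentA", "parentB")

# keep only library entries whose parent was actually detected
TPsForHits <- TPLibrary[TPLibrary$parent %in% parentHits, ]

# combine the detected parents and their TPs into one suspect list,
# similar in spirit to amending the original screening results
suspectNames <- unique(c(parentHits, TPsForHits$TP))

print(TPsForHits)
print(suspectNames)
```

Restricting the lookup to detected parents keeps the second screening focused: TPs of parents that were never found (here `parentC`) are not screened.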

7.3.3 Non-target screening of predicted TPs

This example uses metabolic logic to calculate possible TPs for all feature groups from a complete non-target screening. It demonstrates how a workflow can be performed when little is known about the identity of the parents. The steps of this workflow are:

  1. Formula annotations are performed for all feature groups.
  2. These results are then limited to the top 5 candidates, and only feature groups with annotations are kept.
  3. The TPs are calculated for all remaining feature groups.
  4. A suspect screening is performed to find the TPs. Unlike the previous example, feature groups without hits are kept (discussed here).
  5. The components are generated.
  6. The components are filtered:
    1. The TPs must follow an expected retention time direction.
    2. The parent/TPs should have at least one candidate formula that fits with the transformation.
  7. Only feature groups with parent/TP assignments are kept, and all results are reported.

# steps 1-2
mslists <- generateMSPeakLists(fGroups, "mzr")
formulas <- generateFormulas(fGroups, mslists, "genform", adduct = "[M+H]+")
formulas <- filter(formulas, topMost = 5)
fGroups <- fGroups[results = formulas]

# step 3
TPs <- generateTPs("logic", fGroups = fGroups, adduct = "[M+H]+")

# step 4
suspects <- convertToSuspects(TPs)
fGroupsScr <- screenSuspects(fGroups, suspects, adduct = "[M+H]+", onlyHits = FALSE)

# step 5
componTP <- generateComponents(fGroupsScr, "tp", TPs = TPs, MSPeakLists = mslists, formulas = formulas)

# step 6
componTP <- filter(componTP, retDirMatch = TRUE, formulas = formulas)

# step 7
fGroupsScr <- fGroupsScr[results = componTP]
report(fGroupsScr, MSPeakLists = mslists, formulas = formulas, components = componTP)
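
The idea behind metabolic logic and the retention time direction filter can be sketched in base R as follows. This is a simplified illustration, not the rule set or algorithm used by patRoon: the four transformations are a small hand-picked subset, the `retDir` values (-1 = TP expected to elute earlier, 1 = later, 0 = no clear expectation) and the `minRTDiff` threshold are assumptions, and `calcLogicTPs` and `retDirMatches` are hypothetical helper names.

```r
# Illustrative sketch only: transformations, retDir values and thresholds are assumptions.
transformations <- data.frame(
    transformation = c("hydroxylation", "demethylation", "dehydrogenation", "glucuronidation"),
    deltaMass = c(15.994915, -14.015650, -2.015650, 176.032088), # monoisotopic shifts (Da)
    retDir = c(-1, -1, 0, -1),
    stringsAsFactors = FALSE
)

# apply every transformation to every parent mass
calcLogicTPs <- function(parentMasses, transformations) {
    parents <- data.frame(parent = names(parentMasses),
                          parentMass = unname(parentMasses),
                          stringsAsFactors = FALSE)
    res <- merge(parents, transformations, by = NULL) # Cartesian product
    res$TPMass <- res$parentMass + res$deltaMass
    res
}

# TRUE when the observed elution order does not contradict the expected one;
# small retention time differences are treated as "no direction"
retDirMatches <- function(parentRT, TPRT, expectedDir, minRTDiff = 5) {
    rtDiff <- TPRT - parentRT
    observedDir <- ifelse(abs(rtDiff) < minRTDiff, 0, sign(rtDiff))
    expectedDir == 0 | observedDir == 0 | expectedDir == observedDir
}

parentMasses <- c(parentA = 250.1000, parentB = 310.2000) # hypothetical neutral masses
logicTPs <- calcLogicTPs(parentMasses, transformations)
print(logicTPs)

keepEarlier <- retDirMatches(parentRT = 600, TPRT = 450, expectedDir = -1) # TP elutes earlier
keepLater <- retDirMatches(parentRT = 600, TPRT = 720, expectedDir = -1)   # TP elutes later
print(c(keepEarlier, keepLater))
```

Because every feature group is treated as a potential parent, the number of calculated TPs grows quickly, which is why the workflow above relies on additional filters such as the retention time direction and formula checks.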

7.3.4 Non-target screening of TPs by annotation similarities

This example shows a workflow where no TP data from a prediction or library is used. Instead, this workflow relies on statistics and MS/MS data to find feature groups that may have a parent-TP relationship. The workflow is similar to that of the previous example. The steps of this workflow are:

  1. Fold changes (FC) between two sample groups are calculated to classify which feature groups are decreasing (i.e. parents) or increasing (i.e. TPs).
  2. Feature groups without classification are removed.
  3. Formula annotations are performed as in the previous example.
  4. The componentization is performed and the FC classifications are used to specify which feature groups are to be considered parents or TPs.
  5. Only TPs that show a high MS/MS spectral similarity and share at least one fragment with their parent are kept.
  6. Only feature groups with parent/TP assignments are kept, and all results are reported.

# step 1
tab <- as.data.table(fGroups, FCParams = getFCParams(c("before", "after")))
groupsParents <- tab[classification == "decrease"]$group
groupsTPs <- tab[classification == "increase"]$group

# step 2
fGroups <- fGroups[, union(groupsParents, groupsTPs)]

# step 3
mslists <- generateMSPeakLists(fGroups, "mzr")
formulas <- generateFormulas(fGroups, mslists, "genform", adduct = "[M+H]+")
formulas <- filter(formulas, topMost = 5)
fGroups <- fGroups[results = formulas]

# step 4
componTP <- generateComponents(algorithm = "tp",
                               fGroups = fGroups[, groupsParents],
                               fGroupsTPs = fGroups[, groupsTPs],
                               MSPeakLists = mslists, formulas = formulas)

# step 5
componTP <- filter(componTP, minSpecSimBoth = 0.75, minFragMatches = 1)

# step 6
fGroups <- fGroups[results = componTP]
report(fGroups, MSPeakLists = mslists, formulas = formulas, components = componTP)
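
The two data-driven criteria used in this workflow, fold change classification and MS/MS similarity, can be illustrated with a small base R sketch. It is a simplification under stated assumptions and not patRoon's implementation: the `classifyFC` and `specSimilarity` helpers, the thresholds and all intensities are hypothetical, statistical testing of the fold changes is omitted, and the similarity is a plain cosine score on binned fragments.

```r
# Illustrative sketch only: helpers, thresholds and data are hypothetical.

# classify feature groups by the log2 fold change between two sample groups
# (rows = feature groups, columns = replicates)
classifyFC <- function(before, after, thresholdFC = 1, zeroValue = 0.01) {
    meanBefore <- pmax(rowMeans(before), zeroValue) # avoid division by zero
    meanAfter <- pmax(rowMeans(after), zeroValue)
    log2FC <- log2(meanAfter / meanBefore)
    ifelse(log2FC >= thresholdFC, "increase",
           ifelse(log2FC <= -thresholdFC, "decrease", "insignificant"))
}

# cosine similarity and number of shared fragments of two binned MS/MS spectra
specSimilarity <- function(specA, specB, binWidth = 0.01) {
    binsA <- round(specA$mz / binWidth)
    binsB <- round(specB$mz / binWidth)
    allBins <- union(binsA, binsB)
    toVector <- function(bins, intensities) {
        v <- setNames(numeric(length(allBins)), allBins)
        summed <- tapply(intensities, bins, sum) # sum peaks that fall in the same bin
        v[names(summed)] <- summed
        v
    }
    vecA <- toVector(binsA, specA$intensity)
    vecB <- toVector(binsB, specB$intensity)
    list(similarity = sum(vecA * vecB) / sqrt(sum(vecA^2) * sum(vecB^2)),
         fragMatches = sum(vecA > 0 & vecB > 0))
}

groupNames <- c("fg1", "fg2", "fg3")
before <- matrix(c(1000, 1200, 50, 60, 500, 520), nrow = 3, byrow = TRUE,
                 dimnames = list(groupNames, NULL))
after <- matrix(c(100, 120, 900, 1000, 510, 490), nrow = 3, byrow = TRUE,
                dimnames = list(groupNames, NULL))
classes <- classifyFC(before, after)
print(classes)

parentSpec <- data.frame(mz = c(91.054, 119.049, 152.071), intensity = c(40, 100, 60))
TPSpec <- data.frame(mz = c(91.054, 119.049, 168.066), intensity = c(35, 100, 50))
sim <- specSimilarity(parentSpec, TPSpec)
print(sim)
```

In this toy example the decreasing feature group would be treated as a parent and the increasing one as a TP, and the spectrum pair shares two fragments, so it would pass thresholds like those used in step 5.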