5.1 Inspecting results

Several generic functions can be used to inspect the data stored in a workflow object (e.g. features, compounds etc.):

| Generic | Classes | Remarks |
| --- | --- | --- |
| length() | All | Returns the length of the object (e.g. number of features, compounds etc.). |
| algorithm() | All | Returns the name of the algorithm used to generate the object. |
| groupNames() | All | Returns all the unique identifiers (or names) of the feature groups for which this object contains results. |
| names() | featureGroups, components | Returns the names of the feature groups (similar to groupNames()) or components. |
| show() | All | Prints general information. |
| "[[" / "$" operators | All | Extract general information, see below. |
| as.data.table() / as.data.frame() | All | Convert data to a data.table or data.frame, see below. |
| analysisInfo(), analyses(), replicateGroups() | features, featureGroups | Returns the analysis information, analyses or replicate groups for which this object contains data. |
| groupInfo() | featureGroups | Returns feature group information (m/z and retention time values). |
| screenInfo() | featureGroupsScreening | Returns information on hits from suspect screening. |
| componentInfo() | components | Returns information for all components. |
| annotatedPeakList() | formulas, compounds | Returns a table with annotated mass peaks (see below). |

The common R extraction operators "[[" and "$" can be used to obtain data for a particular feature group, analysis etc.:

# Feature table (only first columns for readability)
fList[["standard-pos-1"]][, 1:6]
# Feature group intensities
fGroups$M120_R268_30
#> [1] 264836 245372 216560
fGroups[[1, "M120_R268_30"]] # only first analysis
#> [1] 264836
# Obtain the MS/MS peak list (feature group averaged data)
mslists[["M120_R268_30"]]$MSMS
#>     ID       mz  intensity precursor
#>  1:  5 105.0698   6183.111     FALSE
#>  2:  6 106.0653   7643.556     FALSE
#>  3:  8 107.0728   7760.667     FALSE
#>  4: 15 120.0556 168522.667      TRUE
#>  5: 17 121.0587  13894.667     FALSE
#>  6: 18 121.0884  10032.889     FALSE
#>  7: 19 122.0964 147667.778     FALSE
#>  8: 20 123.0803  36631.111     FALSE
#>  9: 21 123.0996  15482.444     FALSE
#> 10: 22 124.0805  35580.667     FALSE
# get all formula candidates for a feature group
formulas[["M120_R268_30"]][, 1:7]
#>    neutral_formula ion_formula neutralMass ion_formula_mz error dbe isoScore
#> 1:          C6H5N3      C6H6N3    119.0483       120.0556   1.8   6  0.92461
# get all compound candidates for a feature group
compounds[["M120_R268_30"]][, 1:4]
#>     explainedPeaks     score neutralMass                   SMILES
#>  1:              0 2.7688810    119.0483        C1=CC2=NNN=C2C=C1
#>  2:              0 1.0370994    119.0483      C1=CC2=C(N=C1)N=CN2
#>  3:              0 1.0280899    119.0483        C1=CNC2=CN=CN=C21
#>  4:              0 1.0013143    119.0483        C1=CN=CC(=C1N)C#N
#>  5:              0 0.9859441    119.0483        C1=CC(=NC=C1N)C#N
#> ---                                                              
#> 33:              0 0.7395368    119.0483        C1=CC2=NC=NN2C=C1
#> 34:              0 0.7374733    119.0483        C1=CN2C=NC=C2N=C1
#> 35:              0 0.7353883    119.0483        C1=CNC2=C1C=CN=N2
#> 36:              0 0.5346503    119.0483 C1=CC(=[N+]=[N-])C=CC1=N
#> 37:              0 0.3951816    119.0483         C(C#N)C(CC#N)C#N
# get a table with information of a component
components[["CMP7"]][, 1:6]
#>            group     ret       mz isogroup isonr charge
#> 1:  M143_R206_64 205.787 143.0700       NA    NA     NA
#> 2: M159_R208_103 208.280 159.0650       NA    NA     NA
#> 3: M161_R208_104 207.582 161.0806       NA    NA     NA
#> 4: M181_R209_159 208.580 181.0469       NA    NA     NA

A more flexible way to obtain data from a workflow object is to use as.data.table() or as.data.frame(). These functions convert all information within the object to a single table (data.table or data.frame) and accept various arguments to include extra information. The output is in a tidy format, so it can be used directly with many other functions in R.

NOTE If you are not familiar with data.table and want to know more, see the data.table documentation. Briefly, it is a more efficient and largely compatible alternative to the regular data.frame.

NOTE The as.data.frame() methods defined in patRoon simply convert the results from as.data.table(). Both functions are therefore used in the same way and are defined for the same object classes.

Some typical examples are shown below.

# obtain table with all features (only first columns for readability)
as.data.table(fList)[, 1:6]
#>             analysis                     ID     ret        mz      area intensity
#>    1:  solvent-pos-1 f_11903145554860433513  13.176  98.97537 4345232.0    391476
#>    2:  solvent-pos-1  f_7481040230307759036   7.181 100.11197  797112.1    426956
#>    3:  solvent-pos-1   f_644811898557516293 192.178 100.11211 9609998.0    750532
#>    4:  solvent-pos-1  f_5487657165652178992  19.171 100.11217 5784411.0    370376
#>    5:  solvent-pos-1  f_1469161362750806317   4.786 100.11220  551723.6    567312
#>   ---                                                                            
#> 2922: standard-pos-3 f_15297329768986667712 318.892 425.18866  666531.5    232636
#> 2923: standard-pos-3  f_8014968850794669988   9.114 427.03242  362024.1    114744
#> 2924: standard-pos-3  f_9473037910904233980 318.892 427.18678  200193.5     77768
#> 2925: standard-pos-3  f_4746450224277838191 382.682 432.23984  217612.9     97648
#> 2926: standard-pos-3  f_6439863428202709037   9.114 433.00457 3086864.0    912920
# Returns group info and intensity values for each feature group
as.data.table(fGroups, average = TRUE) # average intensities for replicates
#>              group      ret       mz standard-pos
#>   1:  M109_R192_20 191.8717 109.0759    183482.67
#>   2:  M111_R330_23 330.4078 111.0439     84598.67
#>   3:  M114_R269_25 268.6906 114.0912     85796.00
#>   4:  M116_R317_29 316.7334 116.0527    766888.00
#>   5:  M120_R268_30 268.4078 120.0554    242256.00
#>  ---                                             
#> 137: M316_R363_635 363.4879 316.1741     89904.00
#> 138: M318_R349_638 349.1072 318.1450     83320.00
#> 139: M352_R335_664 334.9403 352.2019     74986.67
#> 140: M407_R239_672 239.3567 407.2227    186568.00
#> 141: M425_R319_676 319.4944 425.1885    214990.67
# As above, but with suspect matches on separate rows and additional screening information
# (select some columns to simplify the output below)
as.data.table(fGroupsSusp, average = TRUE, collapseSuspects = NULL,
              onlyHits = TRUE)[, c("group", "susp_name", "susp_compRank", "susp_annSimBoth", "susp_estIDLevel")]
#>           group          susp_name susp_compRank susp_annSimBoth susp_estIDLevel
#> 1: M120_R268_30   1H-benzotriazole             1       0.0000000              4b
#> 2: M137_R249_53      N-Phenyl urea             1       0.6443557               5
#> 3: M146_R309_68 2-Hydroxyquinoline             2       0.9896892              3c
#> 4: M146_R248_69 2-Hydroxyquinoline            NA              NA               5
#> 5: M146_R225_70 2-Hydroxyquinoline            NA              NA               5
# Returns all peak lists for each feature group
as.data.table(mslists)
#>              group type ID       mz  intensity precursor
#>   1:  M120_R268_30   MS  1 100.1120 178952.381     FALSE
#>   2:  M120_R268_30   MS  2 102.1277 202359.667     FALSE
#>   3:  M120_R268_30   MS  3 114.0912  37647.548     FALSE
#>   4:  M120_R268_30   MS  4 115.0752  66685.238     FALSE
#>   5:  M120_R268_30   MS  5 120.0554 113335.857      TRUE
#>  ---                                                    
#> 235: M192_R355_191   MS 51 299.1274  44083.126     FALSE
#> 236: M192_R355_191   MS 52 299.1471   7390.267     FALSE
#> 237: M192_R355_191 MSMS 14 119.0496 588372.444     FALSE
#> 238: M192_R355_191 MSMS 18 120.0524  70273.333     FALSE
#> 239: M192_R355_191 MSMS 31 192.1384  71978.667      TRUE
# Returns all formula candidates for each feature group with scoring
# information, neutral loss etc
as.data.table(formulas)[, 1:6]
#>            group neutral_formula ion_formula neutralMass ion_formula_mz      error
#> 1:  M120_R268_30          C6H5N3      C6H6N3    119.0483       120.0556 1.80000000
#> 2:  M137_R249_53         C7H8N2O     C7H9N2O    136.0637       137.0709 2.90000000
#> 3:  M146_R309_68          C9H7NO      C9H8NO    145.0528       146.0600 1.66666667
#> 4: M192_R355_191        C12H17NO    C12H18NO    191.1310       192.1383 0.03333333
# Returns all compound candidates for each feature group with scoring and other metadata
as.data.table(compounds)[, 1:4]
#>              group explainedPeaks     score neutralMass
#>   1:  M120_R268_30              0 2.7688810    119.0483
#>   2:  M120_R268_30              0 1.0370994    119.0483
#>   3:  M120_R268_30              0 1.0280899    119.0483
#>   4:  M120_R268_30              0 1.0013143    119.0483
#>   5:  M120_R268_30              0 0.9859441    119.0483
#>  ---                                                   
#> 275: M192_R355_191              0 0.8852525    191.1310
#> 276: M192_R355_191              0 0.8822446    191.1310
#> 277: M192_R355_191              1 0.8814458    191.1310
#> 278: M192_R355_191              0 0.8584452    191.1310
#> 279: M192_R355_191              1 0.7970401    191.1310
# Returns table with all components (including feature group info, annotations etc)
as.data.table(components)[, 1:6]
#>      name  cmp_ret cmp_retsd       neutral_mass       analysis size
#>  1:  CMP1 347.2914 0.0000000               <NA> standard-pos-2    2
#>  2:  CMP1 347.2914 0.0000000               <NA> standard-pos-2    2
#>  3:  CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3    6
#>  4:  CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3    6
#>  5:  CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3    6
#> ---                                                                
#> 88: CMP29 313.3475 0.3105035               <NA> standard-pos-2    3
#> 89: CMP29 313.3475 0.3105035               <NA> standard-pos-2    3
#> 90: CMP30 268.3430 0.3840764           81.08705 standard-pos-1    3
#> 91: CMP30 268.3430 0.3840764           81.08705 standard-pos-1    3
#> 92: CMP30 268.3430 0.3840764           81.08705 standard-pos-1    3
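
Because the converted result is a regular data.table, standard data.table syntax can be applied to it afterwards. The sketch below is not part of patRoon: it builds a small stand-in table by hand (values copied from the first rows of the averaged feature group table above) so that it runs without workflow data. The intensity threshold and retention time window are arbitrary choices for illustration.

```r
library(data.table)

# Hypothetical stand-in for as.data.table(fGroups, average = TRUE);
# in a real workflow the table would come from that call instead.
tab <- data.table(
    group = c("M109_R192_20", "M111_R330_23", "M114_R269_25",
              "M116_R317_29", "M120_R268_30"),
    ret = c(191.8717, 330.4078, 268.6906, 316.7334, 268.4078),
    mz = c(109.0759, 111.0439, 114.0912, 116.0527, 120.0554),
    "standard-pos" = c(183482.67, 84598.67, 85796.00, 766888.00, 242256.00)
)

# Keep feature groups above an (arbitrary) average intensity threshold.
# The column name contains a hyphen, so it must be quoted with backticks.
intense <- tab[`standard-pos` > 1E5]

# Sort feature groups by decreasing average intensity
ranked <- tab[order(-`standard-pos`)]

# Select feature groups eluting within an (arbitrary) retention time window
window <- tab[ret >= 260 & ret <= 270, .(group, ret, mz)]

print(intense)
print(ranked$group)
print(window)
```

The same operations work on the tables returned for features, peak lists, formulas, compounds and components, as long as the column names are adjusted to those of the respective table.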

Finally, the annotatedPeakList() function is useful to inspect annotation results for a formula or compound candidate:

# formula annotations for the first formula candidate of feature group M137_R249_53
annotatedPeakList(formulas, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#>     ID        mz intensity precursor ion_formula dbe ion_formula_mz error neutral_loss annotated
#>  1:  2  94.06500  9406.111     FALSE       C6H8N 3.5       94.06513  1.30         CHNO      TRUE
#>  2:  6  98.97522  2212.000     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#>  3:  7 105.06971  1662.111     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#>  4: 14 120.04434  7176.222     FALSE      C7H6NO 5.5      120.04439  0.40          H3N      TRUE
#>  5: 19 122.07222  2246.000     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#>  6: 21 135.08004  1565.556     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#>  7: 23 137.07039  5348.667      TRUE     C7H9N2O 4.5      137.07094  3.35                   TRUE
#>  8: 24 137.09572  2026.889     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#>  9: 26 138.09116 12356.667     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
#> 10: 27 139.07503  5020.667     FALSE        <NA>  NA             NA    NA         <NA>     FALSE
# compound annotation for first candidate of feature group M137_R249_53
annotatedPeakList(compounds, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#>     ID        mz intensity precursor ion_formula ion_formula_MF neutral_loss score annotated
#>  1:  2  94.06500  9406.111     FALSE       C6H8N   [C6H6N+H]+H+         CHNO   405      TRUE
#>  2:  6  98.97522  2212.000     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  3:  7 105.06971  1662.111     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  4: 14 120.04434  7176.222     FALSE      C7H6NO      [C7H6NO]+          H3N   305      TRUE
#>  5: 19 122.07222  2246.000     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  6: 21 135.08004  1565.556     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  7: 23 137.07039  5348.667      TRUE        <NA>           <NA>         <NA>    NA     FALSE
#>  8: 24 137.09572  2026.889     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  9: 26 138.09116 12356.667     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#> 10: 27 139.07503  5020.667     FALSE        <NA>           <NA>         <NA>    NA     FALSE

More advanced examples for these functions are shown below.

# Feature table, can also be accessed by numeric index
fList[[1]]
mslists[["standard-pos-1", "M120_R268_30"]] # feature data (instead of feature group averaged)
formulas[[1, "M120_R268_30"]] # feature data (if available, i.e. calculateFeatures=TRUE)
components[["CMP1", 1]] # only for first feature group in component

as.data.frame(fList) # classic data.frame format, works for all objects
as.data.table(fGroups) # return non-averaged intensities (default)
as.data.table(fGroups, features = TRUE) # include feature information
as.data.table(mslists, averaged = FALSE) # peak lists for each feature
as.data.table(mslists, fGroups = fGroups) # add feature group information

as.data.table(formulas, countElements = c("C", "H")) # include C/H counts (e.g. for van Krevelen plots)
# add various information for organic matter characterization (common elemental
# counts/ratios, classifications etc)
as.data.table(formulas, OM = TRUE)

as.data.table(compounds, fGroups = fGroups) # add feature group information
as.data.table(compounds, fragments = TRUE) # include information of all annotated fragments

annotatedPeakList(formulas, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, onlyAnnotated = TRUE) # only include annotated peaks
annotatedPeakList(compounds, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, formulas = formulas) # include formula annotations