5.1 Inspecting results

Several generic functions can be used to inspect the data stored in a particular object (e.g. features, compounds etc.):

| Generic | Classes | Remarks |
|---|---|---|
| length() | All | Returns the length of the object (e.g. number of features, compounds etc.). |
| algorithm() | All | Returns the name of the algorithm used to generate the object. |
| groupNames() | All | Returns all the unique identifiers (or names) of the feature groups for which this object contains results. |
| names() | featureGroups, components | Returns names of the feature groups (similar to groupNames()) or components. |
| show() | All | Prints general information. |
| "[[" / "$" operators | All | Extract general information, see below. |
| as.data.table() / as.data.frame() | All | Convert data to a data.table or data.frame, see below. |
| analysisInfo(), analyses(), replicates() | features, featureGroups | Returns the analysis information, analyses or replicates for which this object contains data. |
| groupInfo() | featureGroups | Returns feature group information (m/z and retention time values). |
| screenInfo() | featureGroupsScreening | Returns information on hits from suspect screening. |
| componentInfo() | components | Returns information for all components. |
| annotatedPeakList() | formulas, compounds | Returns a table with annotated mass peaks (see below). |

The common R extraction operators "[[" and "$" are used to obtain data for a particular feature group, analysis etc.:

# Feature table (only first columns for readability)
fList[["standard-pos-1"]][, 1:6]
#>                          ID     ret        mz       area intensity  retmin
#>                      <char>   <num>     <num>      <num>     <num>   <num>
#>   1: f_16347712658720538540  14.139  98.97533 4778875.00    330176   5.945
#>   2: f_10395879499927939455   4.750  98.97542   96640.85    125856   4.351
#>   3:   f_453007312256804411   7.144 100.11199  470442.00    283356   5.945
#>   4: f_17185811996419289811  28.127 100.11208 5225358.00    304644  11.141
#>   5:  f_6923213961671746702   4.550 100.11208  168080.50     76724   1.961
#>  ---                                                                      
#> 543:  f_2689651329902935219 383.363 415.21303  364565.80    135352 381.122
#> 544:  f_7335751044087049528   9.143 425.15511  415152.60    121928   8.143
#> 545: f_14461495817845968238 319.805 425.18879  732124.40    210844 317.060
#> 546:  f_3994796688206637648  10.142 427.03246  365056.90    114896   8.143
#> 547:  f_3073950847809437369   9.143 433.00456 3165097.00    946000   8.143
# Feature group intensities
fGroups$M120_R268_30
#> [1] 264836 245372 216560
fGroups[[1, "M120_R268_30"]] # only first analysis
#> [1] 264836
# obtain MS/MS peak list (feature group averaged data)
mslists[["M120_R268_30"]]$MSMS
#>        ID       mz  intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor
#>     <int>    <num>      <num>                <num>                <num>              <num>              <num>    <lgcl>
#>  1:     5 105.0698   6183.111                    1                    3                  1           2.666667     FALSE
#>  2:     6 106.0652   7643.556                    1                    3                  1           2.666667     FALSE
#>  3:     8 107.0728   7760.667                    1                    3                  1           2.666667     FALSE
#>  4:    15 120.0556 168522.672                    1                    3                  1           2.666667      TRUE
#>  5:    17 121.0587  13894.667                    1                    3                  1           2.666667     FALSE
#>  6:    18 121.0883  10032.888                    1                    3                  1           2.666667     FALSE
#>  7:    19 122.0964 147667.766                    1                    3                  1           2.666667     FALSE
#>  8:    20 123.0803  36631.109                    1                    3                  1           2.666667     FALSE
#>  9:    21 123.0996  15482.445                    1                    3                  1           2.666667     FALSE
#> 10:    22 124.0806  35580.668                    1                    3                  1           2.666667     FALSE
# get all formula candidates for a feature group
formulas[["M120_R268_30"]][, 1:7]
#>    neutral_formula ion_formula neutralMass ion_formula_mz    error   dbe isoScore
#>             <char>      <char>       <num>          <num>    <num> <num>    <num>
#> 1:          C6H5N3      C6H6N3    119.0483       120.0556 1.566667     6  0.92461
# get all compound candidates for a feature group
compounds[["M120_R268_30"]][, 1:4]
#>     explainedPeaks     score neutralMass                   SMILES
#>              <int>     <num>       <num>                   <char>
#>  1:              0 3.0000000    119.0483        C1=CC2=NNN=C2C=C1
#>  2:              0 0.7542007    119.0483      C1=CC2=C(N=C1)N=CN2
#>  3:              0 0.4403258    119.0483        C1=CNC2=CN=CN=C21
#>  4:              0 0.3780081    119.0483      C1=CC2=C(C=NN2)N=C1
#>  5:              0 0.3366106    119.0483      C1=CN2C(=CC=N2)N=C1
#> ---                                                              
#> 37:              0 0.1259247    119.0483          C1=CN=CN=C1CC#N
#> 38:              0 0.1258830    119.0483        CC1=CN=CC(=N1)C#N
#> 39:              0 0.1250904    119.0483          CC1=CN=CN=C1C#N
#> 40:              0 0.1250209    119.0483          C#CC1=NC=CN=C1N
#> 41:              0 0.1250000    119.0483 C1=CC(=[N+]=[N-])C=CC1=N
# get a table with information of a component
components[["CMP7"]][, 1:6]
#>            group     ret       mz isogroup isonr charge
#>           <char>   <num>    <num>    <num> <num>  <num>
#> 1:  M143_R206_64 205.787 143.0700       NA    NA     NA
#> 2: M159_R208_103 208.280 159.0650       NA    NA     NA
#> 3: M161_R208_104 207.582 161.0806       NA    NA     NA
#> 4: M181_R209_159 208.580 181.0469       NA    NA     NA

A more flexible way to obtain data from a workflow object is to use as.data.table() or as.data.frame(). These functions convert all information within the object to a table (data.table or data.frame) and provide various options to add extra information. An advantage is that this common data format works with many other functions within R. The output is in a tidy format.

NOTE If you are not familiar with data.table and want to know more, see the data.table documentation. Briefly, data.table is a more efficient and largely compatible alternative to the regular data.frame.

NOTE The as.data.frame() methods defined in patRoon simply convert the results from as.data.table(). Hence, both functions are used in the same way and are defined for the same object classes.

Some typical examples are shown below.

# obtain table with all features (only first columns for readability)
as.data.table(fList)[, 1:6]
#>             analysis                     ID     ret        mz      area intensity
#>               <char>                 <char>   <num>     <num>     <num>     <num>
#>    1:  solvent-pos-1 f_12860273722894428192  13.176  98.97537 4345232.0    391476
#>    2:  solvent-pos-1  f_9596132961704643617   7.181 100.11197  797112.1    426956
#>    3:  solvent-pos-1  f_6371335420420248621 192.178 100.11211 9609998.0    750532
#>    4:  solvent-pos-1  f_6506496206746423615  19.171 100.11217 5784411.0    370376
#>    5:  solvent-pos-1  f_5121157124211719533   4.786 100.11220  551723.6    567312
#>   ---                                                                            
#> 2922: standard-pos-3  f_2042036435154018996 318.892 425.18866  666531.5    232636
#> 2923: standard-pos-3 f_10839998681702082513   9.114 427.03242  362024.1    114744
#> 2924: standard-pos-3 f_15164217360460697802 318.892 427.18678  200193.5     77768
#> 2925: standard-pos-3  f_8695446765189507635 382.682 432.23984  217612.9     97648
#> 2926: standard-pos-3  f_4776065115895602396   9.114 433.00457 3086864.0    912920
# Returns group info and intensity values for each feature group
as.data.table(fGroups, average = TRUE) # average intensities for replicates
#>              group      ret       mz standard-pos_intensity
#>             <char>    <num>    <num>                  <num>
#>   1:  M109_R192_20 191.8717 109.0759              183482.67
#>   2:  M111_R330_23 330.4078 111.0439               84598.67
#>   3:  M114_R269_25 268.6906 114.0912               85796.00
#>   4:  M116_R317_29 316.7334 116.0527              766888.00
#>   5:  M120_R268_30 268.4078 120.0554              242256.00
#>  ---                                                       
#> 137: M316_R363_635 363.4879 316.1741               89904.00
#> 138: M318_R349_638 349.1072 318.1450               83320.00
#> 139: M352_R335_664 334.9403 352.2019               74986.67
#> 140: M407_R239_672 239.3567 407.2227              186568.00
#> 141: M425_R319_676 319.4944 425.1885              214990.67
# As above, but with suspect matches on separate rows and additional screening information
# (select some columns to simplify the output below)
as.data.table(fGroupsSusp, average = TRUE, collapseSuspects = NULL,
              onlyHits = TRUE)[, c("group", "susp_name", "susp_compRank", "susp_annSimComp", "susp_estIDLevel")]
#>           group          susp_name susp_compRank susp_annSimComp susp_estIDLevel
#>          <char>             <char>         <int>           <num>          <char>
#> 1: M120_R268_30   1H-benzotriazole             1       0.0000000              4b
#> 2: M137_R249_53      N-Phenyl urea             1       0.6443557              3a
#> 3: M146_R309_68 2-Hydroxyquinoline             2       0.9896892              3a
#> 4: M146_R248_69 2-Hydroxyquinoline            NA              NA               5
#> 5: M146_R225_70 2-Hydroxyquinoline            NA              NA               5
# Returns all peak lists for each feature group
as.data.table(mslists)
#>              group   type    ID       mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor
#>             <char> <char> <int>    <num>     <num>                <num>                <num>              <num>              <num>    <lgcl>
#>   1:  M120_R268_30     MS     1 100.1120 178952.38            1.0000000                    3          1.0000000           7.666667     FALSE
#>   2:  M120_R268_30     MS     2 102.1277 202359.67            1.0000000                    3          1.0000000           7.666667     FALSE
#>   3:  M120_R268_30     MS     3 114.0912  37647.55            1.0000000                    3          0.5654762           4.333333     FALSE
#>   4:  M120_R268_30     MS     4 115.0752  66685.24            1.0000000                    3          1.0000000           7.666667     FALSE
#>   5:  M120_R268_30     MS     5 120.0555 113335.85            1.0000000                    3          1.0000000           7.666667      TRUE
#>  ---                                                                                                                                        
#> 236: M192_R355_191     MS    52 298.1328  16943.31            0.6666667                    2          0.2545455           2.666667     FALSE
#> 237: M192_R355_191     MS    53 299.1274  45880.92            1.0000000                    3          0.4060606           4.333333     FALSE
#> 238: M192_R355_191   MSMS    14 119.0496 588372.44            1.0000000                    3          1.0000000           3.000000     FALSE
#> 239: M192_R355_191   MSMS    19 120.0524  70273.34            1.0000000                    3          1.0000000           3.000000     FALSE
#> 240: M192_R355_191   MSMS    32 192.1383  71978.66            1.0000000                    3          1.0000000           3.000000      TRUE
# Returns all formula candidates for each feature group with scoring
# information, neutral loss etc
as.data.table(formulas)[, 1:6]
#>            group neutral_formula ion_formula neutralMass ion_formula_mz     error
#>           <char>          <char>      <char>       <num>          <num>     <num>
#> 1:  M120_R268_30          C6H5N3      C6H6N3    119.0483       120.0556  1.566667
#> 2:  M137_R249_53         C7H8N2O     C7H9N2O    136.0637       137.0709  2.400000
#> 3:  M146_R309_68          C9H7NO      C9H8NO    145.0528       146.0600  1.400000
#> 4: M192_R355_191        C12H17NO    C12H18NO    191.1310       192.1383 -1.966667
# Returns all compound candidates for each feature group with scoring and other metadata
as.data.table(compounds)[, 1:4]
#>              group explainedPeaks     score neutralMass
#>             <char>          <int>     <num>       <num>
#>   1:  M120_R268_30              0 3.0000000    119.0483
#>   2:  M120_R268_30              0 0.7542007    119.0483
#>   3:  M120_R268_30              0 0.4403258    119.0483
#>   4:  M120_R268_30              0 0.3780081    119.0483
#>   5:  M120_R268_30              0 0.3366106    119.0483
#>  ---                                                   
#> 293: M192_R355_191              1 0.7481879    191.1310
#> 294: M192_R355_191              1 0.7443703    191.1310
#> 295: M192_R355_191              1 0.7442086    191.1310
#> 296: M192_R355_191              1 0.7437880    191.1310
#> 297: M192_R355_191              1 0.7437233    191.1310
# Returns table with all components (including feature group info, annotations etc)
as.data.table(components)[, 1:6]
#>       name  cmp_ret cmp_retsd       neutral_mass       analysis  size
#>     <char>    <num>     <num>             <char>         <char> <int>
#>  1:   CMP1 347.2914 0.0000000               <NA> standard-pos-2     2
#>  2:   CMP1 347.2914 0.0000000               <NA> standard-pos-2     2
#>  3:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#>  4:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#>  5:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#> ---                                                                  
#> 88:  CMP29 313.3475 0.3105035               <NA> standard-pos-2     3
#> 89:  CMP29 313.3475 0.3105035               <NA> standard-pos-2     3
#> 90:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
#> 91:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
#> 92:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
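
Since the converted output is an ordinary table, it can be processed further with standard R tools. The sketch below does not call patRoon: it uses hand-built stand-ins whose values are copied from the outputs shown above, and the intensity threshold is an arbitrary, hypothetical choice. It first checks that averaging the three replicate intensities of M120_R268_30 reproduces the value reported with average = TRUE, and then filters and sorts a small table by intensity.

```r
# Intensities of feature group M120_R268_30 in the three replicate analyses,
# copied from the fGroups$M120_R268_30 output shown earlier
intensities <- c(264836, 245372, 216560)

# Averaging the replicates should match the averaged table (row M120_R268_30)
avgIntensity <- mean(intensities)
print(avgIntensity)

# Hand-built stand-in for the first rows of as.data.frame(fGroups, average = TRUE).
# check.names = FALSE keeps the non-syntactic column name "standard-pos_intensity".
tab <- data.frame(
    group = c("M109_R192_20", "M111_R330_23", "M114_R269_25",
              "M116_R317_29", "M120_R268_30"),
    ret = c(191.8717, 330.4078, 268.6906, 316.7334, 268.4078),
    mz = c(109.0759, 111.0439, 114.0912, 116.0527, 120.0554),
    "standard-pos_intensity" = c(183482.67, 84598.67, 85796.00,
                                 766888.00, 242256.00),
    check.names = FALSE,
    stringsAsFactors = FALSE
)

# Keep feature groups above a hypothetical intensity threshold, highest first.
# Columns with a dash in their name must be accessed with [[ ]] or backticks.
intCol <- tab[["standard-pos_intensity"]]
abundant <- tab[intCol > 1e5, ]
abundant <- abundant[order(abundant[["standard-pos_intensity"]], decreasing = TRUE), ]
print(abundant$group)
```

The same steps should carry over to the real output of as.data.table(), although data.table offers its own, usually more concise, subsetting syntax.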

Finally, the annotatedPeakList() function is useful to inspect annotation results for a formula or compound candidate:

# formula annotations for the first formula candidate of feature group M137_R249_53
annotatedPeakList(formulas, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#>        ID        mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor ion_formula   dbe ion_formula_mz     error neutral_loss annotated
#>     <int>     <num>     <num>                <num>                <num>              <num>              <num>    <lgcl>      <char> <num>          <num>     <num>       <char>    <lgcl>
#>  1:     2  94.06500  9406.110                    1                    3                  1           3.333333     FALSE       C6H8N   3.5       94.06513 1.3000000         CHNO      TRUE
#>  2:     6  98.97521  2212.000                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#>  3:     7 105.06971  1662.111                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#>  4:    14 120.04435  7176.222                    1                    3                  1           3.333333     FALSE      C7H6NO   5.5      120.04439 0.3666667          H3N      TRUE
#>  5:    19 122.07218  2246.000                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#>  6:    21 135.08005  1565.556                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#>  7:    23 137.07040  5348.667                    1                    3                  1           3.333333      TRUE     C7H9N2O   4.5      137.07094 3.2500000                   TRUE
#>  8:    24 137.09570  2026.889                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#>  9:    26 138.09116 12356.667                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
#> 10:    27 139.07501  5020.667                    1                    3                  1           3.333333     FALSE        <NA>    NA             NA        NA         <NA>     FALSE
# compound annotation for first candidate of feature group M137_R249_53
annotatedPeakList(compounds, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#>        ID        mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor ion_formula ion_formula_MF neutral_loss score annotated
#>     <int>     <num>     <num>                <num>                <num>              <num>              <num>    <lgcl>      <char>         <char>       <char> <num>    <lgcl>
#>  1:     2  94.06500  9406.110                    1                    3                  1           3.333333     FALSE       C6H8N   [C6H6N+H]+H+         CHNO   405      TRUE
#>  2:     6  98.97521  2212.000                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  3:     7 105.06971  1662.111                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  4:    14 120.04435  7176.222                    1                    3                  1           3.333333     FALSE      C7H6NO      [C7H6NO]+          H3N   305      TRUE
#>  5:    19 122.07218  2246.000                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  6:    21 135.08005  1565.556                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  7:    23 137.07040  5348.667                    1                    3                  1           3.333333      TRUE        <NA>           <NA>         <NA>    NA     FALSE
#>  8:    24 137.09570  2026.889                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  9:    26 138.09116 12356.667                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#> 10:    27 139.07501  5020.667                    1                    3                  1           3.333333     FALSE        <NA>           <NA>         <NA>    NA     FALSE
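
The annotated column makes it straightforward to summarize how much of a spectrum a candidate explains. The following is a minimal sketch that does not call patRoon: the data.frame is a hand-built stand-in holding five rows copied from the formula annotation output above, and the "explained intensity fraction" is only an illustrative summary, not a score computed by the package.

```r
# Stand-in for part of the annotatedPeakList(formulas, ...) output shown above
peaks <- data.frame(
    ID = c(2, 6, 7, 14, 23),
    mz = c(94.06500, 98.97521, 105.06971, 120.04435, 137.07040),
    intensity = c(9406.110, 2212.000, 1662.111, 7176.222, 5348.667),
    ion_formula = c("C6H8N", NA, NA, "C7H6NO", "C7H9N2O"),
    annotated = c(TRUE, FALSE, FALSE, TRUE, TRUE),
    stringsAsFactors = FALSE
)

# Keep only the peaks that received an annotation
annotatedPeaks <- peaks[peaks$annotated, ]
nAnnotated <- nrow(annotatedPeaks)

# Fraction of the summed peak intensity that is explained by annotated peaks
explainedFraction <- sum(annotatedPeaks$intensity) / sum(peaks$intensity)

print(nAnnotated)
print(round(explainedFraction, 3))
```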

More advanced examples for these functions are shown below.

# Feature table, can also be accessed by numeric index
fList[[1]]
mslists[["standard-pos-1", "M120_R268_30"]] # feature data (instead of feature group averaged)
formulas[[1, "M120_R268_30"]] # feature data (if available, i.e. calculateFeatures=TRUE)
components[["CMP1", 1]] # only for first feature group in component

as.data.frame(fList) # classic data.frame format, works for all objects
as.data.table(fGroups) # return non-averaged intensities (default)
as.data.table(fGroups, features = TRUE) # also include feature information
as.data.table(fGroups, average = "fGroups") # output a simple table with feature group-averaged intensities
as.data.table(fGroups, average = "fGroups",
              features = TRUE) # include averaged/collapsed feature data


as.data.table(mslists, averaged = FALSE) # peak lists for each feature
as.data.table(mslists, fGroups = fGroups) # add feature group information

as.data.table(formulas, countElements = c("C", "H")) # include C/H counts (e.g. for van Krevelen plots)
# add various information for organic matter characterization (common elemental
# counts/ratios, classifications etc)
as.data.table(formulas, OM = TRUE)

as.data.table(compounds, fGroups = fGroups) # add feature group information
as.data.table(compounds, fragments = TRUE) # include information of all annotated fragments

annotatedPeakList(formulas, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, onlyAnnotated = TRUE) # only include annotated peaks
annotatedPeakList(compounds, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, formulas = formulas) # include formula annotations
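
Element counts such as those requested with countElements are typically combined into ratios, for instance the H/C ratio used on one axis of a van Krevelen plot. The sketch below illustrates the arithmetic only and does not call patRoon: countElement() is a hypothetical helper written for this example, it handles only simple formulas without brackets or charges, and the formulas are the neutral formulas from the as.data.table(formulas) output above.

```r
# Hypothetical helper: count how often an element occurs in a simple formula
countElement <- function(formula, element) {
    # split e.g. "C7H8N2O" into the tokens "C7", "H8", "N2", "O"
    tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
    symbols <- sub("[0-9]+$", "", tokens)   # element symbol of each token
    counts <- sub("^[A-Za-z]+", "", tokens) # numeric part, "" if omitted
    counts[counts == ""] <- "1"             # an omitted count means one atom
    sum(as.numeric(counts)[symbols == element])
}

# Neutral formulas copied from the formula candidate table shown earlier
neutralFormulas <- c("C6H5N3", "C7H8N2O", "C9H7NO", "C12H17NO")

ratios <- data.frame(
    neutral_formula = neutralFormulas,
    C = vapply(neutralFormulas, countElement, numeric(1), element = "C"),
    H = vapply(neutralFormulas, countElement, numeric(1), element = "H"),
    stringsAsFactors = FALSE,
    row.names = NULL
)

# H/C ratio per candidate
ratios$HC <- ratios$H / ratios$C
print(ratios)
```

A full van Krevelen plot would also need the O/C ratio, which can be obtained in the same way by counting "O".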