5.1 Inspecting results

Several generic functions can be used to inspect the data stored in a particular workflow object (e.g. features, compounds etc.):

Generic	Classes	Remarks
`length()`	All	Returns the length of the object (e.g. number of features, compounds etc)
`algorithm()`	All	Returns the name of the algorithm used to generate the object.
`groupNames()`	All	Returns all the unique identifiers (or names) of the feature groups for which this object contains results.
`names()`	`featureGroups`, `components`	Returns names of the feature groups (similar to `groupNames()`) or components
`show()`	All	Prints general information.
`"[["` / `"$"` operators	All	Extract general information, see below.
`as.data.table()` / `as.data.frame()`	All	Convert data to a `data.table` or `data.frame`, see below.
`analysisInfo()`, `analyses()`, `replicateGroups()`	`features`, `featureGroups`	Returns the analysis information, analyses or replicate groups for which this object contains data.
`groupInfo()`	`featureGroups`	Returns feature group information (m/z and retention time values).
`screenInfo()`	`featureGroupsScreening`	Returns information on hits from suspect screening.
`componentInfo()`	`components`	Returns information for all components.
`annotatedPeakList()`	`formulas`, `compounds`	Returns a table with annotated mass peaks (see below).

The common R extraction operators "[[" and "$" can be used to obtain data for a particular feature group, analysis etc.:

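# Feature table (only first columns for readability)
# NOTE: the analysis name must match an existing analysis (e.g. "standard-pos-1"), otherwise NULL is returned
fList[["standard-pos-1"]][, 1:6]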

# Feature group intensities
fGroups$M120_R268_30

#> [1] 264836 245372 216560

fGroups[[1, "M120_R268_30"]] # only first analysis

#> [1] 264836

# get the MS/MS peak list (feature group averaged data)
mslists[["M120_R268_30"]]$MSMS

#>        ID       mz  intensity precursor
#>     <int>    <num>      <num>    <lgcl>
#>  1:     5 105.0698   6183.111     FALSE
#>  2:     6 106.0653   7643.556     FALSE
#>  3:     8 107.0728   7760.667     FALSE
#>  4:    15 120.0556 168522.667      TRUE
#>  5:    17 121.0587  13894.667     FALSE
#>  6:    18 121.0884  10032.889     FALSE
#>  7:    19 122.0964 147667.778     FALSE
#>  8:    20 123.0803  36631.111     FALSE
#>  9:    21 123.0996  15482.444     FALSE
#> 10:    22 124.0805  35580.667     FALSE

# get all formula candidates for a feature group
formulas[["M120_R268_30"]][, 1:7]

#>    neutral_formula ion_formula neutralMass ion_formula_mz error   dbe isoScore
#>             <char>      <char>       <num>          <num> <num> <num>    <num>
#> 1:          C6H5N3      C6H6N3    119.0483       120.0556   1.8     6  0.92461

# get all compound candidates for a feature group
compounds[["M120_R268_30"]][, 1:4]

#>     explainedPeaks     score neutralMass                   SMILES
#>              <int>     <num>       <num>                   <char>
#>  1:              0 2.9919045    119.0483        C1=CC2=NNN=C2C=C1
#>  2:              0 1.2504308    119.0483        C1=CNC2=CN=CN=C21
#>  3:              0 1.2336169    119.0483      C1=CC2=C(N=C1)N=CN2
#>  4:              0 1.2079701    119.0483      C1=CC2=C(C=NN2)N=C1
#>  5:              0 1.1511570    119.0483      C1=CN2C(=CC=N2)N=C1
#> ---                                                              
#> 37:              0 0.9541662    119.0483        CC1=CN=C(N=C1)C#N
#> 38:              0 0.9535093    119.0483        CC1=NC(=NC=C1)C#N
#> 39:              0 0.9499092    119.0483        CC1=NN=C(C=C1)C#N
#> 40:              0 0.8128595    119.0483 C1=CC(=[N+]=[N-])C=CC1=N
#> 41:              0 0.7438038    119.0483         C(C#N)C(CC#N)C#N

# get a table with information of a component
components[["CMP7"]][, 1:6]

#>            group     ret       mz isogroup isonr charge
#>           <char>   <num>    <num>    <num> <num>  <num>
#> 1:  M143_R206_64 205.787 143.0700       NA    NA     NA
#> 2: M159_R208_103 208.280 159.0650       NA    NA     NA
#> 3: M161_R208_104 207.582 161.0806       NA    NA     NA
#> 4: M181_R209_159 208.580 181.0469       NA    NA     NA

A more sophisticated way to obtain data from a workflow object is to use as.data.table() or as.data.frame(). These functions convert all information within the object to a single table (data.table or data.frame) and accept various options to add extra information. Because the output is in a common, tidy format, it can be used directly with many other functions within R.

NOTE If you are not familiar with data.table and want to know more, see the data.table documentation. Briefly, it is a more efficient and largely compatible alternative to the regular data.frame.

NOTE The as.data.frame() methods defined in patRoon simply convert the results from as.data.table(). Hence, both functions are used in the same way and are defined for the same object classes.

Some typical examples are shown below.

# obtain table with all features (only first columns for readability)
as.data.table(fList)[, 1:6]

#>             analysis                     ID     ret        mz      area intensity
#>               <char>                 <char>   <num>     <num>     <num>     <num>
#>    1:  solvent-pos-1  f_1111264393868911102  13.176  98.97537 4345232.0    391476
#>    2:  solvent-pos-1  f_2094520183844760528   7.181 100.11197  797112.1    426956
#>    3:  solvent-pos-1  f_7941927790353320269 192.178 100.11211 9609998.0    750532
#>    4:  solvent-pos-1 f_16909335299782523620  19.171 100.11217 5784411.0    370376
#>    5:  solvent-pos-1  f_6221034045034627155   4.786 100.11220  551723.6    567312
#>   ---                                                                            
#> 2922: standard-pos-3 f_11654918892462341096 318.892 425.18866  666531.5    232636
#> 2923: standard-pos-3 f_15952415959529491179   9.114 427.03242  362024.1    114744
#> 2924: standard-pos-3 f_12161494855555353140 318.892 427.18678  200193.5     77768
#> 2925: standard-pos-3 f_13430643112055510340 382.682 432.23984  217612.9     97648
#> 2926: standard-pos-3 f_12072347617435122911   9.114 433.00457 3086864.0    912920

# Returns group info and intensity values for each feature group
as.data.table(fGroups, average = TRUE) # average intensities for replicates

#>              group      ret       mz standard-pos
#>             <char>    <num>    <num>        <num>
#>   1:  M109_R192_20 191.8717 109.0759    183482.67
#>   2:  M111_R330_23 330.4078 111.0439     84598.67
#>   3:  M114_R269_25 268.6906 114.0912     85796.00
#>   4:  M116_R317_29 316.7334 116.0527    766888.00
#>   5:  M120_R268_30 268.4078 120.0554    242256.00
#>  ---                                             
#> 137: M316_R363_635 363.4879 316.1741     89904.00
#> 138: M318_R349_638 349.1072 318.1450     83320.00
#> 139: M352_R335_664 334.9403 352.2019     74986.67
#> 140: M407_R239_672 239.3567 407.2227    186568.00
#> 141: M425_R319_676 319.4944 425.1885    214990.67
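
To make the effect of average = TRUE more concrete, the sketch below recomputes the averaged intensity of M120_R268_30 from the three per-analysis intensities printed earlier in this section. It uses base R on hand-copied values and does not call patRoon. Two things are assumptions here: that all three analyses belong to the standard-pos replicate group, and that the arithmetic mean is used for averaging (the numbers shown in this section are consistent with that).

```r
# Intensities of feature group M120_R268_30 in the three analyses
# (copied by hand from the fGroups$M120_R268_30 output shown earlier)
intensities <- c(264836, 245372, 216560)

# ASSUMPTION: all three analyses are replicates of the "standard-pos" group
replicates <- c("standard-pos", "standard-pos", "standard-pos")

# Average the intensities per replicate group (arithmetic mean assumed)
averaged <- tapply(intensities, replicates, mean)
print(averaged)
```

The result matches the standard-pos value reported for M120_R268_30 in the averaged table above.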

# As above, but with suspect matches on separate rows and additional screening information
# (select some columns to simplify the output below)
as.data.table(fGroupsSusp, average = TRUE, collapseSuspects = NULL,
              onlyHits = TRUE)[, c("group", "susp_name", "susp_compRank", "susp_annSimBoth", "susp_estIDLevel")]

#>           group          susp_name susp_compRank susp_annSimBoth susp_estIDLevel
#>          <char>             <char>         <int>           <num>          <char>
#> 1: M120_R268_30   1H-benzotriazole             1       0.0000000              4b
#> 2: M137_R249_53      N-Phenyl urea             1       0.6443557              3a
#> 3: M146_R309_68 2-Hydroxyquinoline             2       0.9896892              3a
#> 4: M146_R248_69 2-Hydroxyquinoline            NA              NA               5
#> 5: M146_R225_70 2-Hydroxyquinoline            NA              NA               5

# Returns all peak lists for each feature group
as.data.table(mslists)

#>              group   type    ID       mz  intensity precursor
#>             <char> <char> <int>    <num>      <num>    <lgcl>
#>   1:  M120_R268_30     MS     1 100.1120 178952.381     FALSE
#>   2:  M120_R268_30     MS     2 102.1277 202359.667     FALSE
#>   3:  M120_R268_30     MS     3 114.0912  37647.548     FALSE
#>   4:  M120_R268_30     MS     4 115.0752  66685.238     FALSE
#>   5:  M120_R268_30     MS     5 120.0554 113335.857      TRUE
#>  ---                                                         
#> 235: M192_R355_191     MS    51 299.1274  44083.126     FALSE
#> 236: M192_R355_191     MS    52 299.1471   7390.267     FALSE
#> 237: M192_R355_191   MSMS    14 119.0496 588372.444     FALSE
#> 238: M192_R355_191   MSMS    18 120.0524  70273.333     FALSE
#> 239: M192_R355_191   MSMS    31 192.1384  71978.667      TRUE

# Returns all formula candidates for each feature group with scoring
# information, neutral loss etc
as.data.table(formulas)[, 1:6]

#>            group neutral_formula ion_formula neutralMass ion_formula_mz      error
#>           <char>          <char>      <char>       <num>          <num>      <num>
#> 1:  M120_R268_30          C6H5N3      C6H6N3    119.0483       120.0556 1.80000000
#> 2:  M137_R249_53         C7H8N2O     C7H9N2O    136.0637       137.0709 2.90000000
#> 3:  M146_R309_68          C9H7NO      C9H8NO    145.0528       146.0600 1.66666667
#> 4: M192_R355_191        C12H17NO    C12H18NO    191.1310       192.1383 0.03333333

# Returns all compound candidates for each feature group with scoring and other metadata
as.data.table(compounds)[, 1:4]

#>              group explainedPeaks    score neutralMass
#>             <char>          <int>    <num>       <num>
#>   1:  M120_R268_30              0 2.991905    119.0483
#>   2:  M120_R268_30              0 1.250431    119.0483
#>   3:  M120_R268_30              0 1.233617    119.0483
#>   4:  M120_R268_30              0 1.207970    119.0483
#>   5:  M120_R268_30              0 1.151157    119.0483
#>  ---                                                  
#> 288: M192_R355_191              1 1.367332    191.1310
#> 289: M192_R355_191              1 1.367220    191.1310
#> 290: M192_R355_191              1 1.366424    191.1310
#> 291: M192_R355_191              1 1.364403    191.1310
#> 292: M192_R355_191              1 1.363116    191.1310
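
Because the table holds one row per candidate, per-group summaries are straightforward with regular R functions. The following is a minimal sketch in base R. It works on a small, hand-made data.frame (named cands here for illustration) that mimics a few rows and columns of the output above, instead of on a real compounds object, and it assumes that a higher score means a better candidate.

```r
# Hypothetical stand-in for as.data.frame(compounds)[, 1:4], using a few rows
# copied from the output above
cands <- data.frame(
    group = c("M120_R268_30", "M120_R268_30", "M120_R268_30",
              "M192_R355_191", "M192_R355_191"),
    explainedPeaks = c(0L, 0L, 0L, 1L, 1L),
    score = c(2.991905, 1.250431, 1.233617, 1.367332, 1.367220),
    neutralMass = c(119.0483, 119.0483, 119.0483, 191.1310, 191.1310),
    stringsAsFactors = FALSE
)

# Number of candidates per feature group (in this small example table)
nCands <- table(cands$group)
print(nCands)

# Best scoring candidate per feature group: sort by group and descending
# score, then keep the first row of each group
ordered <- cands[order(cands$group, -cands$score), ]
best <- ordered[!duplicated(ordered$group), ]
print(best)
```

The same pattern should carry over to a table obtained with as.data.frame(compounds), provided the group and score columns are present as shown above.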

# Returns table with all components (including feature group info, annotations etc)
as.data.table(components)[, 1:6]

#>       name  cmp_ret cmp_retsd       neutral_mass       analysis  size
#>     <char>    <num>     <num>             <char>         <char> <int>
#>  1:   CMP1 347.2914 0.0000000               <NA> standard-pos-2     2
#>  2:   CMP1 347.2914 0.0000000               <NA> standard-pos-2     2
#>  3:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#>  4:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#>  5:   CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3     6
#> ---                                                                  
#> 88:  CMP29 313.3475 0.3105035               <NA> standard-pos-2     3
#> 89:  CMP29 313.3475 0.3105035               <NA> standard-pos-2     3
#> 90:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
#> 91:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
#> 92:  CMP30 268.3430 0.3840764           81.08705 standard-pos-1     3
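
In the output above the component-level columns (e.g. cmp_ret, size) repeat on every row of the same component. If only one row per component is needed after exporting the table, the duplicates can be dropped with base R. The sketch below uses a hand-made data.frame (cmpTab, a hypothetical name) with a few values copied from the output above. It assumes that each component contributes one row per feature group, which is what the size column suggests for the rows shown here.

```r
# Hypothetical stand-in for a subset of as.data.frame(components),
# with values copied from the output above
cmpTab <- data.frame(
    name = c("CMP1", "CMP1", "CMP30", "CMP30", "CMP30"),
    cmp_ret = c(347.2914, 347.2914, 268.3430, 268.3430, 268.3430),
    cmp_retsd = c(0, 0, 0.3840764, 0.3840764, 0.3840764),
    size = c(2L, 2L, 3L, 3L, 3L),
    stringsAsFactors = FALSE
)

# Keep one row per component by dropping repeated component-level columns
cmpSummary <- unique(cmpTab[, c("name", "cmp_ret", "cmp_retsd", "size")])
print(cmpSummary)

# Count the rows of each component, for comparison with the size column
rowsPerCmp <- table(cmpTab$name)
print(rowsPerCmp)
```

Within patRoon itself, componentInfo() from the table at the start of this section is the function for obtaining information on all components.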

Finally, the annotatedPeakList() function is useful to inspect annotation results for a formula or compound candidate:

# formula annotations for the first formula candidate of feature group M137_R249_53
annotatedPeakList(formulas, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)

#>        ID        mz intensity precursor ion_formula   dbe ion_formula_mz error neutral_loss annotated
#>     <int>     <num>     <num>    <lgcl>      <char> <num>          <num> <num>       <char>    <lgcl>
#>  1:     2  94.06500  9406.111     FALSE       C6H8N   3.5       94.06513  1.30         CHNO      TRUE
#>  2:     6  98.97522  2212.000     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#>  3:     7 105.06971  1662.111     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#>  4:    14 120.04434  7176.222     FALSE      C7H6NO   5.5      120.04439  0.40          H3N      TRUE
#>  5:    19 122.07222  2246.000     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#>  6:    21 135.08004  1565.556     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#>  7:    23 137.07039  5348.667      TRUE     C7H9N2O   4.5      137.07094  3.35                   TRUE
#>  8:    24 137.09572  2026.889     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#>  9:    26 138.09116 12356.667     FALSE        <NA>    NA             NA    NA         <NA>     FALSE
#> 10:    27 139.07503  5020.667     FALSE        <NA>    NA             NA    NA         <NA>     FALSE

# compound annotation for first candidate of feature group M137_R249_53
annotatedPeakList(compounds, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)

#>        ID        mz intensity precursor ion_formula ion_formula_MF neutral_loss score annotated
#>     <int>     <num>     <num>    <lgcl>      <char>         <char>       <char> <num>    <lgcl>
#>  1:     2  94.06500  9406.111     FALSE       C6H8N   [C6H6N+H]+H+         CHNO   405      TRUE
#>  2:     6  98.97522  2212.000     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  3:     7 105.06971  1662.111     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  4:    14 120.04434  7176.222     FALSE      C7H6NO      [C7H6NO]+          H3N   305      TRUE
#>  5:    19 122.07222  2246.000     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  6:    21 135.08004  1565.556     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  7:    23 137.07039  5348.667      TRUE        <NA>           <NA>         <NA>    NA     FALSE
#>  8:    24 137.09572  2026.889     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#>  9:    26 138.09116 12356.667     FALSE        <NA>           <NA>         <NA>    NA     FALSE
#> 10:    27 139.07503  5020.667     FALSE        <NA>           <NA>         <NA>    NA     FALSE
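
The annotated column makes it easy to summarize how much of a spectrum is explained by a candidate. The sketch below does this in base R for the formula annotations shown above. The peaks data.frame is a hand-copied stand-in for the annotatedPeakList() output (only four of its columns), and the "explained intensity fraction" is an illustrative metric defined here, not a score calculated by patRoon.

```r
# Hypothetical stand-in for the formula annotation output of
# annotatedPeakList() shown above (columns ID, mz, intensity, annotated)
peaks <- data.frame(
    ID = c(2L, 6L, 7L, 14L, 19L, 21L, 23L, 24L, 26L, 27L),
    mz = c(94.06500, 98.97522, 105.06971, 120.04434, 122.07222,
           135.08004, 137.07039, 137.09572, 138.09116, 139.07503),
    intensity = c(9406.111, 2212.000, 1662.111, 7176.222, 2246.000,
                  1565.556, 5348.667, 2026.889, 12356.667, 5020.667),
    annotated = c(TRUE, FALSE, FALSE, TRUE, FALSE,
                  FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Number of annotated peaks
nAnnotated <- sum(peaks$annotated)

# Fraction of the total intensity carried by annotated peaks
# (illustrative metric, not a patRoon score)
explainedFraction <- sum(peaks$intensity[peaks$annotated]) / sum(peaks$intensity)

# Keep only the annotated peaks
annotatedOnly <- peaks[peaks$annotated, ]

print(nAnnotated)
print(explainedFraction)
print(annotatedOnly)
```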

More advanced examples for these functions are shown below.

# Feature table, can also be accessed by numeric index
fList[[1]]
mslists[["standard-pos-1", "M120_R268_30"]] # feature data (instead of feature group averaged)
formulas[[1, "M120_R268_30"]] # feature data (if available, i.e. calculateFeatures=TRUE)
components[["CMP1", 1]] # only for first feature group in component

as.data.frame(fList) # classic data.frame format, works for all objects
as.data.table(fGroups) # return non-averaged intensities (default)
as.data.table(fGroups, features = TRUE) # include feature information
as.data.table(mslists, averaged = FALSE) # peak lists for each feature
as.data.table(mslists, fGroups = fGroups) # add feature group information

as.data.table(formulas, countElements = c("C", "H")) # include C/H counts (e.g. for van Krevelen plots)
# add various information for organic matter characterization (common elemental
# counts/ratios, classifications etc)
as.data.table(formulas, OM = TRUE)

as.data.table(compounds, fGroups = fGroups) # add feature group information
as.data.table(compounds, fragments = TRUE) # include information of all annotated fragments

annotatedPeakList(formulas, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, onlyAnnotated = TRUE) # only include annotated peaks
annotatedPeakList(compounds, index = 1, groupName = "M120_R268_30",
                  MSPeakLists = mslists, formulas = formulas) # include formula annotations