5.1 Inspecting results
Several generic functions can be used to inspect the data stored in a workflow object (e.g. features, compounds etc):
| Generic | Classes | Remarks |
|---|---|---|
| length() | All | Returns the length of the object (e.g. number of features, compounds etc). |
| algorithm() | All | Returns the name of the algorithm used to generate the object. |
| groupNames() | All | Returns all the unique identifiers (or names) of the feature groups for which this object contains results. |
| names() | featureGroups, components | Returns names of the feature groups (similar to groupNames()) or components. |
| show() | All | Prints general information. |
| "[[" / "$" operators | All | Extract general information, see below. |
| as.data.table() / as.data.frame() | All | Convert data to a data.table or data.frame, see below. |
| analysisInfo(), analyses(), replicates() | features, featureGroups | Returns the analysis information, analyses or replicates for which this object contains data. |
| groupInfo() | featureGroups | Returns feature group information (m/z and retention time values). |
| screenInfo() | featureGroupsScreening | Returns information on hits from suspect screening. |
| componentInfo() | components | Returns information for all components. |
| annotatedPeakList() | formulas, compounds | Returns a table with annotated mass peaks (see below). |
The common R extraction operators "[[" and "$" are used to obtain data for a particular feature group, analysis etc:
#> ID ret mz area intensity retmin
#> <char> <num> <num> <num> <num> <num>
#> 1: f_16347712658720538540 14.139 98.97533 4778875.00 330176 5.945
#> 2: f_10395879499927939455 4.750 98.97542 96640.85 125856 4.351
#> 3: f_453007312256804411 7.144 100.11199 470442.00 283356 5.945
#> 4: f_17185811996419289811 28.127 100.11208 5225358.00 304644 11.141
#> 5: f_6923213961671746702 4.550 100.11208 168080.50 76724 1.961
#> ---
#> 543: f_2689651329902935219 383.363 415.21303 364565.80 135352 381.122
#> 544: f_7335751044087049528 9.143 425.15511 415152.60 121928 8.143
#> 545: f_14461495817845968238 319.805 425.18879 732124.40 210844 317.060
#> 546: f_3994796688206637648 10.142 427.03246 365056.90 114896 8.143
#> 547: f_3073950847809437369 9.143 433.00456 3165097.00 946000 8.143
#> [1] 264836 245372 216560
#> [1] 264836
#> ID mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor
#> <int> <num> <num> <num> <num> <num> <num> <lgcl>
#> 1: 5 105.0698 6183.111 1 3 1 2.666667 FALSE
#> 2: 6 106.0652 7643.556 1 3 1 2.666667 FALSE
#> 3: 8 107.0728 7760.667 1 3 1 2.666667 FALSE
#> 4: 15 120.0556 168522.672 1 3 1 2.666667 TRUE
#> 5: 17 121.0587 13894.667 1 3 1 2.666667 FALSE
#> 6: 18 121.0883 10032.888 1 3 1 2.666667 FALSE
#> 7: 19 122.0964 147667.766 1 3 1 2.666667 FALSE
#> 8: 20 123.0803 36631.109 1 3 1 2.666667 FALSE
#> 9: 21 123.0996 15482.445 1 3 1 2.666667 FALSE
#> 10: 22 124.0806 35580.668 1 3 1 2.666667 FALSE
#> neutral_formula ion_formula neutralMass ion_formula_mz error dbe isoScore
#> <char> <char> <num> <num> <num> <num> <num>
#> 1: C6H5N3 C6H6N3 119.0483 120.0556 1.566667 6 0.92461
#> explainedPeaks score neutralMass SMILES
#> <int> <num> <num> <char>
#> 1: 0 3.0000000 119.0483 C1=CC2=NNN=C2C=C1
#> 2: 0 0.7542007 119.0483 C1=CC2=C(N=C1)N=CN2
#> 3: 0 0.4403258 119.0483 C1=CNC2=CN=CN=C21
#> 4: 0 0.3780081 119.0483 C1=CC2=C(C=NN2)N=C1
#> 5: 0 0.3366106 119.0483 C1=CN2C(=CC=N2)N=C1
#> ---
#> 37: 0 0.1259247 119.0483 C1=CN=CN=C1CC#N
#> 38: 0 0.1258830 119.0483 CC1=CN=CC(=N1)C#N
#> 39: 0 0.1250904 119.0483 CC1=CN=CN=C1C#N
#> 40: 0 0.1250209 119.0483 C#CC1=NC=CN=C1N
#> 41: 0 0.1250000 119.0483 C1=CC(=[N+]=[N-])C=CC1=N
#> group ret mz isogroup isonr charge
#> <char> <num> <num> <num> <num> <num>
#> 1: M143_R206_64 205.787 143.0700 NA NA NA
#> 2: M159_R208_103 208.280 159.0650 NA NA NA
#> 3: M161_R208_104 207.582 161.0806 NA NA NA
#> 4: M181_R209_159 208.580 181.0469 NA NA NA
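As a rough illustration of how these operators index, the sketch below uses plain R stand-ins rather than real patRoon objects: a data.frame of intensities (analyses in rows, feature groups in columns) and a named list of candidate tables. The names fGroupsMock and formulasMock are hypothetical. The values are copied from the output above, and assigning the three intensities to feature group M120_R268_30 is an assumption. Real workflow objects hold far more information, so treat this only as a model of name-based and (analysis, group) style indexing.

```r
# Stand-in for feature group intensities: one row per analysis, one column per group.
# NOTE: a plain data.frame, not a patRoon featureGroups object.
fGroupsMock <- data.frame(
  M120_R268_30 = c(264836, 245372, 216560),
  row.names = c("standard-pos-1", "standard-pos-2", "standard-pos-3")
)

# Stand-in for per-group candidate tables, keyed by feature group name.
formulasMock <- list(
  M120_R268_30 = data.frame(neutral_formula = "C6H5N3",
                            ion_formula = "C6H6N3",
                            neutralMass = 119.0483)
)

allAnalyses <- fGroupsMock$M120_R268_30        # "$": all analyses for one feature group
firstOnly <- fGroupsMock[[1, "M120_R268_30"]]  # "[[": first analysis, one feature group
candidates <- formulasMock[["M120_R268_30"]]   # "[[": table for one feature group

print(allAnalyses)
print(firstOnly)
print(candidates$neutral_formula)
```

Indexing by name rather than by position keeps such code stable when feature groups are filtered or reordered.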
A more sophisticated way to obtain data from a workflow object is to use as.data.table() or as.data.frame(). These functions will convert all information within the object to a table (data.table or data.frame) and allow various options to add extra information. An advantage is that this common data format can be used with many other functions within R. The output is in a tidy format.
NOTE If you are not familiar with data.table and want to know more see data.table. Briefly, this is a more efficient and largely compatible alternative to the regular data.frame.
NOTE The as.data.frame() methods defined in patRoon simply convert the results from as.data.table(), hence, both functions are equal in their usage and are defined for the same object classes.
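Because the result is an ordinary table, standard R tooling applies to it directly. The following is a minimal sketch in base R, assuming a feature table shaped like the converted feature data shown in this section. featTab is a hypothetical hand-made stand-in with a few rows copied from that example output. It is not produced by patRoon here, and the retention times are assumed to be in seconds.

```r
# Hand-made stand-in for a converted feature table (a subset of columns and rows)
featTab <- data.frame(
  analysis = c("solvent-pos-1", "solvent-pos-1", "solvent-pos-1",
               "standard-pos-3", "standard-pos-3"),
  ret = c(13.176, 7.181, 192.178, 318.892, 9.114),
  mz = c(98.97537, 100.11197, 100.11211, 425.18866, 427.03242),
  intensity = c(391476, 426956, 750532, 232636, 114744)
)

# Number of features per analysis
featCounts <- table(featTab$analysis)

# Features eluting after 60 (retention time assumed to be in seconds)
lateFeatures <- subset(featTab, ret > 60)

# Highest feature intensity per analysis
maxInt <- aggregate(intensity ~ analysis, data = featTab, FUN = max)

print(featCounts)
print(lateFeatures)
print(maxInt)
```

The same steps could be written with data.table syntax on the result of as.data.table(); base R is used here only to keep the sketch free of dependencies.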
Some typical examples are shown below.
#> analysis ID ret mz area intensity
#> <char> <char> <num> <num> <num> <num>
#> 1: solvent-pos-1 f_12860273722894428192 13.176 98.97537 4345232.0 391476
#> 2: solvent-pos-1 f_9596132961704643617 7.181 100.11197 797112.1 426956
#> 3: solvent-pos-1 f_6371335420420248621 192.178 100.11211 9609998.0 750532
#> 4: solvent-pos-1 f_6506496206746423615 19.171 100.11217 5784411.0 370376
#> 5: solvent-pos-1 f_5121157124211719533 4.786 100.11220 551723.6 567312
#> ---
#> 2922: standard-pos-3 f_2042036435154018996 318.892 425.18866 666531.5 232636
#> 2923: standard-pos-3 f_10839998681702082513 9.114 427.03242 362024.1 114744
#> 2924: standard-pos-3 f_15164217360460697802 318.892 427.18678 200193.5 77768
#> 2925: standard-pos-3 f_8695446765189507635 382.682 432.23984 217612.9 97648
#> 2926: standard-pos-3 f_4776065115895602396 9.114 433.00457 3086864.0 912920
# Returns group info and intensity values for each feature group
as.data.table(fGroups, average = TRUE) # average intensities for replicates
#> group ret mz standard-pos_intensity
#> <char> <num> <num> <num>
#> 1: M109_R192_20 191.8717 109.0759 183482.67
#> 2: M111_R330_23 330.4078 111.0439 84598.67
#> 3: M114_R269_25 268.6906 114.0912 85796.00
#> 4: M116_R317_29 316.7334 116.0527 766888.00
#> 5: M120_R268_30 268.4078 120.0554 242256.00
#> ---
#> 137: M316_R363_635 363.4879 316.1741 89904.00
#> 138: M318_R349_638 349.1072 318.1450 83320.00
#> 139: M352_R335_664 334.9403 352.2019 74986.67
#> 140: M407_R239_672 239.3567 407.2227 186568.00
#> 141: M425_R319_676 319.4944 425.1885 214990.67
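To make the effect of average = TRUE concrete, the sketch below reproduces one averaged value by hand. It assumes that the three intensities printed earlier in this section (264836, 245372, 216560) are the replicate intensities of feature group M120_R268_30 in the standard-pos replicate, and that averaging is a plain arithmetic mean. Both are assumptions, although the result agrees with the value listed for that group in the table above.

```r
# Intensities of one feature group in three analyses (assumed: M120_R268_30)
intensities <- c(264836, 245372, 216560)

# Replicate assignment of each analysis (assumed: all three belong to "standard-pos")
replicateNames <- c("standard-pos", "standard-pos", "standard-pos")

# Mean intensity per replicate, mimicking one cell of the averaged table
avgInt <- tapply(intensities, replicateNames, mean)
print(avgInt)
```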
# As above, but with suspect matches on separate rows and additional screening information
# (select some columns to simplify the output below)
as.data.table(fGroupsSusp, average = TRUE, collapseSuspects = NULL,
              onlyHits = TRUE)[, c("group", "susp_name", "susp_compRank", "susp_annSimComp", "susp_estIDLevel")]
#> group susp_name susp_compRank susp_annSimComp susp_estIDLevel
#> <char> <char> <int> <num> <char>
#> 1: M120_R268_30 1H-benzotriazole 1 0.0000000 4b
#> 2: M137_R249_53 N-Phenyl urea 1 0.6443557 3a
#> 3: M146_R309_68 2-Hydroxyquinoline 2 0.9896892 3a
#> 4: M146_R248_69 2-Hydroxyquinoline NA NA 5
#> 5: M146_R225_70 2-Hydroxyquinoline NA NA 5
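A table like this is convenient for triaging suspect hits. The small sketch below assumes a hand-made copy (suspTab, a hypothetical name) of some of the columns shown above, and assumes that lower susp_estIDLevel values indicate more confident identifications.

```r
# Hand-made copy of the screening columns shown above
suspTab <- data.frame(
  group = c("M120_R268_30", "M137_R249_53", "M146_R309_68",
            "M146_R248_69", "M146_R225_70"),
  susp_name = c("1H-benzotriazole", "N-Phenyl urea", "2-Hydroxyquinoline",
                "2-Hydroxyquinoline", "2-Hydroxyquinoline"),
  susp_compRank = c(1L, 1L, 2L, NA, NA),
  susp_estIDLevel = c("4b", "3a", "3a", "5", "5")
)

# Numeric part of the level ("3a" -> 3), so that levels can be compared
suspTab$levelNum <- as.integer(substr(suspTab$susp_estIDLevel, 1, 1))

# Hits with an estimated level of 3 or lower
confident <- suspTab[suspTab$levelNum <= 3, c("group", "susp_name", "susp_estIDLevel")]

# Number of feature groups matched to each suspect
groupsPerSuspect <- table(suspTab$susp_name)

print(confident)
print(groupsPerSuspect)
```

Counting groups per suspect highlights cases such as 2-Hydroxyquinoline, where several feature groups match the same suspect and need closer inspection.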
#> group type ID mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor
#> <char> <char> <int> <num> <num> <num> <num> <num> <num> <lgcl>
#> 1: M120_R268_30 MS 1 100.1120 178952.38 1.0000000 3 1.0000000 7.666667 FALSE
#> 2: M120_R268_30 MS 2 102.1277 202359.67 1.0000000 3 1.0000000 7.666667 FALSE
#> 3: M120_R268_30 MS 3 114.0912 37647.55 1.0000000 3 0.5654762 4.333333 FALSE
#> 4: M120_R268_30 MS 4 115.0752 66685.24 1.0000000 3 1.0000000 7.666667 FALSE
#> 5: M120_R268_30 MS 5 120.0555 113335.85 1.0000000 3 1.0000000 7.666667 TRUE
#> ---
#> 236: M192_R355_191 MS 52 298.1328 16943.31 0.6666667 2 0.2545455 2.666667 FALSE
#> 237: M192_R355_191 MS 53 299.1274 45880.92 1.0000000 3 0.4060606 4.333333 FALSE
#> 238: M192_R355_191 MSMS 14 119.0496 588372.44 1.0000000 3 1.0000000 3.000000 FALSE
#> 239: M192_R355_191 MSMS 19 120.0524 70273.34 1.0000000 3 1.0000000 3.000000 FALSE
#> 240: M192_R355_191 MSMS 32 192.1383 71978.66 1.0000000 3 1.0000000 3.000000 TRUE
# Returns all formula candidates for each feature group with scoring
# information, neutral loss etc
as.data.table(formulas)[, 1:6]
#> group neutral_formula ion_formula neutralMass ion_formula_mz error
#> <char> <char> <char> <num> <num> <num>
#> 1: M120_R268_30 C6H5N3 C6H6N3 119.0483 120.0556 1.566667
#> 2: M137_R249_53 C7H8N2O C7H9N2O 136.0637 137.0709 2.400000
#> 3: M146_R309_68 C9H7NO C9H8NO 145.0528 146.0600 1.400000
#> 4: M192_R355_191 C12H17NO C12H18NO 191.1310 192.1383 -1.966667
# Returns all compound candidates for each feature group with scoring and other metadata
as.data.table(compounds)[, 1:4]
#> group explainedPeaks score neutralMass
#> <char> <int> <num> <num>
#> 1: M120_R268_30 0 3.0000000 119.0483
#> 2: M120_R268_30 0 0.7542007 119.0483
#> 3: M120_R268_30 0 0.4403258 119.0483
#> 4: M120_R268_30 0 0.3780081 119.0483
#> 5: M120_R268_30 0 0.3366106 119.0483
#> ---
#> 293: M192_R355_191 1 0.7481879 191.1310
#> 294: M192_R355_191 1 0.7443703 191.1310
#> 295: M192_R355_191 1 0.7442086 191.1310
#> 296: M192_R355_191 1 0.7437880 191.1310
#> 297: M192_R355_191 1 0.7437233 191.1310
# Returns table with all components (including feature group info, annotations etc)
as.data.table(components)[, 1:6]
#> name cmp_ret cmp_retsd neutral_mass analysis size
#> <char> <num> <num> <char> <char> <int>
#> 1: CMP1 347.2914 0.0000000 <NA> standard-pos-2 2
#> 2: CMP1 347.2914 0.0000000 <NA> standard-pos-2 2
#> 3: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> 4: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> 5: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> ---
#> 88: CMP29 313.3475 0.3105035 <NA> standard-pos-2 3
#> 89: CMP29 313.3475 0.3105035 <NA> standard-pos-2 3
#> 90: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
#> 91: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
#> 92: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
Finally, the annotatedPeakList() function is useful to inspect annotation results for a formula or compound candidate:
# formula annotations for the first formula candidate of feature group M137_R249_53
annotatedPeakList(formulas, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#> ID mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor ion_formula dbe ion_formula_mz error neutral_loss annotated
#> <int> <num> <num> <num> <num> <num> <num> <lgcl> <char> <num> <num> <num> <char> <lgcl>
#> 1: 2 94.06500 9406.110 1 3 1 3.333333 FALSE C6H8N 3.5 94.06513 1.3000000 CHNO TRUE
#> 2: 6 98.97521 2212.000 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 3: 7 105.06971 1662.111 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 4: 14 120.04435 7176.222 1 3 1 3.333333 FALSE C7H6NO 5.5 120.04439 0.3666667 H3N TRUE
#> 5: 19 122.07218 2246.000 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 6: 21 135.08005 1565.556 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 7: 23 137.07040 5348.667 1 3 1 3.333333 TRUE C7H9N2O 4.5 137.07094 3.2500000 TRUE
#> 8: 24 137.09570 2026.889 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 9: 26 138.09116 12356.667 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
#> 10: 27 139.07501 5020.667 1 3 1 3.333333 FALSE <NA> NA NA NA <NA> FALSE
# compound annotation for first candidate of feature group M137_R249_53
annotatedPeakList(compounds, index = 1, groupName = "M137_R249_53",
                  MSPeakLists = mslists)
#> ID mz intensity fgroup_abundance_rel fgroup_abundance_abs feat_abundance_rel feat_abundance_abs precursor ion_formula ion_formula_MF neutral_loss score annotated
#> <int> <num> <num> <num> <num> <num> <num> <lgcl> <char> <char> <char> <num> <lgcl>
#> 1: 2 94.06500 9406.110 1 3 1 3.333333 FALSE C6H8N [C6H6N+H]+H+ CHNO 405 TRUE
#> 2: 6 98.97521 2212.000 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 3: 7 105.06971 1662.111 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 4: 14 120.04435 7176.222 1 3 1 3.333333 FALSE C7H6NO [C7H6NO]+ H3N 305 TRUE
#> 5: 19 122.07218 2246.000 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 6: 21 135.08005 1565.556 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 7: 23 137.07040 5348.667 1 3 1 3.333333 TRUE <NA> <NA> <NA> NA FALSE
#> 8: 24 137.09570 2026.889 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 9: 26 138.09116 12356.667 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
#> 10: 27 139.07501 5020.667 1 3 1 3.333333 FALSE <NA> <NA> <NA> NA FALSE
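The annotated column makes it easy to summarise how much of a spectrum a candidate explains. The sketch below is illustrative only: peaks is a hypothetical hand-made copy of the mz, intensity and annotated columns from the formula annotation output above. The "explained intensity fraction" is a summary chosen for this example and is not a score reported by patRoon.

```r
# Hand-made copy of three columns from the formula annotation table above
peaks <- data.frame(
  mz = c(94.06500, 98.97521, 105.06971, 120.04435, 122.07218,
         135.08005, 137.07040, 137.09570, 138.09116, 139.07501),
  intensity = c(9406.110, 2212.000, 1662.111, 7176.222, 2246.000,
                1565.556, 5348.667, 2026.889, 12356.667, 5020.667),
  annotated = c(TRUE, FALSE, FALSE, TRUE, FALSE,
                FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Number of peaks that received an annotation
nAnnotated <- sum(peaks$annotated)

# Share of the summed intensity that is carried by annotated peaks
explainedFrac <- sum(peaks$intensity[peaks$annotated]) / sum(peaks$intensity)

# Keep only the annotated peaks
annotatedOnly <- peaks[peaks$annotated, ]

print(nAnnotated)
print(round(explainedFrac, 3))
print(annotatedOnly)
```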
More advanced examples for these functions are shown below.
# Feature table, can also be accessed by numeric index
fList[[1]]
mslists[["standard-pos-1", "M120_R268_30"]] # feature data (instead of feature group averaged)
formulas[[1, "M120_R268_30"]] # feature data (if available, i.e. calculateFeatures=TRUE)
components[["CMP1", 1]] # only for first feature group in component
as.data.frame(fList) # classic data.frame format, works for all objects
as.data.table(fGroups) # return non-averaged intensities (default)
as.data.table(fGroups, features = TRUE) # also include feature information
as.data.table(fGroups, average = "fGroups") # output a simple table with feature group-averaged intensities
as.data.table(fGroups, average = "fGroups",
features = TRUE) # include averaged/collapsed feature data
as.data.table(mslists, averaged = FALSE) # peak lists for each feature
as.data.table(mslists, fGroups = fGroups) # add feature group information
as.data.table(formulas, countElements = c("C", "H")) # include C/H counts (e.g. for van Krevelen plots)
# add various information for organic matter characterization (common elemental
# counts/ratios, classifications etc)
as.data.table(formulas, OM = TRUE)
as.data.table(compounds, fGroups = fGroups) # add feature group information
as.data.table(compounds, fragments = TRUE) # include information of all annotated fragments
annotatedPeakList(formulas, index = 1, groupName = "M120_R268_30",
MSPeakLists = mslists, onlyAnnotated = TRUE) # only include annotated peaks
annotatedPeakList(compounds, index = 1, groupName = "M120_R268_30",
MSPeakLists = mslists, formulas = formulas) # include formula annotations
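For context on the countElements example: van Krevelen plots use elemental ratios such as H/C and O/C. The sketch below derives these ratios from formula strings with a hypothetical helper, countElement(), written in base R for illustration only. It is not part of patRoon and does not show how patRoon counts elements. It handles only simple formulas in which each element symbol appears once, and it uses the neutral formulas listed earlier in this section.

```r
# Hypothetical helper: count one element in simple formula strings such as "C7H8N2O".
# Only the first occurrence of the element is considered.
countElement <- function(formulas, element) {
  # The lookahead avoids matching e.g. "C" inside "Cl"
  pattern <- paste0(element, "(?![a-z])([0-9]*)")
  vapply(formulas, function(f) {
    m <- regmatches(f, regexpr(pattern, f, perl = TRUE))
    if (length(m) == 0) return(0)            # element absent
    n <- sub(element, "", m, fixed = TRUE)   # strip the symbol, keep the count
    if (n == "") 1 else as.numeric(n)        # no number means a count of one
  }, numeric(1), USE.NAMES = FALSE)
}

neutralFormulas <- c("C6H5N3", "C7H8N2O", "C9H7NO", "C12H17NO")

# Elemental ratios as used on the axes of a van Krevelen plot
hc <- countElement(neutralFormulas, "H") / countElement(neutralFormulas, "C")
oc <- countElement(neutralFormulas, "O") / countElement(neutralFormulas, "C")

print(data.frame(formula = neutralFormulas, HC = round(hc, 3), OC = round(oc, 3)))
```

With real workflow data the element counts would come from the table returned by as.data.table(), so a parser like this is only needed when starting from bare formula strings.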