5.1 Inspecting results
Several generic functions can be used to inspect the data stored in a particular object (e.g. features, compounds etc.):
Generic | Classes | Remarks |
---|---|---|
length() | All | Returns the length of the object (e.g. number of features, compounds etc.). |
algorithm() | All | Returns the name of the algorithm used to generate the object. |
groupNames() | All | Returns all the unique identifiers (or names) of the feature groups for which this object contains results. |
names() | featureGroups, components | Returns the names of the feature groups (similar to groupNames()) or components. |
show() | All | Prints general information. |
"[[" / "$" operators | All | Extract general information, see below. |
as.data.table() / as.data.frame() | All | Convert data to a data.table or data.frame, see below. |
analysisInfo(), analyses(), replicateGroups() | features, featureGroups | Returns the analysis information, analyses or replicate groups for which this object contains data. |
groupInfo() | featureGroups | Returns feature group information (m/z and retention time values). |
screenInfo() | featureGroupsScreening | Returns information on hits from suspect screening. |
componentInfo() | components | Returns information for all components. |
annotatedPeakList() | formulas, compounds | Returns a table with annotated mass peaks (see below). |
The common R extraction operators "[[" and "$" can be used to obtain data for a particular feature group, analysis etc.:
#> NULL
#> [1] 264836 245372 216560
#> [1] 264836
#> ID mz intensity precursor
#> <int> <num> <num> <lgcl>
#> 1: 5 105.0698 6183.111 FALSE
#> 2: 6 106.0653 7643.556 FALSE
#> 3: 8 107.0728 7760.667 FALSE
#> 4: 15 120.0556 168522.667 TRUE
#> 5: 17 121.0587 13894.667 FALSE
#> 6: 18 121.0884 10032.889 FALSE
#> 7: 19 122.0964 147667.778 FALSE
#> 8: 20 123.0803 36631.111 FALSE
#> 9: 21 123.0996 15482.444 FALSE
#> 10: 22 124.0805 35580.667 FALSE
#> neutral_formula ion_formula neutralMass ion_formula_mz error dbe isoScore
#> <char> <char> <num> <num> <num> <num> <num>
#> 1: C6H5N3 C6H6N3 119.0483 120.0556 1.8 6 0.92461
#> explainedPeaks score neutralMass SMILES
#> <int> <num> <num> <char>
#> 1: 0 2.7688810 119.0483 C1=CC2=NNN=C2C=C1
#> 2: 0 1.0370994 119.0483 C1=CC2=C(N=C1)N=CN2
#> 3: 0 1.0280899 119.0483 C1=CNC2=CN=CN=C21
#> 4: 0 1.0013143 119.0483 C1=CN=CC(=C1N)C#N
#> 5: 0 0.9859441 119.0483 C1=CC(=NC=C1N)C#N
#> ---
#> 33: 0 0.7395368 119.0483 C1=CC2=NC=NN2C=C1
#> 34: 0 0.7374733 119.0483 C1=CN2C=NC=C2N=C1
#> 35: 0 0.7353883 119.0483 C1=CNC2=C1C=CN=N2
#> 36: 0 0.5346503 119.0483 C1=CC(=[N+]=[N-])C=CC1=N
#> 37: 0 0.3951816 119.0483 C(C#N)C(CC#N)C#N
#> group ret mz isogroup isonr charge
#> <char> <num> <num> <num> <num> <num>
#> 1: M143_R206_64 205.787 143.0700 NA NA NA
#> 2: M159_R208_103 208.280 159.0650 NA NA NA
#> 3: M161_R208_104 207.582 161.0806 NA NA NA
#> 4: M181_R209_159 208.580 181.0469 NA NA NA
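The mechanism behind such operators can be illustrated with a toy S4 class. This is only a minimal sketch and not patRoon code: the class toyGroups and its slot are hypothetical, and the example merely shows that "[[" and "$" are ordinary generics for which a class can define methods. The intensity values mirror the example output above.

```r
library(methods)

# Hypothetical toy container; NOT a patRoon class
setClass("toyGroups", slots = c(intensities = "list"))

# obj[["name"]] returns the element stored under that name
setMethod("[[", c("toyGroups", "character", "missing"),
          function(x, i, j, ...) x@intensities[[i]])

# obj$name is defined as a shortcut for obj[["name"]]
setMethod("$", "toyGroups", function(x, name) x[[name]])

tg <- new("toyGroups", intensities = list(
    M120_R268_30 = c(264836, 245372, 216560)
))

tg[["M120_R268_30"]]   # all intensities stored for this (toy) feature group
tg$M120_R268_30        # same result via the "$" operator
tg$M120_R268_30[1]     # regular subsetting applies to the returned vector
```

Because both operators return plain R data, the result can be passed directly to other functions.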
A more sophisticated way to obtain data from a workflow object is to use as.data.table() or as.data.frame(). These functions convert all information within the object to a table (data.table or data.frame) and offer various options to add extra information. An advantage is that this common data format can be used with many other functions within R. The output is in a tidy format.
NOTE If you are not familiar with data.table and want to know more, see the data.table documentation. Briefly, this is a more efficient and largely compatible alternative to the regular data.frame.
NOTE The as.data.frame() methods defined in patRoon simply convert the results from as.data.table(); hence, both functions are equal in their usage and are defined for the same object classes.
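For readers new to data.table, the following minimal sketch contrasts its syntax with base R data.frame syntax. The table is a hand-made mock (a few values copied from the example output in this section), not the result of a patRoon call, and the intensity threshold is arbitrary.

```r
library(data.table)

# Mock feature table; not generated by patRoon
dt <- data.table(
    analysis = c("solvent-pos-1", "solvent-pos-1", "standard-pos-3"),
    ret = c(13.176, 7.181, 318.892),
    mz = c(98.97537, 100.11197, 425.18866),
    intensity = c(391476, 426956, 232636)
)

# data.table syntax: filter rows and select columns in a single call
strong <- dt[intensity > 3e5, .(analysis, mz)]

# The same table as a regular data.frame, queried with base R syntax
df <- as.data.frame(dt)
strongDF <- df[df$intensity > 3e5, c("analysis", "mz")]

# Both approaches select the same rows
all(strong$mz == strongDF$mz)
```

Which of the two formats to use is mostly a matter of preference, since the content of the tables is the same.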
Some typical examples are shown below.
#> analysis ID ret mz area intensity
#> <char> <char> <num> <num> <num> <num>
#> 1: solvent-pos-1 f_12597898848900737428 13.176 98.97537 4345232.0 391476
#> 2: solvent-pos-1 f_669124120382005239 7.181 100.11197 797112.1 426956
#> 3: solvent-pos-1 f_5360180117729232330 192.178 100.11211 9609998.0 750532
#> 4: solvent-pos-1 f_15509401462529968951 19.171 100.11217 5784411.0 370376
#> 5: solvent-pos-1 f_13135721406520856456 4.786 100.11220 551723.6 567312
#> ---
#> 2922: standard-pos-3 f_13073238589963605900 318.892 425.18866 666531.5 232636
#> 2923: standard-pos-3 f_1101083009791788313 9.114 427.03242 362024.1 114744
#> 2924: standard-pos-3 f_7694769919948470817 318.892 427.18678 200193.5 77768
#> 2925: standard-pos-3 f_7826632710829317400 382.682 432.23984 217612.9 97648
#> 2926: standard-pos-3 f_152174378102100576 9.114 433.00457 3086864.0 912920
# Returns group info and intensity values for each feature group
as.data.table(fGroups, average = TRUE) # average intensities for replicates
#> group ret mz standard-pos
#> <char> <num> <num> <num>
#> 1: M109_R192_20 191.8717 109.0759 183482.67
#> 2: M111_R330_23 330.4078 111.0439 84598.67
#> 3: M114_R269_25 268.6906 114.0912 85796.00
#> 4: M116_R317_29 316.7334 116.0527 766888.00
#> 5: M120_R268_30 268.4078 120.0554 242256.00
#> ---
#> 137: M316_R363_635 363.4879 316.1741 89904.00
#> 138: M318_R349_638 349.1072 318.1450 83320.00
#> 139: M352_R335_664 334.9403 352.2019 74986.67
#> 140: M407_R239_672 239.3567 407.2227 186568.00
#> 141: M425_R319_676 319.4944 425.1885 214990.67
# As above, but with suspect matches on separate rows and additional screening information
# (select some columns to simplify the output below)
as.data.table(fGroupsSusp, average = TRUE, collapseSuspects = NULL,
onlyHits = TRUE)[, c("group", "susp_name", "susp_compRank", "susp_annSimBoth", "susp_estIDLevel")]
#> group susp_name susp_compRank susp_annSimBoth susp_estIDLevel
#> <char> <char> <int> <num> <char>
#> 1: M120_R268_30 1H-benzotriazole 1 0.0000000 4b
#> 2: M137_R249_53 N-Phenyl urea 1 0.6443557 5
#> 3: M146_R309_68 2-Hydroxyquinoline 2 0.9896892 3c
#> 4: M146_R248_69 2-Hydroxyquinoline NA NA 5
#> 5: M146_R225_70 2-Hydroxyquinoline NA NA 5
#> group type ID mz intensity precursor
#> <char> <char> <int> <num> <num> <lgcl>
#> 1: M120_R268_30 MS 1 100.1120 178952.381 FALSE
#> 2: M120_R268_30 MS 2 102.1277 202359.667 FALSE
#> 3: M120_R268_30 MS 3 114.0912 37647.548 FALSE
#> 4: M120_R268_30 MS 4 115.0752 66685.238 FALSE
#> 5: M120_R268_30 MS 5 120.0554 113335.857 TRUE
#> ---
#> 235: M192_R355_191 MS 51 299.1274 44083.126 FALSE
#> 236: M192_R355_191 MS 52 299.1471 7390.267 FALSE
#> 237: M192_R355_191 MSMS 14 119.0496 588372.444 FALSE
#> 238: M192_R355_191 MSMS 18 120.0524 70273.333 FALSE
#> 239: M192_R355_191 MSMS 31 192.1384 71978.667 TRUE
# Returns all formula candidates for each feature group with scoring
# information, neutral loss etc
as.data.table(formulas)[, 1:6]
#> group neutral_formula ion_formula neutralMass ion_formula_mz error
#> <char> <char> <char> <num> <num> <num>
#> 1: M120_R268_30 C6H5N3 C6H6N3 119.0483 120.0556 1.80000000
#> 2: M137_R249_53 C7H8N2O C7H9N2O 136.0637 137.0709 2.90000000
#> 3: M146_R309_68 C9H7NO C9H8NO 145.0528 146.0600 1.66666667
#> 4: M192_R355_191 C12H17NO C12H18NO 191.1310 192.1383 0.03333333
# Returns all compound candidates for each feature group with scoring and other metadata
as.data.table(compounds)[, 1:4]
#> group explainedPeaks score neutralMass
#> <char> <int> <num> <num>
#> 1: M120_R268_30 0 2.7688810 119.0483
#> 2: M120_R268_30 0 1.0370994 119.0483
#> 3: M120_R268_30 0 1.0280899 119.0483
#> 4: M120_R268_30 0 1.0013143 119.0483
#> 5: M120_R268_30 0 0.9859441 119.0483
#> ---
#> 275: M192_R355_191 0 0.8852525 191.1310
#> 276: M192_R355_191 0 0.8822446 191.1310
#> 277: M192_R355_191 1 0.8814458 191.1310
#> 278: M192_R355_191 0 0.8584452 191.1310
#> 279: M192_R355_191 1 0.7970401 191.1310
# Returns table with all components (including feature group info, annotations etc)
as.data.table(components)[, 1:6]
#> name cmp_ret cmp_retsd neutral_mass analysis size
#> <char> <num> <num> <char> <char> <int>
#> 1: CMP1 347.2914 0.0000000 <NA> standard-pos-2 2
#> 2: CMP1 347.2914 0.0000000 <NA> standard-pos-2 2
#> 3: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> 4: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> 5: CMP2 349.6328 4.6804985 225.1589/188.20157 standard-pos-3 6
#> ---
#> 88: CMP29 313.3475 0.3105035 <NA> standard-pos-2 3
#> 89: CMP29 313.3475 0.3105035 <NA> standard-pos-2 3
#> 90: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
#> 91: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
#> 92: CMP30 268.3430 0.3840764 81.08705 standard-pos-1 3
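Since these tables are regular data.table objects, they can be summarised with standard data.table syntax. The sketch below assumes a table shaped like the compounds output above; it uses a small hand-made mock (a few rows copied from that output) rather than a real patRoon object.

```r
library(data.table)

# Mock of a compounds table; only a few rows, not generated by patRoon
cTab <- data.table(
    group = rep(c("M120_R268_30", "M192_R355_191"), each = 3),
    explainedPeaks = c(0L, 0L, 0L, 0L, 0L, 1L),
    score = c(2.7688810, 1.0370994, 1.0280899,
              0.8852525, 0.8822446, 0.8814458)
)

# Number of candidates and best score per feature group
summ <- cTab[, .(candidates = .N, bestScore = max(score)), by = group]

# Keep only the top-scoring candidate of each feature group
top <- cTab[, .SD[which.max(score)], by = group]

summ
top
```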
Finally, the annotatedPeakList() function is useful to inspect annotation results for a formula or compound candidate:
# formula annotations for the first formula candidate of feature group M137_R249_53
annotatedPeakList(formulas, index = 1, groupName = "M137_R249_53",
MSPeakLists = mslists)
#> ID mz intensity precursor ion_formula dbe ion_formula_mz error neutral_loss annotated
#> <int> <num> <num> <lgcl> <char> <num> <num> <num> <char> <lgcl>
#> 1: 2 94.06500 9406.111 FALSE C6H8N 3.5 94.06513 1.30 CHNO TRUE
#> 2: 6 98.97522 2212.000 FALSE <NA> NA NA NA <NA> FALSE
#> 3: 7 105.06971 1662.111 FALSE <NA> NA NA NA <NA> FALSE
#> 4: 14 120.04434 7176.222 FALSE C7H6NO 5.5 120.04439 0.40 H3N TRUE
#> 5: 19 122.07222 2246.000 FALSE <NA> NA NA NA <NA> FALSE
#> 6: 21 135.08004 1565.556 FALSE <NA> NA NA NA <NA> FALSE
#> 7: 23 137.07039 5348.667 TRUE C7H9N2O 4.5 137.07094 3.35 TRUE
#> 8: 24 137.09572 2026.889 FALSE <NA> NA NA NA <NA> FALSE
#> 9: 26 138.09116 12356.667 FALSE <NA> NA NA NA <NA> FALSE
#> 10: 27 139.07503 5020.667 FALSE <NA> NA NA NA <NA> FALSE
# compound annotation for first candidate of feature group M137_R249_53
annotatedPeakList(compounds, index = 1, groupName = "M137_R249_53",
MSPeakLists = mslists)
#> ID mz intensity precursor ion_formula ion_formula_MF neutral_loss score annotated
#> <int> <num> <num> <lgcl> <char> <char> <char> <num> <lgcl>
#> 1: 2 94.06500 9406.111 FALSE C6H8N [C6H6N+H]+H+ CHNO 405 TRUE
#> 2: 6 98.97522 2212.000 FALSE <NA> <NA> <NA> NA FALSE
#> 3: 7 105.06971 1662.111 FALSE <NA> <NA> <NA> NA FALSE
#> 4: 14 120.04434 7176.222 FALSE C7H6NO [C7H6NO]+ H3N 305 TRUE
#> 5: 19 122.07222 2246.000 FALSE <NA> <NA> <NA> NA FALSE
#> 6: 21 135.08004 1565.556 FALSE <NA> <NA> <NA> NA FALSE
#> 7: 23 137.07039 5348.667 TRUE <NA> <NA> <NA> NA FALSE
#> 8: 24 137.09572 2026.889 FALSE <NA> <NA> <NA> NA FALSE
#> 9: 26 138.09116 12356.667 FALSE <NA> <NA> <NA> NA FALSE
#> 10: 27 139.07503 5020.667 FALSE <NA> <NA> <NA> NA FALSE
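An annotated peak list can be condensed into a few summary numbers, such as how many peaks were annotated and which share of the total intensity they represent. The following is a minimal sketch on a mock table that copies the ID, mz, intensity and annotated columns from the formula annotation output above; it does not call patRoon itself.

```r
library(data.table)

# Mock annotated peak list; values copied from the formula example above
apl <- data.table(
    ID = c(2L, 6L, 7L, 14L, 19L, 21L, 23L, 24L, 26L, 27L),
    mz = c(94.06500, 98.97522, 105.06971, 120.04434, 122.07222,
           135.08004, 137.07039, 137.09572, 138.09116, 139.07503),
    intensity = c(9406.111, 2212.000, 1662.111, 7176.222, 2246.000,
                  1565.556, 5348.667, 2026.889, 12356.667, 5020.667),
    annotated = c(TRUE, FALSE, FALSE, TRUE, FALSE,
                  FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Number of annotated peaks
nAnnotated <- apl[annotated == TRUE, .N]

# Share of the total intensity that is explained by annotated peaks
fracIntensity <- apl[, sum(intensity[annotated]) / sum(intensity)]

# Only the annotated peaks
onlyAnn <- apl[annotated == TRUE]
```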
More advanced examples for these functions are shown below.
# Feature table, can also be accessed by numeric index
fList[[1]]
mslists[["standard-pos-1", "M120_R268_30"]] # feature data (instead of feature group averaged)
formulas[[1, "M120_R268_30"]] # feature data (if available, i.e. calculateFeatures=TRUE)
components[["CMP1", 1]] # only for first feature group in component
as.data.frame(fList) # classic data.frame format, works for all objects
as.data.table(fGroups) # return non-averaged intensities (default)
as.data.table(fGroups, features = TRUE) # include feature information
as.data.table(mslists, averaged = FALSE) # peak lists for each feature
as.data.table(mslists, fGroups = fGroups) # add feature group information
as.data.table(formulas, countElements = c("C", "H")) # include C/H counts (e.g. for van Krevelen plots)
# add various information for organic matter characterization (common elemental
# counts/ratios, classifications etc)
as.data.table(formulas, OM = TRUE)
as.data.table(compounds, fGroups = fGroups) # add feature group information
as.data.table(compounds, fragments = TRUE) # include information of all annotated fragments
annotatedPeakList(formulas, index = 1, groupName = "M120_R268_30",
MSPeakLists = mslists, onlyAnnotated = TRUE) # only include annotated peaks
annotatedPeakList(compounds, index = 1, groupName = "M120_R268_30",
MSPeakLists = mslists, formulas = formulas) # include formula annotations
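The countElements example above mentions van Krevelen plots, which are based on elemental ratios such as H/C and O/C. To illustrate what such counts and ratios look like, the sketch below derives them from formula strings with a small regular expression helper. The helper countElement() is hypothetical and is not how patRoon computes these values; it only handles simple formulas in which each element occurs once. The formulas are taken from the formula candidates shown earlier.

```r
library(data.table)

# Hypothetical helper (NOT part of patRoon): count one element in simple
# formula strings such as "C7H8N2O"
countElement <- function(form, element) {
    # element symbol, not followed by a lowercase letter, then optional count
    pattern <- paste0(element, "(?![a-z])([0-9]*)")
    vapply(form, function(f) {
        m <- regmatches(f, regexec(pattern, f, perl = TRUE))[[1]]
        if (length(m) == 0) return(0)          # element is absent
        if (m[2] == "") 1 else as.numeric(m[2]) # no number means one atom
    }, numeric(1), USE.NAMES = FALSE)
}

vk <- data.table(neutral_formula = c("C6H5N3", "C7H8N2O", "C9H7NO", "C12H17NO"))
vk[, C := countElement(neutral_formula, "C")]
vk[, H := countElement(neutral_formula, "H")]
vk[, O := countElement(neutral_formula, "O")]

# Ratios typically plotted in a van Krevelen diagram
vk[, HC := H / C]
vk[, OC := O / C]
vk
```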