6.4 Processing data
All data objects generated during a sets workflow inherit from the classes of a ‘regular’ workflow. This means that, with some minor exceptions, all of the data processing functionality discussed in the previous chapter (e.g. subsetting, inspection, filtering, plotting, reporting) is also applicable to a sets workflow. In addition, data from sets workflows provide some extra data processing functionality. Some examples:
# only keep feature groups that have positive data
fGroupsPos <- fGroups[, sets = "positive"]
# only keep feature groups with features present in all sets
fGroupsF <- filter(fGroups, relMinSets = 1)
#> Applying minimum sets filter... Done! Filtered 3407 (88.84%) features and 809 (94.07%) feature groups. Remaining: 428 features in 51 groups.
# In sets workflows, the m/z values of features are 'neutralized'; the `ion_mz` columns contain the original 'ionized' m/z values.
as.data.table(fGroups)[1:5, c("group", "mz", "ion_mz-positive", "ion_mz-negative")]
#> group mz ion_mz-positive ion_mz-negative
#> <char> <num> <num> <num>
#> 1: M98_R7_1 97.96702 NA 96.95974
#> 2: M98_R30_2 97.96708 NA 96.95981
#> 3: M98_R10_3 97.96709 NA 96.95981
#> 4: M98_R5_4 97.96769 98.97497 96.96042
#> 5: M98_R14_5 97.96787 98.97515 96.96060
# Inspect set-specific data.
as.data.table(compounds)[1:5, c("group", "score-positive", "score-negative", "compoundName", "set")]
#> group score-positive score-negative compoundName set
#> <char> <num> <num> <char> <char>
#> 1: M198_R317_273 3.5190115 4.569478 3-(4-chlorophenyl)-1,1-dimethylurea positive,negative
#> 2: M198_R317_273 2.5198763 1.563191 5-[[(2R)-azetidin-2-yl]methoxy]-2-chloropyridine positive,negative
#> 3: M198_R317_273 1.2528529 1.350556 1-(3-chloro-4-methylphenyl)-3-methylurea positive,negative
#> 4: M198_R317_273 1.1469202 1.276057 3-(3-chlorophenyl)-1,1-dimethylurea positive,negative
#> 5: M198_R317_273 0.9981602 1.127297 1-(4-chlorophenyl)-3-ethylurea positive,negative
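The relation between the neutralized `mz` column and the `ion_mz` columns shown above can be illustrated with plain arithmetic. The following is a minimal sketch in base R, not the package's own implementation: it assumes that the positive features were measured as [M+H]+ and the negative features as [M-H]-, and the helper `neutralize()` is a hypothetical name introduced only for this illustration. The input values are copied from the `M98_R5_4` row of the table above.

```r
# Mass of a proton (H+) in Da; singly charged ions are assumed throughout.
protonMass <- 1.007276

# Ionized m/z values copied from the table above (feature group M98_R5_4).
ionMzPos <- 98.97497  # assumption: measured as [M+H]+
ionMzNeg <- 96.96042  # assumption: measured as [M-H]-

# Hypothetical helper (not part of any package): convert an ionized m/z
# to a neutral mass by removing or adding back one proton.
neutralize <- function(ionMz, polarity) {
    switch(polarity,
           positive = ionMz - protonMass,
           negative = ionMz + protonMass,
           stop("unknown polarity: ", polarity))
}

neutralPos <- neutralize(ionMzPos, "positive")
neutralNeg <- neutralize(ionMzNeg, "negative")

# Both results should be close to the tabulated neutral mz of this group.
print(c(positive = neutralPos, negative = neutralNeg))
```

Under these assumptions both polarities yield nearly the same neutral mass, which is why features from different sets can be placed in a single feature group.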
In sets workflows the analysis information is amended with a set column to specify the set each analysis belongs to. Just like other columns in the analysis information, the set column can be used to group and aggregate data:
# only keep feature groups with features present in both polarities
fGroupsPosNeg <- overlap(fGroups, which = c("positive", "negative"), aggregate = "set")
# only keep feature groups with features that are present only in positive mode
fGroupsOnlyPos <- unique(fGroups, which = "positive", aggregate = "set")
plotVenn(fGroups, aggregate = "set", margin = 0.1) # compare positive/negative features
plotChord(fGroups, aggregate = TRUE, groupBy = "set") # compare replicate aggregated positive/negative features
# plot annotated positive/negative mirror spectrum
plotSpectrum(compounds, index = 1, groupName = "M198_R317_273", MSPeakLists = mslists,
             plotStruct = TRUE)
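To make the idea of aggregating by the set column more concrete, the sketch below mimics it on toy data in base R. It is only a conceptual illustration and does not reflect how the `overlap()` and `unique()` methods are implemented internally: the analysis names, group names and intensities are all made up, and "present" is assumed to mean a non-zero intensity in at least one analysis of a set.

```r
# Toy analysis information with a set column (all names are made up).
anaInfo <- data.frame(
    analysis = c("pos-1", "pos-2", "neg-1", "neg-2"),
    set = c("positive", "positive", "negative", "negative")
)

# Toy intensity matrix: rows are feature groups, columns are analyses.
ints <- matrix(
    c(100, 120,  80,  90,   # grpA: found in both sets
      200,   0,   0,   0,   # grpB: found in positive only
        0,   0,  50,  60),  # grpC: found in negative only
    nrow = 3, byrow = TRUE,
    dimnames = list(c("grpA", "grpB", "grpC"), anaInfo$analysis)
)

# Aggregate by set: a group counts as present in a set if any analysis
# belonging to that set has a non-zero intensity.
presentInSet <- sapply(unique(anaInfo$set), function(s) {
    rowSums(ints[, anaInfo$set == s, drop = FALSE] > 0) > 0
})

# Groups present in both sets (the idea behind an overlap).
inBoth <- rownames(presentInSet)[presentInSet[, "positive"] & presentInSet[, "negative"]]
# Groups present only in the positive set (the idea behind a unique selection).
onlyPos <- rownames(presentInSet)[presentInSet[, "positive"] & !presentInSet[, "negative"]]

print(inBoth)
print(onlyPos)
```

Because the grouping is driven entirely by a column of the analysis information, the same pattern would apply to any other column used for aggregation.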
The reference manual for the workflow objects contains specific notes applicable to sets workflows (e.g. ?featureGroups, ?compounds).