5.4 Deleting data

The delete() generic function can be used to manually delete workflow data. This function is used internally within patRoon to implement filtering and subsetting operations, but may also be useful for advanced data processing.

Similar to the subset operator, this function accepts the i and j parameters, plus an additional k parameter, to specify which data should be operated on:

| Class | Argument i | Argument j | Argument k |
|---|---|---|---|
| features | analysis | feature index | - |
| featureGroups | analysis | feature group | - |
| featureGroupsScreening | analysis | feature group | suspect (can be NA, i must be NULL) |
| MSPeakLists | feature group | mass peak | analysis (if set, no deletion occurs on group-averaged peak lists) |
| formulas, compounds | feature group | candidate index | - |
| components | component | feature group | - |

If i, j and/or k is not specified (NULL), data is removed for the complete selection. For instance, i = NULL combined with j = 1:10 deletes the first ten features in all analyses. Some examples are shown below:

# delete 2nd feature in analysis-1
fList <- delete(fList, i = "analysis-1", j = 2)
# delete first ten features in all analyses
fList <- delete(fList, i = NULL, j = 1:10)

# completely remove third+fourth analyses from feature groups
fGroups <- delete(fGroups, i = 3:4)
# delete specific feature group
fGroups <- delete(fGroups, j = "M120_R268_30")
# delete range of feature groups
fGroups <- delete(fGroups, j = 500:750)

# remove all hits for atrazine
fGroupsSusp <- delete(fGroupsSusp, k = "atrazine")
# for a specific feature group
fGroupsSusp <- delete(fGroupsSusp, j = "M120_R268_30", k = "atrazine")
# remove all hits for a feature group
fGroupsSusp <- delete(fGroupsSusp, j = "M120_R268_30", k = NA)

# remove all MS peak lists for a feature group
mslists <- delete(mslists, i = "M120_R268_30")
# remove the first 5 peaks in all peak lists
mslists <- delete(mslists, j = 1:5)
# remove all MS peak lists for a specific analysis
mslists <- delete(mslists, k = "standard-pos-1")

# remove all results for a feature group
formulas <- delete(formulas, i = "M120_R268_30")

# remove top candidate for all feature groups
compounds <- delete(compounds, j = 1)

# remove a component
components <- delete(components, i = "CMP1")
# remove specific feature group from a component
components <- delete(components, i = "CMP1", j = "M120_R268_30")
# remove specific feature group from all components
components <- delete(components, j = "M120_R268_30")

The value passed to j, and to k for suspect screening results, can also be a function. This function is called repeatedly on parts of the data to select what should be deleted. How the function is called and what it should return depends on the workflow data class:

| Class | Called on every | Arguments | Return value |
|---|---|---|---|
| features | analysis | features (data.table), analysis name | Feature indices (as integer or logical) |
| featureGroups, featureGroupsScreening | feature group | group intensities (vector), feature group name | The analyses of the features to remove (as character, integer, logical) |
| featureGroupsScreening (k) | - | The screening table | A logical vector for the rows to delete |
| MSPeakLists | peak list | mass peaks (data.table), feature group name, analysis name (NULL for group averaged data), type ("MS" or "MSMS") | Mass peak indices (as integer, logical) |
| formulas, compounds | feature group | annotations (data.table), feature group name | Candidate indices (rows) |
| components | component | component (data.table), component name | The feature groups (as character, integer) |

Some examples of deleting data with a function:

# remove features with intensities of 5000 or lower
fList <- delete(fList, j = function(f, ...) f$intensity <= 5E3)

# same, but for features in all feature groups from specific analyses
fGroups <- delete(fGroups, i = 1:3, j = function(g, ...) g <= 5E3)

# remove hits for suspects with mass above 400 Da
fGroupsSusp <- delete(fGroupsSusp, k = function(tab) tab$neutralMass > 400)

# remove mass peaks with m/z > 500, but only from group averaged MS/MS peak lists
mslists <- delete(mslists, j = function(pl, grp, ana, type)
{
    if (!is.null(ana) || type == "MS")
        return(FALSE) # only delete peaks from group averaged MS/MS peak lists
    return(pl$mz > 500) # remove peaks with m/z > 500
})

# remove formula candidates with high relative mass deviation
formulas <- delete(formulas, j = function(ft, ...) ft$error > 5)
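
Because these selection functions are ordinary R functions, it can help to try them on a small mock table before passing them to delete(). The sketch below is only an illustration under stated assumptions: the mock tables, their values and the function names lowIntensity and highMzInAveragedMSMS are hypothetical. A plain data.frame stands in for the data.table that is actually passed, which has more columns. Only the intensity and mz columns and the argument order are taken from the examples and table above.

```r
# Mock inputs (hypothetical values); the real tables are data.tables with more columns
mockFeatures <- data.frame(intensity = c(1200, 5000, 80000))
mockPeaks <- data.frame(mz = c(120.05, 499.9, 612.3))

# Selection function for features: flags rows with intensity of 5000 or lower
lowIntensity <- function(f, ...) f$intensity <= 5E3

# Selection function for MS peak lists: only acts on group averaged MS/MS peak lists
highMzInAveragedMSMS <- function(pl, grp, ana, type)
{
    if (!is.null(ana) || type == "MS")
        return(FALSE) # per-analysis or MS peak list: delete nothing
    return(pl$mz > 500)
}

# Call the functions directly, mimicking the arguments listed in the table above
featSel <- lowIntensity(mockFeatures, "analysis-1")
print(featSel)        # logical selection: TRUE TRUE FALSE
print(which(featSel)) # the same selection as integer indices: 1 2

# Group averaged MS/MS peak list (ana is NULL): only the peak above m/z 500 is flagged
print(highMzInAveragedMSMS(mockPeaks, "M120_R268_30", NULL, "MSMS"))
# Peak list of a specific analysis: nothing is flagged
print(highMzInAveragedMSMS(mockPeaks, "M120_R268_30", "standard-pos-1", "MSMS"))
```

Once a function flags the expected rows on mock data, it can be passed unchanged as the j argument, for example delete(mslists, j = highMzInAveragedMSMS).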