Basic rule-based filtering of feature groups.

replicateGroupSubtract(fGroups, rGroups, threshold = 0)

# S4 method for featureGroups
filter(
  obj,
  absMinIntensity = NULL,
  relMinIntensity = NULL,
  preAbsMinIntensity = NULL,
  preRelMinIntensity = NULL,
  absMinAnalyses = NULL,
  relMinAnalyses = NULL,
  absMinReplicates = NULL,
  relMinReplicates = NULL,
  absMinFeatures = NULL,
  relMinFeatures = NULL,
  absMinReplicateAbundance = NULL,
  relMinReplicateAbundance = NULL,
  absMinConc = NULL,
  relMinConc = NULL,
  absMaxTox = NULL,
  relMaxTox = NULL,
  absMinConcTox = NULL,
  relMinConcTox = NULL,
  maxReplicateIntRSD = NULL,
  blankThreshold = NULL,
  retentionRange = NULL,
  mzRange = NULL,
  mzDefectRange = NULL,
  chromWidthRange = NULL,
  featQualityRange = NULL,
  groupQualityRange = NULL,
  rGroups = NULL,
  results = NULL,
  removeBlanks = FALSE,
  removeISTDs = FALSE,
  checkFeaturesSession = NULL,
  predAggrParams = getDefPredAggrParams(),
  removeNA = FALSE,
  negate = FALSE
)

# S4 method for featureGroupsSet
filter(
  obj,
  ...,
  negate = FALSE,
  sets = NULL,
  absMinSets = NULL,
  relMinSets = NULL
)

# S4 method for featureGroups
replicateGroupSubtract(fGroups, rGroups, threshold = 0)

Arguments

fGroups, obj

featureGroups object to which the filter is applied.

rGroups

A character vector of replicate groups that should be kept (filter) or subtracted from (replicateGroupSubtract).

threshold

Minimum relative threshold (compared to the mean intensity of the replicate group being subtracted) for a feature group not to be removed. When 0, a feature group is always removed when it is present in the given replicate groups.

absMinIntensity, relMinIntensity

Minimum absolute/relative intensity for features to be kept. The relative intensity is determined from the feature with highest intensity (of all features from all groups). Set to 0 or NULL to skip this step.
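As an illustration of how the relative threshold relates to the most intense feature, the following base R sketch applies a relMinIntensity value to a plain intensity matrix. The matrix ints and its values are hypothetical, and this is only a sketch of the rule described above, not the package's internal code.

```r
# Hypothetical intensities: rows are analyses, columns are feature groups
ints <- matrix(c(1000, 50,   0,
                  800, 20, 400),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("ana1", "ana2"), c("fg1", "fg2", "fg3")))

relMinIntensity <- 0.1

# Relative intensities are taken against the most intense feature overall
relInts <- ints / max(ints)

# Features below the threshold are 'removed' by zeroing their intensity
filtered <- ints
filtered[relInts < relMinIntensity] <- 0
filtered
```

Here fg2 ends up empty in both analyses, so under the rules described in the Value section it would be dropped entirely.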

preAbsMinIntensity, preRelMinIntensity

As absMinIntensity/relMinIntensity, but applied before any other filters. This is typically used to speed up subsequent filter steps. However, care must be taken to choose a value low enough that it is not expected to affect subsequent filtering steps. See the Filter order section below for why this may be important.

absMinAnalyses, relMinAnalyses

Feature groups are only kept when they contain data for at least this (absolute or relative) number of analyses. Set to NULL to ignore.

absMinReplicates, relMinReplicates

Feature groups are only kept when they contain data for at least this (absolute or relative) number of replicates. Set to NULL to ignore.

absMinFeatures, relMinFeatures

Analyses are only kept when they contain at least this (absolute or relative) number of features. Set to NULL to ignore.

absMinReplicateAbundance, relMinReplicateAbundance

Minimum absolute/relative abundance with which a grouped feature should be present within a replicate group. If this minimum is not met, all features within the replicate group are removed. Set to NULL to skip this step.
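The replicate abundance rule can be sketched in base R for a single feature group. The intensities and replicate group names are hypothetical, and treating a zero intensity as "not present" is an assumption of this sketch.

```r
# Hypothetical intensities of one feature group across six analyses
ints <- c(100, 0, 0, 200, 150, 0)
replicates <- c("A", "A", "A", "B", "B", "B")

relMinReplicateAbundance <- 0.5

# Fraction of analyses per replicate group in which the feature is present,
# repeated for every analysis of that group
abundance <- ave(as.numeric(ints > 0), replicates, FUN = mean)

# Zero all features of replicate groups that do not meet the minimum
filtered <- ifelse(abundance < relMinReplicateAbundance, 0, ints)
filtered
```

Replicate group A (present in one of three analyses) falls below the minimum and is zeroed, while group B (two of three) is kept.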

absMinConc, relMinConc

The minimum absolute/relative predicted concentration (calculated by calculateConcs) assigned to a feature. The concentrations are first aggregated prior to filtering, as controlled by the predAggrParams argument. Also see the removeNA argument.

absMaxTox, relMaxTox

The maximum absolute/relative predicted toxicity (LC50) (calculated by calculateTox) assigned to a feature group. The toxicities are first aggregated prior to filtering, as controlled by the predAggrParams argument. Also see the removeNA argument.

absMinConcTox, relMinConcTox

Like absMinConc/relMinConc, but instead considers the ratio between feature concentrations and the toxicity of the feature group. For instance, absMinConcTox=0.1 means that the calculated concentration of a feature should be at least 10% of its toxicity.
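The ratio rule can be shown as a small arithmetic sketch in base R. The concentration and toxicity values are hypothetical and assumed to share the same unit. Aggregation via predAggrParams is not modelled here.

```r
# Hypothetical predicted concentrations of three features in one feature group
conc <- c(0.5, 4, 12)
# Hypothetical predicted toxicity (LC50) of that feature group, same unit
tox <- 20

absMinConcTox <- 0.1

# Keep features whose concentration is at least 10% of the toxicity
ratio <- conc / tox
keep <- ratio >= absMinConcTox
keep
```

Only the first feature (ratio 0.025) falls below the minimum in this sketch.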

maxReplicateIntRSD

Maximum relative standard deviation (RSD) of intensity values for features within a replicate group. If the RSD is above this value all features within the replicate group are removed. Set to NULL to ignore.
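The RSD rule can be sketched in base R as follows. The data are hypothetical. Expressing the RSD as a fraction (rather than a percentage) and using the sample standard deviation are assumptions of this sketch.

```r
# Hypothetical intensities of one feature group across six analyses
ints <- c(100, 110, 90, 100, 300, 20)
replicates <- c("A", "A", "A", "B", "B", "B")

maxReplicateIntRSD <- 0.5

# RSD = standard deviation / mean, computed per replicate group and
# repeated for every analysis of that group
rsd <- ave(ints, replicates, FUN = function(x) sd(x) / mean(x))

# Zero all features of replicate groups whose RSD exceeds the maximum
filtered <- ifelse(rsd > maxReplicateIntRSD, 0, ints)
filtered
```

Group A has an RSD of 0.1 and is kept, whereas group B varies far more strongly and is zeroed.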

blankThreshold

Feature groups that are also present in blank analyses (see analysis info) are filtered out unless their relative intensity is above this threshold. For instance, a value of 5 means that only features with an intensity five times higher than that of the blank are kept. The relative intensity values between blanks and non-blanks are determined from the mean of all non-zero blank intensities. Set to NULL to skip this step.
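The blank rule can be sketched in base R on a plain intensity matrix. All names and values are hypothetical. Keeping features that are exactly at the threshold, and leaving the blank analyses themselves untouched, are assumptions of this sketch.

```r
# Hypothetical intensities: rows are analyses, columns are feature groups
ints <- matrix(c( 100,   0,  50,
                  300,   0,   0,
                  900, 500, 100,
                 1200, 400, 300),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("blank-1", "blank-2", "sample-1", "sample-2"),
                               c("fg1", "fg2", "fg3")))
isBlank <- c(TRUE, TRUE, FALSE, FALSE)

blankThreshold <- 5

# Mean of the non-zero blank intensities per feature group
# (0 if the feature group is absent from all blanks)
blankMean <- apply(ints[isBlank, , drop = FALSE], 2, function(x) {
    nz <- x[x != 0]
    if (length(nz) > 0) mean(nz) else 0
})

# Minimum intensity a sample feature must reach to be kept
minInts <- blankMean * blankThreshold

# Compare every sample row against the per-group minimum and zero what fails
samples <- ints[!isBlank, , drop = FALSE]
samples[sweep(samples, 2, minInts, "<")] <- 0
samples
```

fg2 is absent from the blanks and is therefore unaffected. For fg1 and fg3 only the features in sample-2 exceed five times the blank mean.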

retentionRange, mzRange, mzDefectRange, chromWidthRange

Range of retention time (in seconds), m/z, mass defect (defined as the decimal part of m/z values) or chromatographic peak width (in seconds), respectively. Features outside this range will be removed. Should be a numeric vector with length of two containing the min/max values. The maximum can be Inf to specify no maximum range. Set to NULL to skip this step.
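As a brief illustration of the mass defect definition given above, the following base R sketch computes the decimal part of a few hypothetical m/z values and applies a range to it. Treating both range limits as inclusive is an assumption.

```r
# Hypothetical m/z values of three features
mz <- c(301.1410, 255.0302, 499.9805)

# Mass defect as the decimal part of the m/z value
mzDefect <- mz - floor(mz)

# Numeric vector of length two with the min/max values
mzDefectRange <- c(0.1, 0.5)

keep <- mzDefect >= mzDefectRange[1] & mzDefect <= mzDefectRange[2]
mz[keep]
```

Only the first value has a mass defect inside the range in this sketch.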

featQualityRange

Used to filter features by their peak qualities/scores (see calculatePeakQualities). Should be a named list with min/max ranges for each quality/score to be filtered (the featureQualityNames function can be used to obtain valid names). Example: featQualityRange=list(ModalityScore=c(0.3, Inf), SymmetryScore=c(0.5, Inf)). Set to NULL to ignore.

groupQualityRange

Like featQualityRange, but filters on group specific or averaged qualities/scores.

results

Only keep feature groups that have results in the object specified by results. Valid classes are featureAnnotations (e.g. formula/compound annotations) and components. Can also be a list with multiple objects: in this case a feature group is kept if it has a result in any of the objects. Set to NULL to ignore.

removeBlanks

Set to TRUE to remove all analyses that belong to replicate groups that are specified as a blank in the analysis-information. This is useful to simplify the analyses in the specified featureGroups object after blank subtraction. When both blankThreshold and this argument are set, blank subtraction is performed prior to removing any analyses.

removeISTDs

If TRUE then all feature groups marked as internal standard (IS) are removed. This requires IS assignments done by normInts, see its documentation for more details.

checkFeaturesSession

If set then features and/or feature groups are removed that were selected for removal (see check-GUI). The session files are typically generated with the checkFeatures and predictCheckFeaturesSession functions. The value of checkFeaturesSession should either be a path to the session file or TRUE, in which case the default session file name is used. If negate=TRUE then all non-selected features/feature groups are removed instead.

predAggrParams

Parameters to aggregate calculated concentrations/toxicities (obtained with calculateConcs/calculateTox) prior to filtering data. See prediction aggregation parameters for more information.

removeNA

Set to TRUE to remove NA values. Currently only applicable to the concentration and toxicity filters.

negate

If set to TRUE then filtering operations are performed in the opposite manner.

...

For sets workflow methods: further arguments passed to the base featureGroups method.

sets

(sets workflow) A character vector with the name(s) of the sets to keep (or remove if negate=TRUE).

absMinSets, relMinSets

(sets workflow) Feature groups are only kept when they contain data for at least this (absolute or relative) number of sets. Set to NULL to ignore.

Value

A filtered featureGroups object. Features that are filtered away have their intensities set to zero. A feature group that is no longer present in any of the analyses is removed completely.

Details

filter performs common rule-based filtering of feature groups, such as blank subtraction, minimum intensity and minimum replicate abundance. Features are removed by zeroing their intensity values. Furthermore, feature groups that are left completely empty (i.e. all intensities are zero) are automatically removed.

replicateGroupSubtract removes feature groups present in a given set of replicate groups (unless intensities are above a given threshold). The replicate groups that are subtracted will be removed.

Sets workflows

The following methods are changed or have new functionality:

  • filter has specific arguments to filter by (feature presence in) sets. See the argument descriptions.

Filter order

When multiple arguments are specified to filter, multiple filters are applied in sequence. Since some of these filters may affect each other, choosing their order correctly may be important for effective data filtering. For instance, when an intensity filter removes features from blank analyses, a subsequent blank filter may not adequately perform blank subtraction. Similarly, when intensity and blank filters are executed after the replicate abundance filter it may be necessary to ensure minimum replicate abundance again as the intensity and blank filters may have removed some features within a replicate group.

With this in mind, filters (if specified) occur in the following order:

  1. Features/feature groups selected for removal by the session specified by checkFeaturesSession.

  2. Pre-intensity filters (i.e. preAbsMinIntensity and preRelMinIntensity).

  3. Chromatography and mass filters (i.e. retentionRange, mzRange, mzDefectRange, chromWidthRange, featQualityRange and groupQualityRange).

  4. Replicate abundance filters (i.e. absMinReplicateAbundance, relMinReplicateAbundance and maxReplicateIntRSD).

  5. Blank filter (i.e. blankThreshold).

  6. Intensity filters (i.e. absMinIntensity and relMinIntensity).

  7. Replicate abundance filters (2nd time, only if previous filters affected results).

  8. General abundance filters (i.e. absMinAnalyses, relMinAnalyses, absMinReplicates, relMinReplicates, absMinFeatures and relMinFeatures) and concentration/toxicity filters (i.e. absMinConc, relMinConc, absMaxTox and relMaxTox).

  9. Replicate group filter (i.e. rGroups), results filter (i.e. results) and blank analyses / internal standard removal (i.e. removeBlanks=TRUE / removeISTDs=TRUE).

If another filtering order is desired then filter should be called multiple times with only one filter argument at a time.
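A custom order could be sketched as below. This assumes that the patRoon package is installed and that fGroups is an existing featureGroups object. The helper name filterInCustomOrder and the threshold values are hypothetical and only serve as illustration.

```r
# Sketch only: 'fGroups' is assumed to be an existing featureGroups object.
# The helper name and all threshold values are hypothetical.
filterInCustomOrder <- function(fGroups) {
    # Blank subtraction first ...
    fGroups <- patRoon::filter(fGroups, blankThreshold = 5)
    # ... then the replicate abundance filter ...
    fGroups <- patRoon::filter(fGroups, relMinReplicateAbundance = 1)
    # ... and finally the intensity filter
    patRoon::filter(fGroups, absMinIntensity = 1e4)
}
```

Because each call specifies a single filter argument, the filters run in exactly the order in which the calls are written.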