Basic rule-based filtering of feature groups.
replicateGroupSubtract(fGroups, rGroups, threshold = 0)
# S4 method for class 'featureGroups'
filter(
obj,
absMinIntensity = NULL,
relMinIntensity = NULL,
preAbsMinIntensity = NULL,
preRelMinIntensity = NULL,
absMinAnalyses = NULL,
relMinAnalyses = NULL,
absMinReplicates = NULL,
relMinReplicates = NULL,
absMinFeatures = NULL,
relMinFeatures = NULL,
absMinReplicateAbundance = NULL,
relMinReplicateAbundance = NULL,
absMinConc = NULL,
relMinConc = NULL,
absMaxTox = NULL,
relMaxTox = NULL,
absMinConcTox = NULL,
relMinConcTox = NULL,
maxReplicateIntRSD = NULL,
blankThreshold = NULL,
retentionRange = NULL,
mzRange = NULL,
mzDefectRange = NULL,
chromWidthRange = NULL,
featQualityRange = NULL,
groupQualityRange = NULL,
rGroups = NULL,
results = NULL,
removeBlanks = FALSE,
removeISTDs = FALSE,
checkFeaturesSession = NULL,
predAggrParams = getDefPredAggrParams(),
removeNA = FALSE,
negate = FALSE
)
# S4 method for class 'featureGroupsSet'
filter(
obj,
...,
negate = FALSE,
sets = NULL,
absMinSets = NULL,
relMinSets = NULL
)
# S4 method for class 'featureGroups'
replicateGroupSubtract(fGroups, rGroups, threshold = 0)
fGroups, obj: featureGroups object to which the filter is applied.
rGroups: A character vector of replicate groups that should be kept (filter) or subtracted
(replicateGroupSubtract).
threshold: Minimum relative threshold (compared to the mean intensity of the replicate group being subtracted) for a
feature group not to be removed. When 0, a feature group is always removed when present in the given replicate groups.
absMinIntensity, relMinIntensity: Minimum absolute/relative intensity for features to be kept. The relative
intensity is determined from the feature with the highest intensity (of all features from all groups). Set to 0 or
NULL to skip this step.
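As an illustration of the relative intensity rule described above, the following base R sketch scales a few made-up intensities to the highest one and applies a threshold. The numbers and variable names are hypothetical, and the code only mirrors the description; it is not patRoon's implementation.

```r
# Hypothetical feature intensities (all features from all groups).
ints <- c(featA = 2e5, featB = 5e4, featC = 1e3)

# Assumed threshold: keep features with at least 10% of the highest intensity.
relMinIntensity <- 0.1

# Relative intensity is taken against the single most intense feature.
relInts <- ints / max(ints)
keep <- relInts >= relMinIntensity

print(keep)
```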
preAbsMinIntensity, preRelMinIntensity: As absMinIntensity/relMinIntensity, but applied before any other filters.
This is typically used to speed up subsequent filter steps. However, care must be taken to choose a value low enough
that it is not expected to affect subsequent filtering steps. See below for why this may be important.
absMinAnalyses, relMinAnalyses: Feature groups are only kept when they contain data for at least this (absolute
or relative) number of analyses. Set to NULL to ignore.
absMinReplicates, relMinReplicates: Feature groups are only kept when they contain data for at least this
(absolute or relative) number of replicates. Set to NULL to ignore.
absMinFeatures, relMinFeatures: Analyses are only kept when they contain at least this (absolute or relative)
number of features. Set to NULL to ignore.
absMinReplicateAbundance, relMinReplicateAbundance: Minimum absolute/relative abundance with which a grouped feature
should be present within a replicate group. If this minimum is not met, all features within the replicate group are
removed. Set to NULL to skip this step.
absMinConc, relMinConc: The minimum absolute/relative predicted concentration (calculated by calculateConcs)
assigned to a feature. The concentrations are first aggregated prior to filtering, as controlled by the
predAggrParams argument. Also see the removeNA argument.
absMaxTox, relMaxTox: The maximum absolute/relative predicted toxicity (LC50) (calculated by calculateTox)
assigned to a feature group. The toxicities are first aggregated prior to filtering, as controlled by the
predAggrParams argument. Also see the removeNA argument.
absMinConcTox, relMinConcTox: Like absMinConc/relMinConc, but instead considers the ratio between feature
concentrations and the toxicity of the feature group. For instance, absMinConcTox=0.1 means that the calculated
concentration of a feature should be at least 10% of its toxicity.
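The ratio in the absMinConcTox=0.1 example can be worked through with a few lines of base R. The concentrations and the LC50 value below are made up and assumed to share the same unit; this is only a sketch of the arithmetic, not of how patRoon aggregates or stores predictions.

```r
# Hypothetical predicted concentrations of two features in one feature group.
conc <- c(featA = 0.5, featB = 0.05)

# Hypothetical predicted toxicity (LC50) of that feature group, same unit.
tox <- 2

# Assumed threshold: concentration must be at least 10% of the toxicity.
absMinConcTox <- 0.1

concToxRatio <- conc / tox
keep <- concToxRatio >= absMinConcTox

print(concToxRatio)
print(keep)
```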
maxReplicateIntRSD: Maximum relative standard deviation (RSD) of intensity values for features within a replicate
group. If the RSD is above this value, all features within the replicate group are removed. Set to NULL to ignore.
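The RSD check can be sketched in base R as follows. The intensities are invented, and two details are assumptions rather than facts taken from this page: that the RSD is expressed as a fraction (not a percentage) and that the sample standard deviation is used.

```r
# Assumption: RSD as a fraction, using the sample standard deviation.
rsd <- function(x) sd(x) / mean(x)

# Hypothetical intensities of one feature group in two replicate groups.
repInts <- list(
    repA = c(1000, 1100, 900),  # consistent replicates
    repB = c(1000, 200, 1800)   # widely scattered replicates
)

maxReplicateIntRSD <- 0.5

rsds <- vapply(repInts, rsd, numeric(1))
# Features of a replicate group are kept only if its RSD does not exceed the maximum.
keepReplicate <- rsds <= maxReplicateIntRSD

print(rsds)
print(keepReplicate)
```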
blankThreshold: Feature groups that are also present in blank analyses (see analysis info) are filtered out unless
their relative intensity is above this threshold. For instance, a value of 5 means that only features with an
intensity five times higher than that of the blank are kept. The relative intensity values between blanks and
non-blanks are determined from the mean of all non-zero blank intensities. Set to NULL to skip this step.
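The blank rule described above can be illustrated on a toy intensity matrix in base R. All values, names and the handling of the exact threshold boundary are assumptions made for this sketch; it follows the description (mean of non-zero blank intensities, multiplied by the threshold) and is not patRoon's actual code.

```r
# Hypothetical intensities: rows are feature groups, columns are analyses.
ints <- matrix(
    c(100, 120, 5000, 5200,
        0, 300,  900, 1000,
        0,   0, 4000, 4100),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("fg1", "fg2", "fg3"),
                    c("blank-1", "blank-2", "sample-1", "sample-2"))
)
isBlank <- c(TRUE, TRUE, FALSE, FALSE)
blankThreshold <- 5

# Mean of the non-zero blank intensities per feature group (0 if absent from blanks).
meanNonZero <- function(x) {
    nz <- x[x > 0]
    if (length(nz) > 0) mean(nz) else 0
}
blankMeans <- apply(ints[, isBlank, drop = FALSE], 1, meanNonZero)

# Minimum intensity a sample feature needs in order to be kept.
minInts <- blankMeans * blankThreshold

# Zero out sample features below the minimum (the vector recycles per row).
sampleInts <- ints[, !isBlank, drop = FALSE]
sampleInts[sampleInts < minInts] <- 0

# Feature groups without any remaining sample intensity are dropped.
keptGroups <- rownames(sampleInts)[rowSums(sampleInts) > 0]
print(keptGroups)
```

In this toy data fg2 is dropped because its sample intensities stay below five times its blank mean, while fg3 is untouched because it does not occur in the blanks.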
retentionRange, mzRange, mzDefectRange, chromWidthRange: Range of retention time (in seconds), m/z, mass defect
(defined as the decimal part of m/z values) or chromatographic peak width (in seconds), respectively. Features
outside this range will be removed. Should be a numeric vector of length two containing the min/max values. The
maximum can be Inf to specify no maximum. Set to NULL to skip this step.
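Since the mass defect is defined here as the decimal part of the m/z value, a range check can be sketched in base R as below. The m/z values and the chosen range are hypothetical, and treating both range limits as inclusive is an assumption of this sketch.

```r
# Hypothetical m/z values of three features.
mz <- c(301.1410, 255.9875, 180.0634)

# Mass defect as defined above: the decimal part of the m/z value.
mzDefect <- mz - floor(mz)

# Assumed range in the c(min, max) form described for the *Range arguments.
mzDefectRange <- c(0, 0.2)

# Assumption: both limits are inclusive.
keep <- mzDefect >= mzDefectRange[1] & mzDefect <= mzDefectRange[2]
print(keep)
```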
featQualityRange: Used to filter features by their peak qualities/scores (see calculatePeakQualities). Should be a
named list with min/max ranges for each quality/score to be filtered (the featureQualityNames function can be used
to obtain valid names). Example: featQualityRange=list(ModalityScore=c(0.3, Inf), SymmetryScore=c(0.5, Inf)).
Set to NULL to ignore.
groupQualityRange: Like featQualityRange, but filters on group specific or averaged qualities/scores.
results: Only keep feature groups that have results in the object specified by results. Valid classes are
featureAnnotations (e.g. formula/compound annotations) and components. Can also be a list with multiple objects: in
this case a feature group is kept if it has a result in any of the objects. Set to NULL to ignore.
removeBlanks: Set to TRUE to remove all analyses that belong to replicate groups that are specified as a blank in
the analysis information. This is useful to simplify the analyses in the specified featureGroups object after blank
subtraction. When both blankThreshold and this argument are set, blank subtraction is performed prior to removing
any analyses.
removeISTDs: If TRUE then all feature groups marked as internal standard (IS) are removed. This requires IS
assignments done by normInts, see its documentation for more details.
checkFeaturesSession: If set then features and/or feature groups are removed that were selected for removal (see
check-GUI). The session files are typically generated with the checkFeatures and predictCheckFeaturesSession
functions. The value of checkFeaturesSession should either be a path to the session file or TRUE, in which case the
default session file name is used. If negate=TRUE then all non-selected features/feature groups are removed instead.
predAggrParams: Parameters to aggregate calculated concentrations/toxicities (obtained with
calculateConcs/calculateTox) prior to filtering data. See prediction aggregation parameters for more information.
removeNA: Set to TRUE to remove NA values. Currently only applicable to the concentration and toxicity filters.
negate: If set to TRUE then filtering operations are performed in the opposite manner.
...: (featureGroupsSet method) Further arguments passed to the featureGroups method.
sets: A character with name(s) of the sets to keep (or remove if negate=TRUE).
absMinSets, relMinSets: Feature groups are only kept when they contain data for at least this (absolute or
relative) number of sets. Set to NULL to ignore.
A filtered featureGroups object. Feature groups that are filtered away have their intensity set to zero. In case a
feature group is not present in any of the analyses anymore it will be removed completely.
filter performs common rule-based filtering of feature groups such as blank subtraction, minimum intensity and
minimum replicate abundance. Features are removed by zeroing their intensity values. Furthermore, feature groups
that are left completely empty (i.e. all intensities are zero) will be automatically removed.
replicateGroupSubtract removes feature groups present in a given set of replicate groups (unless intensities are
above a given threshold). The replicate groups that are subtracted will be removed.
filter has specific arguments to filter by (feature presence in) sets. See the argument descriptions.
When multiple arguments are specified to filter, multiple filters are applied in sequence. Since some of these
filters may affect each other, choosing their order correctly may be important for effective data filtering. For
instance, when an intensity filter removes features from blank analyses, a subsequent blank filter may not
adequately perform blank subtraction. Similarly, when intensity and blank filters are executed after the replicate
abundance filter it may be necessary to ensure minimum replicate abundance again as the intensity and blank filters
may have removed some features within a replicate group.
With this in mind, filters (if specified) occur in the following order:
1. Features/feature groups selected for removal by the session specified by checkFeaturesSession.
2. Pre-intensity filters (i.e. preAbsMinIntensity and preRelMinIntensity).
3. Chromatography and mass filters (i.e. retentionRange, mzRange, mzDefectRange, chromWidthRange, featQualityRange
   and groupQualityRange).
4. Replicate abundance filters (i.e. absMinReplicateAbundance, relMinReplicateAbundance and maxReplicateIntRSD).
5. Blank filter (i.e. blankThreshold).
6. Intensity filters (i.e. absMinIntensity and relMinIntensity).
7. Replicate abundance filters (2nd time, only if previous filters affected results).
8. General abundance filters (i.e. absMinAnalyses, relMinAnalyses, absMinReplicates, relMinReplicates,
   absMinFeatures, relMinFeatures) and concentration/toxicity filters (i.e. absMinConc, relMinConc, absMaxTox and
   relMaxTox).
9. Replicate group filter (i.e. rGroups), results filter (i.e. results) and blank analyses / internal standard
   removal (i.e. removeBlanks=TRUE / removeISTDs=TRUE).
If another filtering order is desired then filter should be called multiple times with only one filter argument at
a time.
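A custom order could look roughly like the sketch below, which runs the blank filter before the replicate abundance filter by calling filter once per step. The helper name filterCustomOrder and the threshold values are hypothetical. The sketch assumes patRoon is installed and that fGroups is an existing featureGroups object; only arguments from the usage section above are used.

```r
# Hypothetical helper: apply two filters in a user-chosen order by calling
# filter() separately for each step, with one filter argument per call.
filterCustomOrder <- function(fGroups) {
    # Step 1: blank subtraction first (threshold value is an arbitrary example).
    fGroups <- patRoon::filter(fGroups, blankThreshold = 5)

    # Step 2: only afterwards enforce a minimum replicate abundance.
    fGroups <- patRoon::filter(fGroups, relMinReplicateAbundance = 1)

    fGroups
}
```

It would be called as filterCustomOrder(fGroups) on a feature groups object from an existing workflow. Because each call receives a single filter argument, the built-in ordering listed above does not come into play.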