Source: R/generics.R, R/feature_groups-screening.R, R/feature_groups-screening-set.R
featureGroupsScreening-class.Rd
This class derives from featureGroups
and adds suspect screening information.
screenInfo(obj)
annotateSuspects(
fGroups,
MSPeakLists = NULL,
formulas = NULL,
compounds = NULL,
...
)
# S4 method for class 'featureGroupsScreening'
screenInfo(obj)
# S4 method for class 'featureGroupsScreening'
show(object)
# S4 method for class 'featureGroupsScreening,ANY,ANY,missing'
x[i, j, ..., rGroups, suspects = NULL, drop = TRUE]
# S4 method for class 'featureGroupsScreening'
delete(obj, i = NULL, j = NULL, ...)
# S4 method for class 'featureGroupsScreening'
as.data.table(x, ..., collapseSuspects = ",", onlyHits = FALSE)
# S4 method for class 'featureGroupsScreening'
annotateSuspects(
fGroups,
MSPeakLists,
formulas,
compounds,
absMzDev = 0.005,
specSimParams = getDefSpecSimParams(removePrecursor = TRUE),
checkFragments = c("mz", "formula", "compound"),
formulasNormalizeScores = "max",
compoundsNormalizeScores = "max",
IDFile = system.file("misc", "IDLevelRules.yml", package = "patRoon"),
logPath = file.path("log", "ident")
)
# S4 method for class 'featureGroupsScreening'
filter(
obj,
...,
onlyHits = NULL,
selectHitsBy = NULL,
selectBestFGroups = FALSE,
maxLevel = NULL,
maxFormRank = NULL,
maxCompRank = NULL,
minAnnSimForm = NULL,
minAnnSimComp = NULL,
minAnnSimBoth = NULL,
absMinFragMatches = NULL,
relMinFragMatches = NULL,
minRF = NULL,
maxLC50 = NULL,
negate = FALSE
)
# S4 method for class 'featureGroupsScreeningSet'
screenInfo(obj)
# S4 method for class 'featureGroupsScreeningSet'
show(object)
# S4 method for class 'featureGroupsScreeningSet,ANY,ANY,missing'
x[i, j, ..., rGroups, suspects = NULL, sets = NULL, drop = TRUE]
# S4 method for class 'featureGroupsScreeningSet'
delete(obj, i = NULL, j = NULL, ...)
# S4 method for class 'featureGroupsScreeningSet'
as.data.table(x, ..., collapseSuspects = ",", onlyHits = FALSE)
# S4 method for class 'featureGroupsScreeningSet'
annotateSuspects(
fGroups,
MSPeakLists,
formulas,
compounds,
absMzDev = 0.005,
specSimParams = getDefSpecSimParams(removePrecursor = TRUE),
checkFragments = c("mz", "formula", "compound"),
formulasNormalizeScores = "max",
compoundsNormalizeScores = "max",
IDFile = system.file("misc", "IDLevelRules.yml", package = "patRoon"),
logPath = file.path("log", "ident")
)
# S4 method for class 'featureGroupsScreeningSet'
filter(
obj,
...,
onlyHits = NULL,
selectHitsBy = NULL,
selectBestFGroups = FALSE,
maxLevel = NULL,
maxFormRank = NULL,
maxCompRank = NULL,
minAnnSimForm = NULL,
minAnnSimComp = NULL,
minAnnSimBoth = NULL,
absMinFragMatches = NULL,
relMinFragMatches = NULL,
minRF = NULL,
maxLC50 = NULL,
negate = FALSE
)
# S4 method for class 'featureGroupsScreeningSet'
unset(obj, set)
The featureGroupsScreening
object.
Annotation data (MSPeakLists
, formulas
and
compounds
) obtained for this featureGroupsScreening
object. Any of these arguments can be NULL
to exclude it from the annotation.
Further arguments passed to the base method.
Used for subsetting data analyses, feature groups and
replicate groups, see featureGroups
.
An optional character
vector with suspect names. If
specified, only featureGroups
will be kept that are assigned to
these suspects.
Ignored.
If a character
then any suspects that were matched to the same feature group are
collapsed to a single row and suspect names are separated by the value of collapseSuspects
. If NULL
then no collapsing occurs, and each suspect match is reported on a single row. See the Suspect collapsing
section below for additional details.
For as.data.table
: if TRUE
then only feature groups with suspect hits are reported.
For filter
: if negate=FALSE
and onlyHits=TRUE
then all feature groups without suspect hits will be removed.
Otherwise nothing will be done.
If negate=TRUE
then onlyHits=TRUE
will select feature groups without suspect hits and
onlyHits=FALSE
will only retain feature groups with suspect matches. This filter is ignored if
onlyHits=NULL
.
Maximum absolute m/z deviation.
A named list
with parameters that influence the calculation of MS spectra similarities.
See the spectral similarity parameters documentation for more details.
Which type(s) of MS/MS fragments from workflow data should be checked to evaluate the number of
suspect fragment matches (i.e. from the fragments_mz
/fragments_formula
columns in the suspect
list). Valid values are: "mz"
, "formula"
, "compound"
. The first uses m/z values in
the specified MSPeakLists
object, whereas the others use the formulae that were annotated to MS/MS peaks in
the given formulas
or compounds
objects. Multiple values are possible: in this case the maximum
number of fragment matches will be reported.
A character
that specifies how normalization of
annotation scorings occurs. Either
"max"
(normalize to max value) or "minmax"
(perform min-max
normalization). Note that normalization of negative scores (e.g. output by
SIRIUS
) is always performed as min-max. Furthermore, currently
normalization for compounds
takes the original min/max scoring
values into account when candidates were generated. Thus, for
compounds
scoring, normalization is not affected when candidate
results were removed after they were generated (e.g. by use of
filter
).
A file path to a YAML file with rules used for estimation of identification levels. See the
Suspect annotation
section for more details. If not specified then a default rules file will be used.
A directory path to store logging information. If NULL
then logging is disabled.
Should be "intensity"
or "level"
. For cases where the same suspect is matched to
multiple feature groups, only the hit for the feature group with the highest mean intensity
(selectHitsBy="intensity"
) or best identification level (selectHitsBy="level"
) is kept. In case of
ties only the first hit is kept. Set to NULL
to ignore this filter. If negate=TRUE
then only those
hits with lowest mean intensity/poorest identification level are kept.
If TRUE
then for any cases where a single feature group is matched to several
suspects only the suspect assigned to the feature group with best identification score is kept. In case of ties
only the first is kept.
Filter suspects by maximum
identification level (e.g. "3a"
), formula/compound rank or with minimum formula/compound/combined
annotation similarity. Set to NULL
to ignore.
Only retain suspects with this minimum number of MS/MS matches with the
fragments specified in the suspect list (i.e. fragments_mz
/fragments_formula
).
relMinFragMatches
sets the minimum that is relative (0-1) to the maximum number of MS/MS fragments
specified in the fragments_*
columns of the suspect list. Set to NULL
to ignore.
Filter suspect hits by the given minimum predicted response factor (as calculated by
predictRespFactors
). Set to NULL
to ignore.
Filter suspect hits by the given maximum toxicity (LC50) (as calculated by
predictTox
). Set to NULL
to ignore.
If set to TRUE
then filtering operations are performed in opposite manner.
A character
with name(s) of the sets to keep (or remove if negate=TRUE
).
The name of the set.
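The two normalization modes described for formulasNormalizeScores and compoundsNormalizeScores can be illustrated with a small base R sketch. The helper normalizeScores is hypothetical and is not part of patRoon; it only mirrors the textual description ("max", "minmax", and min-max for negative scores) and assumes a numeric vector with at least two distinct, non-missing values.

```r
# Hypothetical helper illustrating the "max" and "minmax" modes described above.
# This is NOT patRoon's implementation, only a sketch of the documented behaviour.
normalizeScores <- function(scores, method = c("max", "minmax")) {
  method <- match.arg(method)
  # Negative scores are always min-max normalized, as stated in the text
  if (method == "minmax" || any(scores < 0)) {
    rng <- range(scores)
    return((scores - rng[1]) / (rng[2] - rng[1]))
  }
  # "max": divide by the highest score
  scores / max(scores)
}

maxNorm    <- normalizeScores(c(1, 2, 4), "max")
minMaxNorm <- normalizeScores(c(1, 2, 4), "minmax")
negNorm    <- normalizeScores(c(-2, 0, 2), "max") # falls back to min-max
```

Note that the sketch does not model the special handling for compounds, where the original min/max values at candidate generation time are used.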
annotateSuspects
returns a featureGroupsScreening
object, which is a
featureGroups
object amended with annotation data.
filter
returns a filtered featureGroupsScreening
object.
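A typical call sequence might look like the sketch below. It only uses arguments listed in the usage section; the wrapper name annotateAndFilter and the input object names (fGroupsSusp, mslists, formulas, compounds) are hypothetical, and the code assumes that patRoon is attached and that the inputs were produced by earlier workflow steps such as screenSuspects.

```r
# Sketch of a possible workflow step; the wrapper and the argument names of the
# wrapper are hypothetical. Assumes library(patRoon) was called before use.
annotateAndFilter <- function(fGroupsSusp, mslists, formulas, compounds) {
  # Add annotation data (ranks, fragment matches, estimated ID levels)
  fGroupsSusp <- annotateSuspects(
    fGroupsSusp,
    MSPeakLists = mslists, formulas = formulas, compounds = compounds,
    checkFragments = c("mz", "formula")
  )
  # Keep suspects ranked in the top 5 compound candidates and
  # drop feature groups that no longer have any suspect hit
  fGroupsSusp <- filter(fGroupsSusp, maxCompRank = 5, onlyHits = TRUE)
  # Report one row per suspect hit (no collapsing)
  as.data.table(fGroupsSusp, collapseSuspects = NULL)
}
```

The threshold of 5 is an arbitrary illustration value, not a recommended setting.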
screenInfo(featureGroupsScreening)
: Returns a table with screening information
(see screenInfo
slot).
show(featureGroupsScreening)
: Shows summary information for this object.
x[i, j, ...]
: Subset on analyses, feature groups and/or
suspects.
as.data.table(featureGroupsScreening)
: Obtain a summary table (a data.table
) with retention, m/z,
intensity and optionally other feature data. Furthermore, the output table will be merged with information from
screenInfo
, such as suspect names and other properties and annotation data.
annotateSuspects(featureGroupsScreening)
: Incorporates annotation data obtained during the workflow to annotate suspects
with matched known MS/MS fragments, formula/candidate ranks and automatic estimation of identification levels. See
the Suspect annotation
section for more details. The estimation of identification levels for each suspect is
logged in the log/ident
directory.
filter(featureGroupsScreening)
: Performs rule based filtering. This method builds on the comprehensive filter
functionality from the base filter,featureGroups-method
. It adds several filters to select
e.g. the best ranked suspects or those with a minimum estimated identification level. NOTE: most
filters only affect suspect hits, not feature groups. Set onlyHits=TRUE
to subsequently remove any
feature groups that lost any suspect matches due to other filter steps.
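How onlyHits and negate interact, as described for the onlyHits argument, can be summarized in a few lines of base R. The function keepFGroup is a hypothetical illustration and not part of patRoon; hasHit is an assumed logical vector that marks feature groups with at least one suspect hit.

```r
# Hypothetical sketch of the documented onlyHits/negate logic (not patRoon code).
# hasHit: logical vector, TRUE for feature groups with at least one suspect hit.
keepFGroup <- function(hasHit, onlyHits = NULL, negate = FALSE) {
  if (is.null(onlyHits))
    return(rep(TRUE, length(hasHit)))   # filter is ignored
  if (!negate) {
    # onlyHits=TRUE removes groups without hits; otherwise nothing is done
    if (onlyHits) hasHit else rep(TRUE, length(hasHit))
  } else {
    # negated: TRUE selects groups without hits, FALSE retains groups with hits
    if (onlyHits) !hasHit else hasHit
  }
}

hasHit <- c(TRUE, FALSE, TRUE)
keptDefault <- keepFGroup(hasHit, onlyHits = TRUE)
keptNegated <- keepFGroup(hasHit, onlyHits = TRUE, negate = TRUE)
```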
screenInfo
A data.table
with results from suspect screening. This table will be amended with
annotation data when annotateSuspects
is run.
MS2QuantMeta
Metadata from MS2Quant filled in by predictRespFactors
.
filter
removes suspect hits with NA
values when any of the filters related to minimum or maximum
values are applied (unless negate=TRUE
).
The annotateSuspects
method is used to annotate suspects after
screenSuspects
was used to collect suspect screening results and other workflow steps such as formula
and compound annotation steps have been completed. The annotation results, which can be acquired with the
as.data.table
and screenInfo
methods, amend the current screening data with the following columns:
formRank
,compRank
The rank of the suspect within the formula/compound annotation results.
annSimForm
,annSimComp
,annSimBoth
A similarity measure between measured and annotated
MS/MS peaks from annotation of formulae, compounds or both. The similarity is calculated as the spectral similarity
between a peaklist with (a) all MS/MS peaks and (b) only annotated peaks. Thus, a value of one means that all MS/MS
peaks were annotated. If both formula and compound annotations are available then annSimBoth
is calculated
after combining all the annotated peaks, otherwise annSimBoth
equals the available value for
annSimForm
or annSimComp
. The similarity calculation can be configured with the specSimParams
argument to annotateSuspects
. Note for annotation with generateCompoundsLibrary
results: the method
and default parameters for annSimComp
calculation differ slightly from those of the spectral similarity
calculated with compound annotation (libMatch
score), hence small differences in results are typically
observed.
maxFrags
The maximum number of MS/MS fragments that can be matched for this suspect (based on the
fragments_*
columns from the suspect list).
maxFragMatches
,maxFragMatchesRel
The absolute and relative number of experimental MS/MS peaks
that were matched from the fragments specified in the suspect list. The value for maxFragMatchesRel
is
relative to the value for maxFrags
. The calculation of this column is influenced by the
checkFragments
argument to annotateSuspects
.
estIDLevel
Provides an estimation of the identification level, roughly following that of
Schymanski et al. (2014). However, please note that this value is only an estimation, and manual
interpretation is still necessary to assign final identification levels. The estimation is done through a set of
rules, see the Identification level rules
section below.
Note that columns are only present if sufficient data is available for their calculation.
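The relation between maxFrags, maxFragMatches and maxFragMatchesRel can be sketched for the "mz" fragment type as follows. This is an illustration in base R under simplifying assumptions, not patRoon's implementation: countFragMatches is a hypothetical helper, the m/z values are made up, and a fragment counts as matched when any measured peak lies within absMzDev.

```r
# Hypothetical helper: count suspect fragments that have a measured MS/MS peak
# within the absolute m/z tolerance (illustration only, not patRoon code).
countFragMatches <- function(suspectMz, peakMz, absMzDev = 0.005) {
  sum(vapply(suspectMz,
             function(mz) any(abs(peakMz - mz) <= absMzDev),
             logical(1)))
}

suspectMz <- c(91.0542, 119.0491, 147.0440) # made-up fragments_mz values
peakMz    <- c(91.0545, 105.0699, 147.0520) # made-up measured MS/MS peaks

maxFrags          <- length(suspectMz)
maxFragMatches    <- countFragMatches(suspectMz, peakMz)
maxFragMatchesRel <- maxFragMatches / maxFrags
```

Here only the first fragment is within 0.005 of a measured peak, so one of three fragments is matched. When several checkFragments types are given, the text states that the maximum number of matches across the types is reported.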
The estimation of identification levels is configured through a YAML file which specifies the rules for each level. The default file is shown below.
1:
    suspectFragments: 3
    retention: 12
2a:
    or:
        - individualMoNAScore:
            min: 0.9
            higherThanNext: .inf
        - libMatch:
            min: 0.9
            higherThanNext: .inf
    rank:
        max: 1
        type: compound
3a:
    or:
        - individualMoNAScore: 0.4
        - libMatch: 0.4
3b:
    suspectFragments: 3
3c:
    annMSMSSim:
        type: compound
        min: 0.7
4a:
    annMSMSSim:
        type: formula
        min: 0.7
    isoScore:
        min: 0.5
        higherThanNext: 0.2
    rank:
        max: 1
        type: formula
4b:
    isoScore:
        min: 0.9
        higherThanNext: 0.2
    rank:
        max: 1
        type: formula
5:
    all: yes
Most of the file should be self-explanatory. Some notes:
Each rule is either a field of suspectFragments
(minimum number of MS/MS fragments matched from
suspect list), retention
(maximum retention deviation from suspect list), rank
(the maximum
annotation rank from formula or compound annotations), all
(this level is always matched) or any of the
scorings available from the formula or compound annotations.
In case any of the rules could be applied to either formula or compound annotations, the annotation type must
be specified with the type
field (formula
or compound
).
Identification levels should start with a number and may optionally be followed by an alphabetic character. The lowest levels are checked first.
If relative=yes
then the relative scoring will be used for testing.
For suspectFragments
: if the number of fragments from the suspect list (maxFrags
column) is
less than the minimum rule value, the minimum is adjusted to the number of available fragments.
The or
and and
keywords can be used to combine multiple conditions.
A template rules file can be generated with the genIDLevelRulesFile
function, and this file can
subsequently be passed to annotateSuspects
. The file format is highly flexible and (sub)levels can be added or
removed if desired. Note that the default file is currently only suitable when annotation is performed with GenForm
and MetFrag; for other algorithms it is crucial to modify the rules.
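The adjustment described in the notes for the suspectFragments rule (lowering the minimum when the suspect list specifies fewer fragments than the rule requires) can be sketched as below. The function passesSuspectFragments is hypothetical and not part of patRoon. Treating suspects without any listed fragments as failing the rule is an assumption made for this sketch and is not stated in the text.

```r
# Hypothetical check for a 'suspectFragments' rule (illustration only).
# fragMatches: number of matched fragments (maxFragMatches column)
# maxFrags:    number of fragments in the suspect list (maxFrags column)
# ruleMin:     minimum from the rules file, e.g. 3 for level 1 in the default file
passesSuspectFragments <- function(fragMatches, maxFrags, ruleMin = 3) {
  # Assumption: without any fragments in the suspect list the rule cannot pass
  if (is.na(maxFrags) || maxFrags < 1)
    return(FALSE)
  # The minimum is lowered to the number of available fragments if needed
  fragMatches >= min(ruleMin, maxFrags)
}

passesAdjusted <- passesSuspectFragments(fragMatches = 2, maxFrags = 2) # minimum lowered to 2
failsRegular   <- passesSuspectFragments(fragMatches = 2, maxFrags = 5) # minimum stays 3
```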
featureGroupsScreening
featureGroupsSetScreeningUnset
The as.data.table
method for featureGroupsScreening
supports an
additional format where each suspect hit is reported on a separate row (enabled by setting
collapseSuspects=NULL
). In this format the suspect
properties from the screenInfo
method are merged with each suspect row. Alternatively, if suspect
collapsing is enabled (the default) then the regular as.data.table
format is used, and amended with the
names of all suspects matched to a feature group (separated by the value of the collapseSuspects
argument).
Suspect collapsing also influences how calculated feature concentrations/toxicities are reported (i.e.
obtained with calculateConcs
/calculateTox
). If these values were directly predicted for
suspects, i.e. by using predictRespFactors
/predictTox
on the feature groups
object, and suspects are not collapsed, then the calculated concentration/toxicity reported for each
suspect row is not aggregated and is specific to that suspect (unless not available). Hence, this allows you to
obtain specific concentration/toxicity values for each suspect/feature group pair.
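The difference between the two table formats can be illustrated with plain base R. The table below is a made-up stand-in for screening results (the group and suspect names are hypothetical) and the code does not call patRoon; it only shows what collapsing suspect names per feature group with a separator such as "," looks like.

```r
# Made-up suspect hits: one row per suspect/feature group pair, comparable to
# the format obtained with collapseSuspects=NULL (names are hypothetical)
hits <- data.frame(
  group = c("M120_R268", "M120_R268", "M137_R249"),
  susp  = c("suspectA", "suspectB", "suspectC"),
  stringsAsFactors = FALSE
)

# Collapsed format: one row per feature group, suspect names joined by ","
collapsed <- aggregate(susp ~ group, data = hits, FUN = paste, collapse = ",")
```

In the collapsed table the first feature group carries both suspect names in a single field, whereas the uncollapsed table keeps a row (and thus room for suspect specific values) for each pair.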
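The featureGroupsScreeningSet class is used for sets workflows and derives from featureGroupsScreening.
unset(featureGroupsScreeningSet): yields a featureGroupsScreeningUnset object. Only the screening results present in the specified set are kept.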
annotateSuspects
Suspect annotation is performed per set. Thus, formula/compound ranks, estimated
identification levels etc. are calculated for each set. Subsequently, these results are merged into the final
screenInfo
. In addition, an overall formRank
and compRank
column is created based on the
rankings of the suspect candidate in the set consensus data. Furthermore, an overall estIDLevel
is generated
that is based on the 'best' estimated identification level among the sets data (i.e. the lowest). In case
there is a tie between sub-levels (e.g. 3a and 3b), then the sub-level is stripped
(e.g. 3).
filter
All filters related to estimated identification levels and formula/compound rankings are
applied to the overall set data (see above). All others are applied to set specific data: in this case candidates
are only removed if none of the set data conforms to the filter.
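The way an overall estIDLevel is derived from per-set levels, as described for annotateSuspects in sets workflows, can be sketched in base R. The function overallIDLevel is a hypothetical illustration and not patRoon's implementation; it assumes levels are strings consisting of a number optionally followed by letters, such as "2a" or "4b".

```r
# Hypothetical sketch: pick the best (lowest) level among sets and strip the
# sub-level if different sub-levels tie (illustration only, not patRoon code).
overallIDLevel <- function(levels) {
  levels <- levels[!is.na(levels)]
  if (length(levels) == 0)
    return(NA_character_)
  # Numeric part of each level, e.g. "3a" -> 3
  num <- as.integer(sub("[[:alpha:]]+$", "", levels))
  best <- unique(levels[num == min(num)])
  # A tie between sub-levels (e.g. 3a and 3b) yields the bare number ("3")
  if (length(best) > 1) as.character(min(num)) else best
}

tieLevel  <- overallIDLevel(c("3a", "3b"))
bestLevel <- overallIDLevel(c("2a", "4b"))
```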
This class also derives from featureGroupsSet
. Please see its documentation for more relevant details
with sets workflows.
Note that the formRank
and compRank
columns are not updated when the data is subset.