R/main.R, R/feature_groups.R, R/feature_groups-set.R, and 8 more
pred-quant.Rd
Functions to predict response factors and feature concentrations from SMILES and/or
SIRIUS+CSI:FingerID fingerprints using the MS2Quant package.
# S4 method for class 'featureGroups'
calculateConcs(fGroups, featureAnn, areas = FALSE)
# S4 method for class 'featureGroupsSet'
calculateConcs(fGroups, featureAnn, areas = FALSE)
# S4 method for class 'compounds'
predictRespFactors(
obj,
fGroups,
calibrants,
eluent,
organicModifier,
pHAq,
concUnit = "ugL",
calibConcUnit = concUnit,
updateScore = FALSE,
scoreWeight = 1,
parallel = TRUE
)
# S4 method for class 'featureGroupsScreening'
predictRespFactors(
obj,
calibrants,
eluent,
organicModifier,
pHAq,
concUnit = "ugL",
calibConcUnit = concUnit
)
# S4 method for class 'featureGroupsScreening'
calculateConcs(fGroups, featureAnn = NULL, areas = FALSE)
# S4 method for class 'featureGroupsScreeningSet'
predictRespFactors(obj, calibrants, ...)
# S4 method for class 'featureGroupsScreeningSet'
calculateConcs(fGroups, featureAnn = NULL, areas = FALSE)
# S4 method for class 'compoundsSet'
predictRespFactors(obj, fGroups, calibrants, ...)
# S4 method for class 'compoundsSIRIUS'
predictRespFactors(
obj,
fGroups,
calibrants,
eluent,
organicModifier,
pHAq,
concUnit = "ugL",
calibConcUnit = concUnit,
type = "FP"
)
# S4 method for class 'formulasSet'
predictRespFactors(obj, fGroups, calibrants, ...)
# S4 method for class 'formulasSIRIUS'
predictRespFactors(
obj,
fGroups,
calibrants,
eluent,
organicModifier,
pHAq,
concUnit = "ugL",
calibConcUnit = concUnit
)
getQuantCalibFromScreening(fGroups, concs, areas = FALSE, average = FALSE)
fGroups For predictRespFactors methods for feature annotations: The featureGroups object
for which the annotations were performed.
For calculateConcs: The featureGroups object for which concentrations should be calculated.
For getQuantCalibFromScreening: A feature groups object screened for the calibrants with
screenSuspects.
featureAnn A featureAnnotations object (e.g. formulasSIRIUS or
compounds) which contains response factors. Optional if calculateConcs is called on suspect
screening results (i.e. featureGroupsScreening method).
areas Set to TRUE to use peak areas instead of peak heights. Note: for calculateConcs this
should follow what is in the calibrants table.
obj The workflow object for which predictions should be performed, e.g. feature groups with screening
results (featureGroupsScreening) or compound annotations (compounds).
calibrants A data.frame with calibrants, see the Calibration section below.
(sets workflow) Should be a list with the calibrants for each set.
eluent A data.frame that describes the LC gradient program. Should have a column time with the
retention time in seconds and a column B with the corresponding percentage of the organic modifier
(0-100).
organicModifier The organic modifier of the mobile phase: either "MeOH" (methanol) or "MeCN"
(acetonitrile).
pHAq The pH of the aqueous part of the mobile phase.
concUnit The concentration unit for calculated concentrations. Can be molar based ("nM", "uM",
"mM", "M") or mass based ("ngL", "ugL", "mgL", "gL"). Furthermore, can be
prefixed with "log " for logarithmic concentrations (e.g. "log mM").
calibConcUnit The concentration unit used in the calibrants table. For possible values see the concUnit
argument.
updateScore, scoreWeight If updateScore=TRUE then the annotation score column is updated
by adding normalized values of the response factor (weighted by scoreWeight). Currently, this
only makes sense for annotations performed with MetFrag!
parallel If set to TRUE then code is executed in parallel through the future package. Please
see the parallelization section in the handbook for more details.
... (sets workflow) Further arguments passed to the non-sets workflow method.
type Which types of predictions should be performed: should be "FP" (SIRIUS+CSI:FingerID
fingerprints), "SMILES" or "both". Only relevant for compoundsSIRIUS method.
concs A data.frame with concentration data. See the Calibration section below.
average Set to TRUE to average intensity values within replicate groups.
predictRespFactors returns an object amended with response factors (RF_SMILES/LRF_SIRFP
columns).
calculateConcs returns a featureGroups based object amended with concentrations for each
feature group (accessed with the concentrations method).
The MS2Quant R package predicts concentrations from SMILES
and/or MS/MS fingerprints obtained with SIRIUS+CSI:FingerID. The predictRespFactors method functions
interface with this package to calculate response factors, which can then be used to calculate feature concentrations
with the calculateConcs method function.
The rcdk package and OpenBabel tool are used
internally to calculate molecular weights. Please make sure that OpenBabel is installed.
MS2Quant currently only supports [M+H]+ and [M]+ adducts when performing predictions with
SIRIUS+CSI:FingerID fingerprints. Predictions for candidates with other adducts, including [M-H]-, are
skipped with a warning.
The MS2Quant package requires calibration to convert predicted ionization efficiencies to
instrument/method specific response factors. The calibration data should be specified with the calibrants
argument to predictRespFactors. This should be a data.frame with intensity observations at different
concentrations for a set of calibrants. Each row specifies one intensity observation at one concentration. The
table should have the following columns:
name The name of the calibrant. Can be freely chosen.
SMILES The SMILES of the calibrant.
rt The retention time of the calibrant (in seconds).
intensity The peak intensity (or area, see the areas argument) of the calibrant.
conc The concentration of the calibrant (see the calibConcUnit argument for specifying the unit).
It is recommended to include multiple calibrants (e.g. >=10) at multiple concentrations (e.g.
>=5). The latter is achieved by adding multiple rows for the same calibrant (keeping the
name/SMILES/rt columns constant). It is also possible to follow the column naming used by
MS2Quant (however retention times should still be in seconds!). For more details and tips see
https://github.com/kruvelab/MS2Quant.
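As a minimal sketch of the table layout described above, the following base R code builds a small calibrants data.frame and checks it. All compound choices, retention times, intensities and concentrations are invented for illustration, and the example is deliberately smaller than the recommended number of calibrants and concentration levels.

```r
# Hypothetical calibration measurements: two calibrants, three levels each.
# All retention times, intensities and concentrations are invented.
calibrants <- data.frame(
    name = rep(c("atrazine", "caffeine"), each = 3),
    SMILES = rep(c("CCNC1=NC(=NC(=N1)Cl)NC(C)C",
                   "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"), each = 3),
    rt = rep(c(612, 245), each = 3),       # seconds, constant per calibrant
    intensity = c(1.1e5, 5.4e5, 1.0e6, 8.0e4, 4.1e5, 7.9e5),
    conc = rep(c(1, 5, 10), times = 2),    # unit is given by calibConcUnit
    stringsAsFactors = FALSE
)

# Sanity checks before passing the table to predictRespFactors()
requiredCols <- c("name", "SMILES", "rt", "intensity", "conc")
stopifnot(all(requiredCols %in% names(calibrants)))

# name/SMILES/rt must stay constant across all rows of one calibrant
constPerCalib <- sapply(split(calibrants, calibrants$name), function(x) {
    nrow(unique(x[, c("name", "SMILES", "rt")])) == 1
})
stopifnot(all(constPerCalib))
```

Each row is one intensity observation at one concentration, so adding a concentration level for a calibrant means adding a row that repeats its name, SMILES and rt values.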
The getQuantCalibFromScreening function can be used to automatically generate a calibrants table from a
feature groups object with suspect screening results. Here, the idea is to perform a screening with
screenSuspects with a suspect list that contains the calibrants, which is then used to construct the
calibrant table. It is highly recommended to add retention times for the calibrants in the suspect list to ensure
the calibrant is assigned to the correct feature. Furthermore, it is possible to simply add the calibrants to the
'regular' suspect list in case a suspect screening was already part of the workflow. The
getQuantCalibFromScreening function still requires you to specify concentration data, which is achieved via
the concs argument. This should be a data.frame with a column name corresponding to the
calibrant name (i.e. same as used by screenSuspects above) and columns with concentration data. The
latter columns specify the concentrations of a calibrant in different replicate groups (as defined in the
analysis information). The concentration columns should be named after the
corresponding replicate group. Only those replicate groups that should be used for calibration need to be included.
Furthermore, NA values can be used if a replicate group should be ignored for a specific calibrant.
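The concs table described above could look like the following sketch. The replicate group names ("calib-1" etc.), the calibrant names, the concentrations and the object name fGroupsCalib are assumptions made for illustration; the call to getQuantCalibFromScreening is only attempted when patRoon is installed and such an object exists in the session.

```r
# One row per calibrant, one column per replicate group used for calibration.
# Replicate group names and concentrations are hypothetical.
concs <- data.frame(
    name = c("atrazine", "caffeine"),
    `calib-1` = c(1, 1),
    `calib-5` = c(5, 5),
    `calib-10` = c(10, NA),   # NA: ignore this replicate group for caffeine
    check.names = FALSE,      # keep the replicate group names unmodified
    stringsAsFactors = FALSE
)

# fGroupsCalib is a hypothetical feature groups object that was screened for
# the calibrants with screenSuspects().
if (requireNamespace("patRoon", quietly = TRUE) && exists("fGroupsCalib")) {
    calibrants <- patRoon::getQuantCalibFromScreening(
        fGroupsCalib, concs = concs, areas = FALSE, average = TRUE
    )
}
```

Setting check.names = FALSE matters here because data.frame would otherwise rewrite names such as "calib-1" to syntactically valid ones, which would no longer match the replicate groups.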
The response factors are predicted with the predictRespFactors generic function,
which accepts the following input:
Suspect screening results. The SMILES data is used to predict response factors for suspect hits.
Formula annotation data obtained with the "sirius" algorithm (generateFormulasSIRIUS). The
predictions are performed for each formula candidate using SIRIUS+CSI:FingerID fingerprints. For this
reason, the getFingerprint argument must be set to TRUE when generating the formula data.
Compound annotation data obtained with the "sirius" algorithm (generateCompoundsSIRIUS).
The predictions are performed for each annotation candidate using its SMILES and/or
SIRIUS+CSI:FingerID fingerprints. The predictions are performed on a per formula basis, hence,
response factors for isomers will be equal.
Compound annotation data obtained with algorithms other than "sirius". The response factors are predicted
from SMILES data.
When SMILES data is used then predictions of response factors are generally more accurate. However,
calculations with SIRIUS+CSI:FingerID fingerprints are faster and only require the formula and MS/MS
spectrum, i.e. not the full structure. Hence, calculations with SMILES are mostly useful in
suspect screening workflows, or with high confidence compound annotation data, whereas MS/MS fingerprints are
suitable with unknowns.
For annotation data the calculations are performed for all candidates. This can lead to long-running
calculations, especially when SMILES data is used. Hence, it is strongly recommended to first
prioritize the annotation results, e.g. with the topMost argument to the
filter method.
When response factors are predicted from SIRIUS+CSI:FingerID fingerprints then only formula and MS/MS
spectra are used, even if compound annotations are used for input. The major difference is that with formula
annotation input all formula candidates for which a fingerprint could be generated are considered, whereas
with compound annotations only those candidate formulae are considered for which a structure could also be assigned.
Hence, the formula annotation input could be more comprehensive, whereas predictions from structure annotations
could lead to more representative results as only formulae are considered for which at least one structure could be
assigned.
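A prediction run for compound annotations might be set up as in the sketch below. The gradient values, the pH and the object names myCompounds, myFGroups and myCalibrants are hypothetical; the patRoon calls only run when the package is installed and these objects already exist from an earlier workflow.

```r
# LC gradient program: time in seconds, B = percentage organic modifier (0-100).
# The gradient values are invented for illustration.
eluent <- data.frame(
    time = c(0, 180, 1200, 1500),
    B = c(10, 10, 95, 95)
)
stopifnot(all(eluent$B >= 0 & eluent$B <= 100), !is.unsorted(eluent$time))

# Hypothetical workflow objects: compound annotations, the feature groups they
# were generated for, and a calibrants table.
neededObjs <- c("myCompounds", "myFGroups", "myCalibrants")
if (requireNamespace("patRoon", quietly = TRUE) &&
    all(sapply(neededObjs, exists))) {
    # Prioritize candidates first to limit the calculation time
    myCompoundsTop <- patRoon::filter(myCompounds, topMost = 5)
    myCompoundsRF <- patRoon::predictRespFactors(
        myCompoundsTop, myFGroups,
        calibrants = myCalibrants, eluent = eluent,
        organicModifier = "MeCN", pHAq = 4,
        concUnit = "ugL", calibConcUnit = "ugL"
    )
}
```

The topMost value of 5 is an arbitrary choice for this sketch; a suitable cut-off depends on how many candidates per feature group are worth quantifying.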
The calculateConcs generic function is used to assign concentrations for each
feature using the response factors discussed in the previous section. The function takes response factors from suspect
screening results and/or feature annotation data. If multiple response factors were predicted for the same feature
group, for instance when multiple annotation candidates or suspect hits for this feature group are present, then a
concentration is assigned for each response factor. These values can later be easily aggregated with e.g. the
as.data.table function.
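The idea of aggregating several predicted concentrations per feature group can be illustrated with plain base R. The mock table below is an assumption made for this sketch: its column names and values are invented and do not mirror the actual output format of patRoon. The guarded part shows how calculateConcs and the concentrations method might be called on hypothetical objects myFGroups and myCompoundsRF.

```r
# Mock data: one row per (feature group, candidate) with a predicted
# concentration. Column names and values are hypothetical.
predConcs <- data.frame(
    group = c("M216_R612_1", "M216_R612_1", "M195_R245_2"),
    candidate = c("candA", "candB", "candC"),
    conc = c(2.0, 4.0, 7.5)
)

# Aggregate to one value per feature group, here with the mean
aggr <- aggregate(conc ~ group, data = predConcs, FUN = mean)

# myFGroups and myCompoundsRF are hypothetical objects: feature groups and
# compound annotations amended with response factors.
if (requireNamespace("patRoon", quietly = TRUE) &&
    exists("myFGroups") && exists("myCompoundsRF")) {
    myFGroups <- patRoon::calculateConcs(myFGroups, myCompoundsRF, areas = FALSE)
    print(patRoon::concentrations(myFGroups))
}
```

Other aggregation functions, such as min, max or median, can be swapped in for mean when a conservative or robust estimate is preferred.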
O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR (2011). “Open Babel: An open chemical toolbox.” Journal of Cheminformatics, 3(1). doi:10.1186/1758-2946-3-33.
Guha R (2007). “Chemical Informatics Functionality in R.” Journal of Statistical Software, 18(6).
Sepman H, Malm L, Peets P, MacLeod M, Martin J, Breitholtz M, Kruve A (2023). “Bypassing the Identification: MS2Quant for Concentration Estimations of Chemicals Detected with Nontarget LC-HRMS from MS2 Data.” Analytical Chemistry, 95(33), 12329–12338. doi:10.1021/acs.analchem.3c01744.