R/generics.R, R/compounds-sirius.R
generateCompoundsSIRIUS.Rd
Uses SIRIUS in combination with CSI:FingerID for compound annotation.
generateCompoundsSIRIUS(fGroups, ...)
# S4 method for class 'featureGroups'
generateCompoundsSIRIUS(
fGroups,
MSPeakLists,
relMzDev = 5,
adduct = NULL,
projectPath = NULL,
elements = "CHNOP",
profile = "qtof",
formulaDatabase = NULL,
fingerIDDatabase = "pubchem",
noise = NULL,
cores = NULL,
topMost = 100,
topMostFormulas = 5,
login = "check",
alwaysLogin = FALSE,
extraOptsGeneral = NULL,
extraOptsFormula = NULL,
verbose = TRUE,
splitBatches = FALSE,
dryRun = FALSE
)
# S4 method for class 'featureGroupsSet'
generateCompoundsSIRIUS(
fGroups,
MSPeakLists,
relMzDev = 5,
adduct = NULL,
projectPath = NULL,
...,
setThreshold = 0,
setThresholdAnn = 0,
setAvgSpecificScores = FALSE
)
fGroups: featureGroups object which should be annotated. This should be the same as, or a subset of, the object that was used to create the specified MSPeakLists. In the case of a subset, only the remaining feature groups in the subset are considered.
...: Further arguments passed to the non-sets workflow method.
MSPeakLists: An MSPeakLists object that was generated for the supplied fGroups.
relMzDev: Maximum relative deviation between the measured and candidate formula m/z values (in ppm). Sets the --ppm-max command line option.
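As a rough illustration of what a ppm tolerance means, the relative deviation between two m/z values can be computed as sketched below. ppmDeviation is a hypothetical helper and not part of patRoon, the m/z values are made up, and taking the deviation relative to the candidate mass is an assumption about how the tolerance is applied.

```r
# Hypothetical helper (not part of patRoon): relative m/z deviation in ppm.
# Assumption: the deviation is taken relative to the candidate (theoretical) m/z.
ppmDeviation <- function(measured, candidate) {
    abs(measured - candidate) / candidate * 1e6
}

# Made-up values: a candidate at m/z 300.0000 that was measured at m/z 300.0012.
dev <- ppmDeviation(300.0012, 300.0000)

# Compare against the default tolerance of relMzDev = 5.
withinTolerance <- dev <= 5
```

With these made-up values the deviation is about 4 ppm, so the candidate would fall inside a 5 ppm window.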
adduct: An adduct object (or something that can be converted to it with as.adduct). Examples: "[M-H]-", "[M+Na]+". If the featureGroups object has adduct annotations then these are used if adduct=NULL.
The adduct argument is not supported for sets workflows, since the adduct annotations will then always be used.
projectPath, dryRun: These are mainly for internal purposes. projectPath sets the output directory for the SIRIUS output (a temporary directory if NULL). If dryRun is TRUE then no computations are done and only the results from projectPath are processed.
In sets workflows, projectPath should be a character vector specifying the paths for each set.
elements: Elements to be considered for formula calculation. This will heavily affect the number of candidates! Always try to work with a minimal set by excluding elements you don't expect. The minimum/maximum number of atoms per element can also be specified, for example: a value of "C[5]H[10-15]O" will only consider formulae with up to five carbon atoms, between ten and fifteen hydrogen atoms and any number of oxygen atoms. Sets the --elements command line option.
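The bracket notation above can be assembled programmatically when element limits come from elsewhere in a script. The following is a minimal sketch in base R; elementSpec is a hypothetical helper (not a patRoon function) and only covers the forms shown in the example above.

```r
# Hypothetical helper (not part of patRoon): format one element with optional bounds.
elementSpec <- function(symbol, min = NULL, max = NULL) {
    if (is.null(min) && is.null(max))
        return(symbol)                                      # no bounds, e.g. "O"
    if (is.null(min))
        return(sprintf("%s[%d]", symbol, as.integer(max)))  # upper bound only, e.g. "C[5]"
    if (is.null(max))
        stop("a lower bound without an upper bound is not covered by this sketch")
    sprintf("%s[%d-%d]", symbol, as.integer(min), as.integer(max))  # range, e.g. "H[10-15]"
}

# Reproduces the example value from the argument description.
elements <- paste0(
    elementSpec("C", max = 5),
    elementSpec("H", min = 10, max = 15),
    elementSpec("O")
)
```

The resulting string can then be passed as the elements argument.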
profile: Name of the configuration profile, for example: "qtof", "orbitrap", "fticr". Sets the --profile command line option.
formulaDatabase: If not NULL, use a database for retrieval of formula candidates. Possible values are: "pubchem", "bio", "kegg", "hmdb". Sets the --database command line option.
fingerIDDatabase: Database specifically used for CSI:FingerID. If NULL, the value of the formulaDatabase parameter will be used, or "pubchem" when that is also NULL. Sets the --fingerid-db option.
noise: Median intensity of the noise (NULL ignores this parameter). Sets the --noise command line option.
cores: The number of cores SIRIUS will use. If NULL then the default of all cores will be used.
topMost: Only keep this number of candidates (per feature group) with the highest score. Set to NULL to always keep all candidates. Please note, however, that this may result in significant usage of CPU/RAM resources for large numbers of candidates.
topMostFormulas: Do not return more than this number of candidate formulae. Note that only compounds for these formulae will be searched. Sets the --candidates command line option.
login, alwaysLogin: Specifies if and how the SIRIUS account login should be handled:
login=FALSE: no automatic login is performed and the active login status is not checked.
login="check": aborts if no active login is present.
login="interactive": interactively ask for login (using getPass).
login=c(username="...", password="..."): perform the login with the given details. For security reasons, please do not enter the details directly, but use e.g. environment variables or store/retrieve them with the keyring package.
If alwaysLogin=TRUE then a login is always performed, otherwise only if SIRIUS reports no active login.
See the SIRIUS website and patRoon handbook for more information.
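One way to follow the advice about not entering credentials directly is to read them from environment variables. The sketch below uses base R only; makeSiriusLogin and the variable names SIRIUS_USER and SIRIUS_PASSWORD are hypothetical choices for illustration, not names defined by patRoon or SIRIUS.

```r
# Hypothetical helper (not part of patRoon): build a value for the login argument
# from environment variables. The variable names are assumptions; pick your own.
makeSiriusLogin <- function(userVar = "SIRIUS_USER", passVar = "SIRIUS_PASSWORD") {
    user <- Sys.getenv(userVar)  # returns "" when the variable is unset
    pass <- Sys.getenv(passVar)
    if (!nzchar(user) || !nzchar(pass))
        return("check")          # fall back to the documented default
    c(username = user, password = pass)
}

login <- makeSiriusLogin()
```

The result can be passed as login=login. When either variable is missing, the helper falls back to "check", so the call aborts if no active login is present instead of sending empty credentials.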
extraOptsGeneral, extraOptsFormula: A character vector with any extra command line parameters for SIRIUS. For SIRIUS versions <4.4 there is no distinction between general and formula options. Otherwise, command line options specified in extraOptsGeneral are added prior to the formula command, while options specified in extraOptsFormula are added afterwards. See the SIRIUS manual for more details. Set to NULL to ignore.
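The ordering described above can be pictured with a deliberately simplified sketch. orderSiriusArgs is a hypothetical helper, the option names are placeholders rather than real SIRIUS flags, and the command line that patRoon actually assembles contains more arguments than shown here.

```r
# Simplified illustration of option ordering only (hypothetical helper, not patRoon code):
# general options go before the "formula" command, formula options after it.
orderSiriusArgs <- function(extraOptsGeneral = NULL, extraOptsFormula = NULL) {
    c(extraOptsGeneral, "formula", extraOptsFormula)
}

# Placeholder option names; consult the SIRIUS manual for real ones.
args <- orderSiriusArgs(
    extraOptsGeneral = "--hypothetical-general-option",
    extraOptsFormula = c("--hypothetical-formula-option", "value")
)
```

Leaving both arguments at NULL yields just the formula command in this sketch, which mirrors the "set to NULL to ignore" behaviour.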
verbose: If TRUE then more output is shown in the terminal.
splitBatches: If TRUE then the calculations done by SIRIUS will be evenly split over multiple SIRIUS calls (which may be run in parallel depending on the set package options). If splitBatches=FALSE then all feature calculations are performed from a single SIRIUS execution, which is often the fastest if calculations are performed on a single computer.
setThreshold: Minimum abundance for a candidate among all sets (0-1). For instance, a value of 1 means that the candidate needs to be present in all the set data.
setThresholdAnn: As setThreshold, but only taking into account the set data that contain annotations for the feature group of the candidate.
setAvgSpecificScores: If TRUE then set specific scorings (e.g. MS/MS match) are also averaged.
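The difference between the two thresholds can be illustrated with a small base R sketch. The candidates, set names and presence values are made up, and treating abundance as the fraction of sets in which a candidate occurs (kept when it is at least the threshold) is an assumption made for illustration, not a description of the patRoon internals.

```r
# Made-up presence of two hypothetical candidates across three hypothetical sets.
present <- rbind(
    candA = c(positive = TRUE, negative = TRUE,  extra = FALSE),
    candB = c(positive = TRUE, negative = FALSE, extra = FALSE)
)

# Made-up flag: which sets contain any annotation for this feature group.
annotated <- c(positive = TRUE, negative = TRUE, extra = FALSE)

# Assumption: abundance is the fraction of sets in which the candidate occurs.
abundanceAll <- rowMeans(present)                             # basis for setThreshold
abundanceAnn <- rowMeans(present[, annotated, drop = FALSE])  # basis for setThresholdAnn

# Hypothetical filter: keep candidates whose abundance reaches the threshold.
keepCandidates <- function(abundance, threshold) {
    names(abundance)[abundance >= threshold]
}

keptAll <- keepCandidates(abundanceAll, 1)  # analogous to setThreshold = 1
keptAnn <- keepCandidates(abundanceAnn, 1)  # analogous to setThresholdAnn = 1
```

In this sketch no candidate passes a threshold of 1 over all sets, because the third set has no annotations at all, whereas candA passes when only the annotated sets are counted.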
A compoundsSIRIUS object.
This function uses SIRIUS to generate compound candidates. It is called when calling generateCompounds with algorithm="sirius".
Similar to generateFormulasSIRIUS, candidate formulae are generated with SIRIUS. These results are then fed to CSI:FingerID to acquire candidate structures. Candidate formulae without any assigned structure will be removed (unlike generateFormulasSIRIUS). This method requires the availability of MS/MS data, and feature groups without it will be ignored.
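A typical call could look like the sketch below. annotateSIRIUS is a hypothetical wrapper written for illustration; it assumes that patRoon is installed, that SIRIUS is set up with an active login, and that fGroups and mslists are existing featureGroups and MSPeakLists objects. The argument values simply repeat the defaults listed in the usage section.

```r
# Hypothetical wrapper (not part of patRoon) around the documented generic.
# Assumes patRoon and SIRIUS are installed and a SIRIUS login is active.
annotateSIRIUS <- function(fGroups, mslists, ...) {
    if (!requireNamespace("patRoon", quietly = TRUE))
        stop("the patRoon package is required for this sketch")
    patRoon::generateCompounds(
        fGroups, mslists, algorithm = "sirius",
        relMzDev = 5,                  # maximum m/z deviation in ppm
        elements = "CHNOP",            # keep the element set as small as possible
        profile = "qtof",
        fingerIDDatabase = "pubchem",
        topMost = 100,                 # candidates kept per feature group
        ...                            # e.g. adduct = "[M-H]-" if fGroups has no adduct annotations
    )
}
```

Arguments that are fixed inside the wrapper should not be passed again through the dots, since R would then report a duplicated argument.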
For annotations performed with SIRIUS it is often fastest to keep the default splitBatches=FALSE. In this case, all SIRIUS output will be printed to the terminal (unless verbose=FALSE or patRoon.MP.method="future"). Furthermore, please note that only annotations to be performed for the same adduct are grouped in a single batch execution.
generateCompoundsSIRIUS uses multiprocessing to parallelize computations. Please see the parallelization section in the handbook for more details and patRoon options for configuration options.
generateCompounds for more details and other algorithms.