Uses BioTransformer to predict transformation products (TPs)

generateTPsBioTransformer(
  parents,
  type = "env",
  generations = 2,
  maxExpGenerations = generations + 2,
  extraOpts = NULL,
  skipInvalid = TRUE,
  prefCalcChemProps = TRUE,
  neutralChemProps = FALSE,
  neutralizeTPs = TRUE,
  TPStructParams = getDefTPStructParams(),
  MP = FALSE
)

Arguments

parents

The parents for which transformation products should be obtained. This can be

  • a suspect list (see suspect screening for more information)

  • the output of screenSuspects, in which case the suspect hits are used as parents

  • a compounds object, in which case all candidates are used as parents

The parents need to have SMILES or InChI information available.
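As an illustration, a minimal parent suspect list could be sketched as a data.frame with a name column and structure information. The compounds and the two-column layout below are example assumptions; see the suspect screening documentation for the full set of supported columns.

```r
# Example parent suspect list; the compound choice is purely illustrative.
parents <- data.frame(
  name = c("carbamazepine", "diclofenac"),
  SMILES = c(
    "C1=CC=C2C(=C1)C=CC3=CC=CC=C3N2C(=O)N",
    "C1=CC=C(C(=C1)CC(=O)O)NC2=C(C=CC=C2Cl)Cl"
  ),
  stringsAsFactors = FALSE
)

# Quick sanity check that every parent carries structure information.
hasStructure <- !is.na(parents$SMILES) & nzchar(parents$SMILES)
stopifnot(all(hasStructure))
```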

type

The type of prediction. Valid values are: "env", "ecbased", "cyp450", "phaseII", "hgut", "superbio", "allHuman". Sets the -b command line option.

generations

The number of generations (steps) for the predictions. Sets the -s command line option. More generations may be reported; see the Hierarchy expansion section below.

maxExpGenerations

The maximum number of generations during hierarchy expansion, see below.

extraOpts

A character with extra command line options passed to the biotransformer.jar tool.
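The mapping from arguments to command line options described above can be sketched as follows. The helper buildBTArgs is hypothetical and not part of patRoon; it only illustrates that type ends up as -b, generations as -s, and that extraOpts is appended. The actual command line assembled by patRoon contains further options and may differ.

```r
# Hypothetical helper (NOT part of patRoon): illustrates the documented
# mapping of arguments to BioTransformer command line options.
buildBTArgs <- function(type = "env", generations = 2, extraOpts = NULL) {
  validTypes <- c("env", "ecbased", "cyp450", "phaseII", "hgut",
                  "superbio", "allHuman")
  type <- match.arg(type, validTypes) # errors for unknown types
  c("-b", type, "-s", as.character(generations), extraOpts)
}

buildBTArgs("cyp450", generations = 3)
```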

skipInvalid

If TRUE then parents for which insufficient information (e.g. SMILES) is available are skipped with a warning.

prefCalcChemProps

If TRUE then calculated chemical properties such as the formula and InChIKey are preferred over what is already present in the parent suspect list. For efficiency reasons it is recommended to set this to TRUE. See the Validating and calculating chemical properties section for more details.

neutralChemProps

If TRUE then the neutral form of the molecule is considered to calculate SMILES, formulae etc. Enabling this may improve feature matching when considering common adducts (e.g. [M+H]+, [M-H]-). See the Validating and calculating chemical properties section for more details.

neutralizeTPs

If TRUE then all resulting TP structure information is neutralized. This argument has a similar meaning to neutralChemProps. It defaults to TRUE for prediction algorithms, as these may output charged molecules. NOTE: if neutralization results in duplicate TPs, i.e. when the neutral form of the TP was also generated by the algorithm, then the neutralized TP will be removed.
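The duplicate handling mentioned in the note can be pictured with a small base R sketch. This is an assumption-laden illustration, not patRoon's implementation: the TP table is made up, and the neutralForm lookup stands in for the real neutralization step.

```r
# Made-up TP results: TP1 is the charged form of TP2.
tps <- data.frame(
  name = c("TP1", "TP2", "TP3"),
  SMILES = c("CC(=O)[O-]", "CC(=O)O", "CCO"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup standing in for the actual neutralization.
neutralForm <- c("CC(=O)[O-]" = "CC(=O)O")

wasCharged <- tps$SMILES %in% names(neutralForm)
tps$SMILES[wasCharged] <- unname(neutralForm[tps$SMILES[wasCharged]])

# Remove neutralized TPs whose neutral form was already predicted.
isDuplicate <- wasCharged & tps$SMILES %in% tps$SMILES[!wasCharged]
tps <- tps[!isDuplicate, ]
tps$name
```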

TPStructParams

Parameters that influence the calculation of structural properties. See getDefTPStructParams.

MP

If TRUE then multiprocessing is enabled. Since BioTransformer supports native parallelization, additional multiprocessing generally does not lead to a significant reduction in computation time. Furthermore, enabling multiprocessing can lead to very high CPU/RAM usage.

Value

The TPs are stored in an object derived from the transformationProductsStructure class.

Details

This function uses BioTransformer to obtain transformation products. This function is called when calling generateTPs with algorithm="biotransformer".
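A typical call might look like the sketch below. It assumes that patRoon, Java, OpenBabel and BioTransformer are installed and configured; the single parent compound is an arbitrary example. The block is guarded so that it is skipped when patRoon is not available.

```r
# Usage sketch; requires a working patRoon + BioTransformer installation.
if (requireNamespace("patRoon", quietly = TRUE)) {
  library(patRoon)

  # Illustrative parent; any suspect list with SMILES/InChI data works.
  parents <- data.frame(
    name = "carbamazepine",
    SMILES = "C1=CC=C2C(=C1)C=CC3=CC=CC=C3N2C(=O)N",
    stringsAsFactors = FALSE
  )

  # generateTPs() dispatches to generateTPsBioTransformer().
  TPs <- generateTPs("biotransformer", parents = parents,
                     type = "env", generations = 2)

  print(TPs)                  # summary of the results
  print(as.data.table(TPs))   # predicted TPs as a table
}
```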

To use this function, the BioTransformer .jar command line utility must be installed and its location specified in the patRoon.path.BioTransformer option. The .jar file can be obtained via https://bitbucket.org/djoumbou/biotransformer/src/master. Alternatively, the patRoonExt package can be installed to automatically install/configure the necessary files.
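For a manual installation, the option could be set as sketched below, for instance in an .Rprofile file or at the start of a script. The path is a placeholder assumption and must be replaced with the actual location of the downloaded .jar file.

```r
# Placeholder path: adjust to where the BioTransformer .jar file was saved.
btJar <- file.path("~", "tools", "BioTransformer", "biotransformer.jar")
options(patRoon.path.BioTransformer = btJar)

# Verify what is currently configured.
getOption("patRoon.path.BioTransformer")
```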

Note

When the parents argument is a compounds object, the candidate library identifier is used in case the candidate has no defined compoundName.

Hierarchy expansion

BioTransformer only reports the direct parent for a TP, not the complete pathway. For instance, consider the following results:

  • parent –> TP1

  • parent –> TP2

  • TP1 –> TP2

  • TP2 –> TP3

In this case, TP3 may be formed either as:

  • parent –> TP1 –> TP2 –> TP3

  • parent –> TP2 –> TP3

For this reason, patRoon simply expands the hierarchy and assumes that all routes are possible. For instance:

          Parent
         /      \
       TP1      TP2
        |        |
       TP2      TP3
        |
       TP3

Note that this may result in pathways with more generations than defined by the generations argument. Thus, the maxExpGenerations argument is used to avoid excessive expansions.
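The expansion idea can be sketched in base R with the example relationships listed in this section. The expandRoutes helper is hypothetical and is not how patRoon implements the expansion; it merely enumerates all routes from the reported direct parent/TP pairs and stops once a maximum number of generations is reached, similar in spirit to maxExpGenerations.

```r
# Direct parent -> TP relationships as reported (example from this section).
edges <- data.frame(
  from = c("parent", "parent", "TP1", "TP2"),
  to   = c("TP1", "TP2", "TP2", "TP3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: enumerate all routes starting at 'node'.
expandRoutes <- function(node, edges, maxGen, path = node) {
  children <- edges$to[edges$from == node]
  # number of generations so far = number of TPs in the path
  if (length(path) - 1 >= maxGen) children <- character()
  # never revisit a compound already in the route (avoids cycles)
  children <- setdiff(children, path)
  if (length(children) == 0) return(list(path))
  do.call(c, lapply(children, function(ch) {
    expandRoutes(ch, edges, maxGen, c(path, ch))
  }))
}

routes <- expandRoutes("parent", edges, maxGen = 4)
vapply(routes, paste, character(1), collapse = " -> ")
```

With maxGen = 4 this yields the two routes to TP3 described above; lowering maxGen truncates the longer route.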

Validating and calculating chemical properties

Chemical properties such as SMILES, InChIKey and formulae in the parent suspect list are automatically validated and calculated if missing/invalid.

The internal validation/calculation process performs the following steps:

  • Validation of SMILES, InChI, InChIKey and formula data (if present). Invalid entries will be set to NA.

  • If neutralChemProps=TRUE then chemical data (SMILES, formulae etc.) is neutralized by (de-)protonation (using the --neutralized option of OpenBabel). An additional column molNeutralized is added to mark those molecules that were neutralized. Note that neutralization requires either SMILES or InChI data to be available.

  • The SMILES and InChI data are used to calculate missing or invalid SMILES, InChI, InChIKey and formula data. If prefCalcChemProps=TRUE then existing InChIKey and formula data is overwritten by calculated values whenever possible.

  • The chemical formulae that were not calculated are verified and normalized. This process may be time-consuming, and can largely be avoided by setting prefCalcChemProps=TRUE.

  • Neutral masses are calculated for missing values (prefCalcChemProps=FALSE) or whenever possible (prefCalcChemProps=TRUE).
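The effect of prefCalcChemProps on the steps above can be illustrated with a small base R sketch. The table and the pickValue helper are hypothetical: calcFormula stands in for a value calculated from SMILES/InChI (NA if calculation was not possible), and the logic only mirrors the documented behaviour rather than patRoon's internal code.

```r
# Hypothetical data: 'formula' comes from the suspect list,
# 'calcFormula' mimics a value calculated from the structure.
props <- data.frame(
  name = c("A", "B", "C"),
  formula = c("C2H5OH", NA, "C6H6"),
  calcFormula = c("C2H6O", "CH4", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented preference rule.
pickValue <- function(existing, calculated, prefCalcChemProps) {
  if (prefCalcChemProps) {
    # overwrite whenever a calculated value is available
    ifelse(!is.na(calculated), calculated, existing)
  } else {
    # only fill in values that are missing
    ifelse(is.na(existing), calculated, existing)
  }
}

preferCalc <- pickValue(props$formula, props$calcFormula, TRUE)
keepExisting <- pickValue(props$formula, props$calcFormula, FALSE)
```

In this sketch compound A keeps its original (non-normalized) formula only when prefCalcChemProps=FALSE, which is the case where the additional verification and normalization step would still be needed.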

Note that formula calculation for isotopically labelled molecules is currently only supported for deuterium (2H).

This functionality relies heavily on OpenBabel; please make sure it is installed.

Parallelization

generateTPsBioTransformer uses multiprocessing to parallelize computations. Please see the parallelization section in the handbook for more details, and patRoon options for the available configuration options.
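As a rough sketch, multiprocessing behaviour is typically tuned through package options such as the ones below. The values are arbitrary examples, and the settings are only relevant when MP=TRUE; consult the patRoon options documentation for the authoritative list and defaults.

```r
# Example values only: limit the number of parallel processes and
# select the multiprocessing method.
options(patRoon.MP.maxProcs = 2)
options(patRoon.MP.method = "classic")

getOption("patRoon.MP.maxProcs")
```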

References

O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR (2011). “Open Babel: An open chemical toolbox.” Journal of Cheminformatics, 3(1). doi:10.1186/1758-2946-3-33.

Djoumbou-Feunang Y, Fiamoncini J, Gil-de-la-Fuente A, Greiner R, Manach C, Wishart DS (2019). “BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification.” Journal of Cheminformatics, 11(1). doi:10.1186/s13321-018-0324-5.

Wicker J, Lorsbach T, Gütlein M, Schmid E, Latino D, Kramer S, Fenner K (2015). “enviPath - The environmental contaminant biotransformation pathway resource.” Nucleic Acids Research, 44(D1), D502–D508. doi:10.1093/nar/gkv1229.

See also

generateTPs for more details and other algorithms.