NEWS.md
- `generateTPsLibrary()`/`generateTPsLibraryFormula()`: any TPs that are equal to the parent and are from a generation >1 are now removed
- `generateCompoundsSIRIUS()`: documentation that formula candidates without structure assignment are omitted (suggested by Nienke Meekel)
- `removeTPIsomers` filter for `transformationProductsStructure` didn’t actually apply the `removeDuplicates` filter.
- If `updateScores=TRUE` for the methods `addFormulaScoring()`, `predictRespFactors()` and `predictTox()` then `NaN` scores could be introduced if the `formulaScore` is zero.
- If `updateScores=TRUE` for the method `addFormulaScoring()` then the `score` would be updated twice.
- `generateTPsLibrary()`/`generateTPsLibraryFormula()`: specifying >1 generation did not result in additional TP searches if `parents!=NULL`
- `report()` would error with components generated by CAMERA (reported by Nienke Meekel)
- `as.data.table()` method for `featureGroups`: `normConcToTox` argument was ignored (not for `featureGroupsScreening`)
- `generateAnalysisInfo()`: try to equalize the output and input directory order
- `data.frame`, `features` and `featureGroups` class as: `getTICs()`, `getBPCs()`, `plotTICs()` and `plotBPCs()`.
- `ion_formula_mz` column instead of taking it from SIRIUS data, as in rare cases it may not be available (issue #111)
- `MS2QuantMeta` slots

When updating to this release, it is important to remove any cached data, i.e. by running `clearCache("all")` or manually removing the `cache.sqlite` file from your project directory.
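
The manual route can be sketched in base `R` as follows. This is only a minimal illustration: the `removeCacheFile()` helper is hypothetical and not part of `patRoon`, and it assumes the cache file sits in the project directory under the name `cache.sqlite`, as described above.

```r
# Hypothetical helper, not part of patRoon: deletes the cache file by hand.
# Assumes the cache lives in the project directory under the name cache.sqlite.
removeCacheFile <- function(projectDir = ".", cacheName = "cache.sqlite")
{
    cachePath <- file.path(projectDir, cacheName)
    if (!file.exists(cachePath))
        return(FALSE) # no cache file present, nothing to remove
    file.remove(cachePath) # TRUE if the file was deleted
}

# With patRoon installed and the working directory set to the project directory,
# the call mentioned in the text above would be:
# patRoon::clearCache("all")
```

Calling `removeCacheFile("path/to/project")` returns `FALSE` when no cache file is present, so it can be run safely before every update.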
- `predictRespFactors()` could incorrectly cache results and perform concentration conversions twice for calibrants under some circumstances (reported by Drew Szabo)
- `verifyDependencies()` could throw errors when external dependencies were not found.
- `predictRespFactors()`/`predictTox()`: better handle objects without results
- `MS2Quant` is now stored in the `MS2QuantMeta` slots (suggested by Drew Szabo)
- `calculateTox()`/`calculateConcs()`: only consider relevant feature annotations
- `calculateConcs()`: avoid warnings when there are no feature groups
- `as.data.table()` methods for `featureGroups`/`featureGroupsScreening`:
    - `features==TRUE` and `collapseSuspects=NULL`
    - `features==TRUE` and/or `collapseSuspects=NULL`
    - `replicate_groups` with incorrect data was included with `features==TRUE` and `average==TRUE`
    - `features==TRUE` and `average==TRUE`
- `collapseSuspects` argument for `as.data.table()` method for suspect screening results

This release adds significant new functionality, several important changes and several bug fixes thanks to user feedback.

Users of previous `patRoon` versions should inform themselves about the important changes highlighted in the next section. Furthermore, it is important to remove any cached data, i.e. by running `clearCache("all")` or manually removing the `cache.sqlite` file from your project directory.
This release adds new ways to install and update `patRoon` and its dependencies. The most important changes are:

- `patRoon` bundles: these are standalone installations of `R`, `patRoon` and its `R` package dependencies, and all other external dependencies such as MetFrag, OpenJDK, OpenMS etc. This is primarily useful for users not familiar with `R` or wanting to quickly try a new `patRoon` release.
- `patRoon` and its dependencies automatically, which avoids the need to manually install packages from different sources (BioConductor, GitHub etc). Furthermore, this package also installs `patRoonExt`, another auxiliary package that bundles most external dependencies (e.g. MetFrag, PubChemLite, OpenMS).
- `patRoon_install` script is now replaced by the above installation methods and is therefore considered deprecated. It will be removed in the future and should not be used anymore.

For more information, please read the updated installation chapter in the handbook, and see the project pages of patRoonInst and patRoonExt.

IMPORTANT: If you installed `patRoon` via the legacy installation script, please read the installation chapter to disable/remove this installation prior to updating `patRoon`!
The second milestone of this release is the integration of the MS2Tox and MS2Quant `R` packages, which support machine learning approaches to predict the toxicity and concentration of features. The integration adds the following functionality to `patRoon`:

- `SMILES`.
- `as.data.table()` function and reporting interface were updated to inspect the predicted toxicities/concentrations.

Please see the relevant section in the handbook, and the project pages of MS2Tox and MS2Quant for more details.
- `loadMSLibrary()`: improve compatibility with more `.msp` flavors (issue #72).
- `newProject()`: save/load parameters to reproduce subsequent project creations (issue #61)
- `groupFeaturesOpenMS()`: now supports handling large numbers of analyses on Windows (reported by Geert Franken, fix thanks to https://github.com/OpenMS/OpenMS/issues/6845).
- `patRoon.checkCentroided` to control whether analysis files are verified to be centroided (suggested by Geert Franken).
- `overlap()`: the `which` param can now also be a `list` to compare groups of replicate groups (similar to `plotVenn()`)
- `consensus()` method for `featureGroups`: new `verifyAnaInfo` flag to optionally skip verifying that the analysis information is equal for all compared objects. This is mainly useful when the data is the same but in different formats.
- `genReportSettingsFile()`: `baseFrom` argument to update old report settings files.
- `generateFormulasSIRIUS()`: new `getFingerprints` and `token` arguments to download CSI:FingerID fingerprints for formula candidates. This was primarily implemented to support calculating toxicities/concentrations from formula annotations.
- `report()` now correctly handles SIRIUS compounds results and suspects without SMILES
- `bslib 0.5.0` (reported by Alessia Ore)
- `annotatedBy` filter for `MSPeakLists` could remove precursor peaks in MS/MS data regardless of whether `retainPrecursorMSMS=TRUE`
- `report()`: The `suspect(s)` column for compound annotation results was always empty
- `reAverage` argument was ignored by the `filter()` method of `MSPeakLists` when checking if cached data is available (issue #87)
- `reAverage=TRUE` to the `filter()` method of `MSPeakLists` then the peak IDs were not regenerated (issue #87)
- `plotSpectrum()` method for `formulas`/`compounds` didn’t expand plot height for formula annotations with only one mass peak
- `annotatedPeakList()`/`plotSpectrum()` methods for `compounds` didn’t label mass peaks with compounds algorithm if `formulas` were provided but no formula candidate was present.
- `generateFormulas()` generic definition had wrong argument order
- `getSIRIUSToken()` resulted in errors if the password input was cancelled.

This release adds significant new functionality, several important changes and many bug fixes thanks to user feedback.

Users of previous `patRoon` versions should inform themselves about the important changes highlighted in the next section. Furthermore, it is important to remove any cached data, i.e. by running `clearCache("all")` or manually removing the `cache.sqlite` file from your project directory.
The most significant change in this release is the addition of redesigned reporting functionality. Some key functionality and changes:

- `html` interface was completely redesigned to provide a modern, responsive and easier to use interface, which is powered by the `bslib` and `reactable` `R` packages.
- `SVG` vector graphics, which are generally smaller in size, faster to create and can be zoomed in without loss of quality.
- `YAML` file.

An example can be seen from the report output of the tutorial.

The new reporting interface is used with the new `report()` method function. All the documentation was updated to reflect these changes. The now ‘legacy’ report interface (`reportCSV()`, `reportPDF()` and `reportHTML()`) is still available for backwards compatibility and may still be of interest as the new interface currently only supports the HTML format.

The new reporting functionality has not yet undergone years of usage and feedback. Hence, please report any bugs and suggestions you may have!
Active logins are now necessary to use webservices such as CSI:FingerID, see e.g. https://boecker-lab.github.io/docs.sirius.github.io/account-and-license/. This release of `patRoon` adds support to make logging in easier and adds several compatibility fixes for the latest `SIRIUS` version. The new utility function `getSIRIUSToken()` can be used to obtain the necessary login token. The new `token` argument for `generateCompoundsSIRIUS()` can be used to automatically log in. The `newProject()` function was extended to use this new functionality.
The Docker images are now served by the GitLab server of the University of Amsterdam. To pull the latest images you can run the following command:
docker pull uva-hva.gitlab.host:4567/r.helmus/patroon/patroonrs
The changes are reflected in the installation section of the handbook.
A new algorithm for `generateTPs()` was added: `library_formula`. This algorithm is similar to the `library` algorithm, but only works with chemical formulae. This is especially useful if only formula data is available for parents and/or TPs. The `genFormulaTPLibrary()` utility function can be used to automatically generate a formula library from given transformation rules. More information can be found in the updated handbook and reference manual (`?generateTPsLibraryFormula`).
Other important changes include:

- `topMost` and `onlyPresent`, are now combined in a parameter list. The parameter list is specified with the new `EICParams` argument to functions such as `plotChroms()` and `report()`. A list with default parameter values is generated with the `getDefEICParams()` function. More information can be found in the reference manual: `?EICParams`.
- `plotChroms()` method for `featureGroups` was changed.
- `makeSet()` method for `featureGroups` (and related functions `adducts()` and `selectIons()`): the original set specific feature groups are now combined to create the final feature groups, instead of grouping features from all sets at once. This prevents rare cases where features with different adduct assignments in the same set would be grouped together (i.e. if their neutral mass would be the same). Note that this change will probably produce slightly different results. This change required the addition of a new slot `annotationsChanged` to `featureGroupsSet` for internal usage by the `adducts()<-` method.
- `setAvgSpecificScores` arguments of `generateFormulas()`/`generateCompounds()` to `TRUE`.
- `neutralChemProps`/`neutralizeTPs` arguments. Whether a structure was neutralized is marked by the new `molNeutralized` column.
    - `neutralizeTPs` is set and a neutralization of a TP results in a duplicate structure (i.e. in case the algorithm also generated the neutral form of the TP) then the neutralized TP is removed.
- `newProject()`: added possibility to exclude analyses out of folder (issue #60, #63)
- `as.data.table()` for `featureGroups`
    - `regression=TRUE`: add column with p values
    - `features=TRUE`: add replicate group column
- `plotChroms()`: `analysis`, `groupName` and `intMax` arguments
- `delete()` method function was added to modify MS peak lists
- `thrMS`, `thrMSMS`, `thrComb` and `maxCandidates` arguments, which can be used to tweak calculations for features with many candidates, e.g., to limit calculation times.
- `and` keyword in the `YAML` configuration file. This is especially useful when combined with the `or` keyword.
- `adduct_abundance` column
- `plotInt()` method for components: `index` argument can now also be component name
- `generateTPsCTS()`: support new PFAS libraries (set `"pfas_environmental"` or `"pfas_metabolism"` as the `transLibrary` argument).
- `generateTPsLibrary()`: the `matchParentsBy` argument now also accepts `"formula"` and `"name"`.
- `retDir` column that specifies the retention time direction of the TP compared to its parent (alternative to specifying `log P` values).
- `matchGenerationsBy` to the `library` (and `library_formula`) algorithm for `generateTPs()`, which controls how parents/TPs are matched when searching multiple transformation generations.
- `maxExpGenerations` argument to `generateTPsBiotransformer` to avoid excessive TP hierarchy expansions.
- `generateComponentsTPs()`: the `formulaDiff` column now splits elemental losses and gains, similar to what the `plotGraph()` method already did for TPs.
- `delete()` method function was added to modify `transformationProducts`
- `plotInt()` methods: `plotArgs` and `linesArgs` to pass additional arguments to `plot()`/`lines()`. The latter replaces the dots argument.
- `plotGraph()` methods
    - `width` and `height` arguments.
    - `transformationProductsStructure` now draw structures in SVG format to improve quality
- `clearCache()`: `vacuum` option to speed up clearing large cache files.
- `as.data.table()` for `featureGroups` with `regression=TRUE`: treat missing features as `NA`
- `plotChord()` method for `featureGroups`: significantly optimized some old code
- `plotChroms()` / EIC loading
- `plotScores()`
- `plotSpectrum()` for sets workflows better handles missing data from one or more sets when making a comparison, which avoids empty plots in such cases.
- `annotatedPeakLists()`, especially with sets workflows.
- `specSimParams` argument replaces the `relMinMSMSIntensity` and `simMSMSMethod` arguments.
- `annotateSuspects()`: log if the suspect formula/compound data could not be matched with feature annotations
- `formulaDiff` column in TP component results was changed to more easily identify elemental losses/gains.
- `fromTPs` slot was added to TP components and is `TRUE` if a `transformationProducts` object was used during componentization. This is primarily for internal use.
- `plotGraph()` methods: show empty plot instead of throwing an error if results are empty
- `prefCalcChemProps=FALSE`
- `reportHTML()`
    - `EICOnlyPresent` argument to `reportHTML()` is effective again
    - `reportHTML()` could show plots of wrong TP results
- `newProject()`
    - `newProject()` for sets mode used `c()` instead of `list()` to specify positive+negative suspect lists to `screenSuspects()`
    - `rstudioapi::getSourceEditorContext()` (issue #62)
    - `newProject()` used wrong variable name for suspect list under some conditions (issue #69)
    - `analysis.csv` already exists when needed
- `norm_conc` field for analysis information was ignored (reported by Geert Franken)
- `checkFeatures()`/`checkComponents()`: disabling a feature/featureGroup in a sorted table would lead to wrong selections
- `selectIons()` does not throw an error anymore if there is no suitable adduct/isotope information in the given components, which would result in incorrect behavior with sets mode if e.g. no annotations were found for one of the sets.
- `predictCheckFeaturesSession()` marked passing peaks to be removed instead of the other way around (issue #59)
- `selectIons()` didn’t properly handle empty components objects
- `chromPeaks()` from `xcms` was sometimes not found (issue #68)
- `calculatePeakQualities()` would throw errors for empty feature results (reported by Louise Malm)
- `plotChroms()` / EIC loading: group rectangle with topMost set didn’t consider retention times and intensities from other features
- `traceSNRFiltering` argument could not be set for `findFeaturesOpenMS()`
- `plotChroms()` now better supports plotting chromatograms for analyses without feature data (i.e. when `onlyPresent=FALSE`) in sets workflows. This now correctly works for cases where a feature is completely absent in a set.
- `filter()` method for `MSPeakLists` where precursor isolation (`isolatePrec` argument) also applied to MS/MS data (issue #56).
- `generateCompoundsMetFrag()` didn’t properly detect changes in local database files when considering cached data.
- `MSPeakLists` without any results could lead to errors.
- `scoreTypes` slot could contain scorings not actually used, e.g. if the `scoreTypes` argument to `generateCompoundsMetFrag()` contained scorings not actually present in the used database.
- `addFormulaScoring()`: `updateScore` argument was ignored and always treated as `TRUE`
- `annotateSuspects()` (issue #54).
- Suspects
    - `higherThanNext` setting from estimated ID level 4a
    - `screenSuspects()` would fail if the adduct column contains partially `NA` data.
    - `unset()` for `featureGroupsScreeningSet` resulted in loss of group quality scores and internal standard assignments
    - `annotateSuspects()` `generateCompoundsLibrary()` if the library did not contain peak formula annotations.
    - `fragments_formula` was not split per set in the screening results.
- `convertMSFiles()`: if the `analysisInfo` argument is set and `outPath` is set with a length >1 then the wrong output path could be used.

This release extends version `2.0` with new functionality, several important changes and bug fixes. The `newProject()` function was updated for the new functionality. Please see the updated Handbook and sections below for more information.

Users of previous `patRoon` versions should inform themselves about the important changes highlighted in the next section. Furthermore, it is highly recommended to remove any cached data, i.e. by running `clearCache("all")` or manually removing the `cache.sqlite` file from your project directory.
- `generateTPs()` function now supports an additional algorithm that interfaces with the Chemical Transformation Simulator (CTS). An important advantage of this algorithm is that it supports several abiotic transformation pathway libraries.
- `plotGraph()` generic function. Furthermore, this function can incorporate componentization results to easily display which TPs are present in the screening results.
- `transformationProductsStructure`, is now used to store results for algorithms that provide structural information (`biotransformer`, `library` and `cts`). This better harmonizes the functionality between algorithms (e.g. with the `filter()` method function).
- `plotVenn()`, `plotUpSet()` and `consensus()` methods are now available to compare and combine TP data.
- `convertToMFDB()`/`generateComponentsTPs()` don’t include any columns anymore that are specific to the transformation pathway.
- `?generateTPsBiotransformer()`) for details.
- `patRoon`. A new method function, `normInts()` now supports various normalization methods, such as normalization by internal standards and the TIC. With internal standard normalization, the `plotGraph()` function can be used to interactively evaluate which internal standards were automatically assigned to each feature group.
- `normInts()` function now handles all normalization and stores normalized intensities/areas in the feature data.
- `as.data.table()`, `plotInt()` etc) now have a new `normalized` argument, which should be `TRUE` to use normalized data. The `normFunc` argument to these functions was removed since it is not necessary anymore.
- `normalized=TRUE` and `normInts()` was not called on the feature data, a simple automatic default normalization is done. This is primarily for backwards compatibility.
- `as.data.table()` can now report normalized values for averaged feature data (if (`features` && `average` && `normalized`) == TRUE)
- `removeISTDs` argument for `filter()` to remove feature groups that are assigned as internal standards.
- `norm_conc`) that influences normalization calculations. The `generateAnalysisInfo()` function can now initialize this data.
- `ISTDs` and `ISTDAssignments` slots and their accessor methods `internalStandards()` and `internalStandardAssignments()` to store/access the internal standard assignment data.
- `libMatch`).
- `SMILES`, `InChI` and formulas for e.g. suspect lists was significantly changed. More data is now verified, and several optimizations were implemented to better handle large suspect lists or MS libraries. Note that minor changes in `neutralMass` values may be observed. For more details please see the reference manual (e.g. `?screenSuspects`).
- `filter()` method was defined for the `transformationProducts` class to filter generic properties.
- `calcSims` argument to the `generateTPs` functions: if `TRUE` then structural similarities will be calculated between parents/TPs.
- `library` algorithm now caches its results and supports multiple transformation generations (`generations` argument).
- `reportHTML()` `TPs` argument).
- `generateFormulasSIRIUS()`/`generateCompoundsSIRIUS()`: `projectPath` and `dryRun` arguments. These are mainly for internal use.
- `getEICs()` utility to obtain raw EIC data (suggested by Ricardo Cunha).
- `biotransformer`
    - `calcSims` argument to `TRUE` (see above).
    - `parent_` columns (parent_SMILES, etc)
    - `steps` argument was renamed to `generations` for consistency with other algorithms.
- `library` algorithm: TPs are now named similarly to other algorithms. The library TP names are now stored in the `name_lib` column.
- `onlyLinked` argument from the `plotGraph()` generic. This was done as the new `plotGraph()` methods don’t support this argument. Note that the argument was only removed from the generic; the original `plotGraph()` methods still support the argument.
- `generateCompoundsSIRIUS()`: removed unused `errorRetries` argument
- `topMost` filter applied during MS peak list averaging
- `convertMFDB()` now always collapses duplicates, not just for `biotransformer` results.
- `biotransformer`: `retDir` is now derived from the original parent, i.e. not its direct parent
- `reportHTML()` now properly subscripts negative element counts in formulas
- `reportHTML()`: improve handling of missing or split compound identifiers when generating URL links
- `compounds`: avoid _unset suffixes in mergedBy column from data of sets workflows
- `newProject()`: loading analysis info from CSV now works again on Windows
- `NA` exit codes on Linux systems
- `generateCompoundsSIRIUS()`: `topMost` argument was used where `topMostFormulas` was supposed to be used
- `as.data.table()` method for `featureAnnotations` would throw an error for empty results with `OM=TRUE`
- `removeBlanks` feature groups filter would not handle analyses with multiple blanks.
- `enviPick` optional dependency and added instructions to install from GitHub, as it was removed from CRAN.
- `newProject()` would not add suspect annotation to the output script if the example suspect list or sets workflows were chosen.
- `min_width` was incorrect (PR #31, thanks to @coltonlloyd)
- `installPatRoon()`: improvements in determining what is already installed
- `featureGroups` objects after calling `screenSuspects()` or `unset()`
- `generateAnalysisInfo()`) is now case insensitive (see issue #34 and #43)
- `newProject()`: `reportCSV()` call in generated script included non-existing `MSPeakLists` argument.
- `adduct` argument was specified (Corey Griffith)
- `findFeatures`, `findFeaturesKPIC2`) are now documented on separate pages.
- `featureGroups` are now documented in a separate page (`?feature-plotting`)
- `findFeaturesKPIC2()` and `importFeaturesKPIC2()` now have correct casing (was lower case ‘f’)
- `convertMSFiles()` were not properly verified
- `analysis-information` (issue #33)
- `featureGroups` was not updated when removing groups with `delete()` (except sets workflows)
- `nontarget` an optional dependency and install it from GitHub with CI and in the installation docs (see issue #48)
- `newProject()` ignored group/blank input for sets workflows
- `mz`/`rt` columns are not numeric
- `checkFeatures()`: don’t show multiple rows if a suspect was matched with multiple feature groups. This change removed the option to show specific suspect columns.
- `checkFeatures()` errored if Plot mode was ‘Top most replicates’ or ‘All’
- `topMost` plotting of EICs for sets data
- `plotChroms()`: peak area filling (`showPeakAreas=TRUE`) didn’t work if the peak height exceeded `ylim`
- `screenSuspects()` with sets workflows: don’t warn about set specific suspect data if all data is NA
- `checkFeatures()`/`checkComponents()` now clean up unavailable selections when saving the session
- `reportHTML()`: don’t try to report TP components if no data is available
- `formulasSet` method for `plotSpectrum()`: don’t try to plot a comparison plot for candidates without MS/MS data
- `reportHTML()`: don’t try to plot a comparison plot for formula candidates without MS/MS data

This release adds a significant amount of new functionality and changes. Please see the updated Handbook and sections below for more information.

Users of previous `patRoon` versions should inform themselves about the important changes highlighted in the next section. Furthermore, it is highly recommended to remove any cached data, i.e. by running `clearCache("all")` or manually removing the `cache.sqlite` file from your project directory.
- `exportedData` to `loadRawData`.
- `mzWindow` and `EICMzWindow` arguments were renamed to `mzExpWindow` / `EICMzExpWindow` and now have a slightly different meaning (please see the reference manual).
- `minFWHM`/`maxFWHM` defaults lowered for `findFeatures` and feature optimization.
- `useGGPlot2` argument) is removed (not often used and a maintenance burden).
- `precursor` argument to the `plotSpectrum()`, `annotatedSpectrum()` and `plotScores()` methods for `formulas` now expects the neutral formula instead of the ionized formula. This change was done for consistency with compound annotations and sets workflows.
- `featThreshold`.
- `featThresholdAnn`, only takes annotated features into account.
- `featThreshold` is now `0`, for `featThresholdAnn` it is the same as the previous default for `featThreshold`.
- `analysis` column to `analysis_from` and added `analyses` column that lists all analyses from which the consensus was made.
- `formulas` and `compounds` classes now derive from the new `featureAnnotations` class. Most of the functionality common to formulas/compounds is defined for this class.
- `maxFormulas`/`maxFragFormulas` arguments for `as.data.table()` were removed, as these don’t make much sense with the new format.
- `elements` filter now applies to neutral formulae for both formula and compound annotations (`fragElements` still applies to the ionized fragment formula).
- `MSPeakLists`. Since all algorithms now require peak lists, `generateFormulas` now has a mandatory `MSPeakLists` argument (similar to `generateCompounds`).
- `ion_formula` (ionized) and `neutral_formula` (neutral) columns. Similarly, the `formula_mz` column was renamed to `ion_formula_mz`.

The most important new functionality in `patRoon 2.0` is the addition of transformation product (TP) screening workflows. This release adds functionality to predict TPs (with `BioTransformer` or metabolic logic) or search TPs in `PubChem` or custom databases. Furthermore, other data such as MS/MS similarity or feature classification data can be used to relate parent/TP features. Other TP screening functionality includes TP prioritization and automatic generation of TP compound databases for `MetFrag` annotation. The workflows follow the classical design of `patRoon`, where flexible workflows can be executed with a combination of established algorithms and new functionality. For more information, please see the dedicated chapter about TP screening in the Handbook.

Another major change in this release is the addition of sets workflows. These workflows are typically used to simultaneously process positive and negative ionization data. Advantages of sets workflows include simplification of data processing, combining positive and negative data to improve e.g. feature annotations and easily comparing features across polarities. A sets workflow requires minimal changes to a ‘classical workflow’, and most of the additional work needed to process both polarities is done automatically behind the scenes. For more information, please see the dedicated chapter about sets workflows in the Handbook.

The following new feature detection/grouping algorithms were integrated: `SIRIUS`, `KPIC2` and `SAFD`. Furthermore, integration with `MetaClean` was implemented for the calculation of peak qualities and machine learning based classification of pass/fail peaks. In addition, the peak qualities are used to calculate peak scores, which can be used for quick assessment and prioritization.

Interactive curation of feature data with `checkChromatograms()` was replaced with `checkFeatures()`, which is much faster, better suited for larger datasets, customizable and has an improved user interface. Furthermore, this tool can be used for training/assessing `MetaClean` models. Similarly, `checkComponents()` is a function that allows interactive curation of component data.

The `delete()` generic function allows straightforward deletion of parts of the workflow data, such as features, components and annotations. Furthermore, this function makes it easy to implement customized filters.

The algorithms of `OpenMS` (`MetaboliteAdductDecharger`) and `cliqueMS` were integrated for additional ways to detect adducts/isotopes through componentization. Furthermore, the new `selectIons()` method uses these annotations to prioritize features (e.g. by only retaining those with preferable adducts). In addition, this function stores the adduct annotations for the remaining feature groups, which can then be automatically used for e.g. formula and compound annotation.
newProject
as.data.table()
normFunc
argument)averageFunc
argument)FCParams
argument)qualities
argument)plotVolcano()
method function to plot fold changes.topMostByRGroup/EICTopMostByRGroup
arguments for plotting/reporting EIC data of only the top most intense replicate(s).reportHTML()
now only plots the EIC of the most intense feature of each replicate group (i.e. EICTopMostByRGroup=TRUE
and EICTopMost=1
).XCMS3
loadRawData
argument for feature grouping and comparison()
...
argument for findFeaturesXCMS3
preGroupParam
to specify grouping parameters used prior to RT alignment (suggested by Ricardo Cunha)XCMS
feature (group) objects are synchronized as much as possible when feature data is changed.show()
and filter()
methods.OpenMS
feature finding: useFFMIntensities
argument to speed up intensity loading (experimental).reportHTML()
now reports general feature information in a separate tab.results
argument to [
(subset) and filter()
to quickly synchronize feature groups between objects (e.g. to quickly remove feature groups without annotation results).plotSpectrum()
to automatically calculate the space necessary for formula annotation texts and candidate structures was improved. Annotation texts are now automatically resized if there is insufficient space, and the maximum size and resolution for candidate structures can be controlled with the maxMolSize
/molRes
parameters.filter()
method for MSPeakLists
: minMSMSPeaks
filter to only retain MSMS peak lists with a minimum number of peaks.filter()
method for MSPeakLists
: annotatedBy
filter to only keep peaks with formula/compound annotations.screenSuspects()
method now supports the amend
argument, which allows combining results of different screenSuspects()
calls (see the Handbook for details).specclust
, which generates components based on hierarchically clustering feature groups with high MS/MS similarities.groupFeatures
: renamed the feat
argument to obj
.reportHTML()
: EICs are shared between tabs to avoid duplicated plottingfeatures
object embedded in featureGroups
objects is now synchronized, and any features not present in any group are removed accordingly. This reduces memory usage and indirectly causes reportCSV()
to only report features still present.plotInt()
: now has xnames
and showLegend
arguments to adjust plotting.[
(subset) and filter()
methods for MSPeakLists
now only re-average peak lists if the new reAverage
argument is set to TRUE
(default FALSE
). This change was mainly done as (1) the effects are usually minor and (2) re-averaging invalidates any formula/compound annotations done prior to filtering.filter()
method for MSPeakLists
: the withMSMS
filter is now applied after all other filters.MetFrag
: the raw unprocessed annotation formulas are now additionally stored in the fragInfo
tables (formula_MF
column).MetFrag
: the precursor ion m/z is now taken from peak list data instead of the feature group to improve annotation.MetFrag
: the useSmiles
parameter is now set to true
as this seems to improve results sometimes.as.data.table()
method for formulas
: if average=TRUE
then all column data that cannot be reasonably averaged are excluded.annotatedPeakList()
: also add annotation columns for missing results (for consistency).minMaxNormalization
argument to the consensus()
method for compounds
was removed (unused).filter()
for formulas
/compounds
: if algorithm consensus results are filtered with scoreLimits
, and a score term exists multiple times for a candidate, only one of the terms needs to fall within the specified limits for the candidate to be kept (was all).plotSpectrum()
for compounds
: plotStruct
is now defaulted to FALSE
.MSPeakLists
data now store an unique identifier for each mass peak in the ID
column. These IDs are used by e.g. formula/compound annotations, and stored in the PLID
column in their fragInfo
data. This replaces the PLIndex
column in fragInfo
data, which was only row based, and therefore invalidated in case peak lists were filtered afterwards.GenFormAdducts()
and MetFragAdducts()
now additionally return adducts in generic format and use cached data for efficiency.err
argument to as.character()
to control if an error or NA
should be returned if conversion fails.
as.adduct()
now removes any whitespace and performs stricter format checks to make conversion more robust.MetaboliteAdductDecharger
) and cliqueMS
.calculateIonFormula()
and calculateNeutralFormula()
now Hill sort their result.
as.data.table()
: Suspect screening specific columns are now prefixed with susp_
.suspFormRank
and suspCompRank
suspect annotation data columns were renamed to formRank
/compRank
(the previous change made prefixing unnecessary).logPath
argument for annotateSuspects()
to specify the file path for log files or to disable logging completely.
fastcluster
for hierarchical clustering.rt
to ret
for consistency.show()
: show unique feature group counts.filter()
: allow negative rtIncrement
values.nontarget
: replaced extraOpts
argument with ...
.nontarget
: store links as character string indices instead of numeric indices.RAMClustR
: moved position of ionization
argument to improve consistency.componentsReduced
class) when filtering/subsetting components, was removed. This system was quite unintuitive and imposed unnecessary limitations. Instead, functions that cannot work after component data is changed (e.g. those specific to intensity clustering) will throw an error if needed.intclust
components are now derived from a general componentsClust
class, which is shared with specclust
components. The common functionality for both algorithms is implemented for this class.show()
methods now print class inheritance treeprogressr
package is not used anymore, thus, it is not necessary to set up progress bars with future
based multiprocessing.newProject
: Moved order of componentization step (now before annotation & suspect screening).xlim
/ylim
was used with plotChroms
then peaks were not always correctly filledretMin
argument to plot()
method for featureGroupsComparison
wasn’t properly used/defaulted.plotSpectrum()
if xlim
is set and this yields no data then an empty plot is shown.plotSpectrum()
automatic ylim
determination was incorrect if only one peak is shown.scoreLimits
filter for formulas could ignore results not obtained with MS/MS data.as.data.table(compounds, fragments=TRUE)
returned empty results for candidates without fragment annotations.topX
arguments for the MSPeakLists
method for filter()
would re-order peak lists, thereby invalidating any annotations.
SIRIUS
‘adduct fragments’.generateMSPeakListsDAFMF()
potentially used wrong DA compound data in case features were filtered.numericIDLevel()
now properly handles NA
values.importFeatureGroupsBrukerTASQ()
: Improved handling of absent analyses in imported results files.[M-H]-
), resulting in ~1.5 mDa deviations[2M+H]+
) and multiple charges (e.g. [M+2H]2+
)RAMClustR
: ensure that columns are the right type if all values are NA.CAMERA
: correctly handle cases when minSize
filter results in zero components.plotGraph()
: improve error handling with empty objects.newProject()
: correctly handle DIA with Bruker MS peak lists.peakgroups
alignment method was used (fixes issue #22)mapply
warning was shown with newProject()
newProject()
: don’t show Remove button in analyses select screen when the script option is selected, as this will not work properly.IPO
: add default limits for OpenMS traceTermOutliers
IPO
optimization fix: integer parameters are properly roundedgenerateFeatureOptPSet("xcms3", method="matchedFilter")
would return a parameter set with step
instead of binSize
(issue #23)newProject()
would generate an ID levels configuration file even when no suspect list was selected.NULL
values that may occasionally be returned by rcdk::get.mcs
reportHTML()
annotation table was paged.analysisInfo
validity may result in an error (reported by Tiago Sobreira)convertMSFiles()
error with dirs=TRUE
(reported by Tiago Sobreira)installPatRoon()
screenSuspects()
did not take onlyHits
into account for cachingscreenSuspects()
: The original suspect name is stored in the name_orig
columnXCMS3
feature group optimization: binSize
and minFraction
values were rounded while they shouldn’t (issue #27)
This release focuses on a significantly changed suspect screening interface, which brings several utilities to assist suspect annotation, prioritization, mixing of suspect and full NTA workflows and other general improvements.
IMPORTANT: The suspect screening interface has changed significantly. Please read the documentation (?screenSuspects
and the handbook) for more details. If you want to quickly update your code without using any new functionality:
Change your existing code, e.g.
scr <- screenSuspects(fGroups, suspectList, ...)
fGroupsScr <- groupFeaturesScreening(fGroups, scr)
to
fGroupsScr <- screenSuspects(fGroups, suspectList, ..., onlyHits = TRUE)
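For scripts that repeat the old two-step pattern in several places, the change can be kept in one spot with a small wrapper. This is only a sketch: screenHitsOnly is a hypothetical helper name (not part of patRoon), and it assumes patRoon is installed by the time the function is actually called.

```r
# Hypothetical convenience wrapper (not a patRoon function): applies the
# replacement call shown above, so only feature groups with suspect hits
# are returned.
screenHitsOnly <- function(fGroups, suspectList, ...) {
    patRoon::screenSuspects(fGroups, suspectList, ..., onlyHits = TRUE)
}

# Usage, assuming fGroups and suspectList already exist in your script:
# fGroupsScr <- screenHitsOnly(fGroups, suspectList)
```

Omitting onlyHits = TRUE keeps feature groups without hits as well, which is what allows mixing suspect and full non-target workflows.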
Major changes
onlyHits=TRUE
). This allows straightforward mixing of suspect and full non-target workflows.screenInfo()
or as.data.table()
methods.suspects
argument to [
, e.g. fGroupsScr[, suspects = "carbamazepine"]
annotateSuspects()
, allows combining the annotation workflow data (peak lists, formulas, compounds) to perform a detailed annotation for the suspects found during the workflow. This method calculates properties such as
filter()
method for suspect screening results, which allows you to easily prioritize data, for instance, by selecting minimum annotation ranks and similarities, identification levels and automatically choosing the best match in case multiple suspects are assigned to one feature (and vice versa).as.data.table()
method and reporting functionality for suspect screening results to quickly inspect their annotation data.?screenSuspects
and ?annotateSuspects
for more information.screenInfo()
).m/z
), these can be included in the suspect list to improve suspect annotation.features
objects was removed. The same and much more functionality can be obtained by the workflow for feature groups.reportCSV()
function was simplified and uses as.data.table()
to generate the CSV data. This should give more consistent results.individualMoNAScore
MetFrag scoring is now enabled by default.Other changes
reportHTML()
now allows toggling visibility for the columns shown in the feature annotation table.plotVenn()
method for featureGroups
now allows comparing combinations of multiple replicate groups with each other. See ?plotVenn
for more information.SIRIUS
binary on macOS
did not work properlyGenForm
resulted in an error (https://github.com/rickhelmus/patRoon/issues/18)plotEIC()
, groups()
and plotSpec()
methods were renamed to plotChroms()
, groupTable()
and plotSpectrum()
. This was done to avoid name clashes with XCMS
and CAMERA
. The old functions still work (with a warning), but please update your scripts as these will be removed in the future.patRoon
now supports an additional method to perform parallelization for tools such as MetFrag
, SIRIUS
etc. The main purpose of this method is to allow you to perform such calculations on external computer clusters. Please see the updated parallelization section in the handbook for more details.logPath
and maxProcAmount
arguments to functions such as generateFormulas
, generateCompounds
etc were removed. These should now solely be configured through package options (see ?patRoon
).patRoon.maxProcAmount
package option was renamed to patRoon.MP.maxProcs
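As a rough illustration of the rename, the option can be set with base R's options(), for instance near the top of a processing script. The value 4 is an arbitrary example, not a recommended setting.

```r
# Old option name, renamed in this release:
# options(patRoon.maxProcAmount = 4)

# New option name; 4 is an arbitrary illustrative value.
options(patRoon.MP.maxProcs = 4)

# Check what is currently configured.
getOption("patRoon.MP.maxProcs")
```

Scripts that still set the old name may need to be updated accordingly.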
.SIRIUS
calculateFeatures=TRUE
would try to calculate formulas for features even if not present (e.g. after being removed by subsetting or filtering steps).
SIRBatchSize
argument to formula and compound generation functions was renamed to splitBatches
and its meaning has slightly changed. Please see the reference manual (e.g. ?generateFormulas
) for more details.generateCompoundsMetfrag
was renamed to generateCompoundsMetFrag
.withOpt()
to temporarily change (patRoon
) package options.printPackageOpts()
: display current package options of patRoon
.OpenMS
: potentially large temporary files are removed when possible to avoid clogging up disk space (especially relevant on some Linux systems where /tmp
is small).XCMS
are not attached by default, which significantly speeds up loading patRoon
(e.g. with library()
).compoundViewer()
function was marked as defunct, as it hasn’t been working for some time and its functionality is largely replaced by reportHTML()
.generateComponentsNontarget()
: update homolog statistics for merged series.checkChromatograms()
: fix error when fGroups
has only one replicate groupconvertMSFiles()
: If algorithm="pwiz"
and vendor centroiding is used then any extra filters are now correctly put after the peakPicking
filter.getXCMSnExp()
is now properly exported and documented.annoTypeCount
score for annotated compounds with PubChemLite is now not normalized by default anymore when reporting results.reportHTML()
now correctly handles relative paths while opening the final report in a browser.componentsNT
: include algorithm data returned by nontarget::homol.search
in homol
slot (suggested by Vittorio Albergamo)convertMSFiles()
fixes (issue #14)
cwt
option is now available for conversion with ProteoWizardfeatures
objectsgenerateCompoundsMetFrag()
: compound names could sometimes be interpreted as dates (reported by Corey Griffith)
SIRIUS
annotation didn’t use set adduct but used default insteadSIRIUS
results are better handled if the chosen adduct is not [M+H]+
or [M-H]-
data.table
objects properly from cache.plotGraph()
didn’t properly handle components without linked series (reported by Vittorio Albergamo)sn
column) (suggested by Ricardo Cunha)exportedData
/verbose
to getXCMSSet()
functions to avoid ambiguitiesgenerateComponentsNontarget()
: allow wider m/z deviation for proper linkage of different series (controlled by absMzDevLink
argument).addAllDAEICs()
sometimes used wrong names for EICsreportPDF()
may report formula annotated spectra of results not present in input featureGroups
data.table
data from cache now calls data.table::setalloccol()
to ensure proper behavior if data.table::set()
is called on cached data.compounds
with useGGPlot2=TRUE
would try to plot formulas for non-annotated peaks (resulting in many extra diagonal lines)reportPDF()
were not properly placed in a grid (as specified by EICGrid
argument)reportHTML()
retMin=TRUE
newProject()
didn’t show polarity selection if only a compound identification algorithm was selected.groupFeaturesXCMS3()
didn’t properly cache results.MSPeakLists
: results for averaged peak lists are now in the same order as the input feature groups
SIRIUS
support
SIRIUS
(configurable with new SIRBatchSize
function argument). This dramatically improves overall calculation times (thanks to Markus Fleischauer for pointing out this possibility!).
generateFormulasSirius()
and generateCompoundsSirius()
are now properly capitalized to generateFormulasSIRIUS()
and generateCompoundsSIRIUS()
SIRIUS
4.4.SIRIUS
output is directly shown on the console.SIRIUS
can be specified with the cores
function arguments.SIRIUS
groupNames()
, analyses()
and similar methods sometimes returned NULL
instead of an empty character
vector for empty objects.plotHeatMap()
with interactive=TRUE
: switch from the now removed d3heatmap
package to heatmaply
reportHTML()
didn’t split PubChem URLs when multiple identifiers were reported.PWizBatchSize
argument for convertMSFiles()
extraOptsRT
/extraOptsGroup
arguments for OpenMS feature grouping to allow custom command line options.importFeatureGroupsBrukerTASQ
plot()
method for featureGroups
now allows drawing legends when colourBy="fGroups"
and sets colourBy="none"
by default, both for consistency with plotEIC()
.newProject()
now uses XCMS3 algorithms instead of the older XCMS interface.xcms
(not xcms3
) could not be subset with zero analyses (which resulted in errors by e.g. unique()
and reportHTML()
). Reported by Corey Griffith.formulas
/compounds
objectsnewProject()
dialogaddTrivialNames
option as it never worked very well.reportHTML()
: only components with reported feature groups are now reported.m/z
values. Instead, suspect lists can contain SMILES, InChI or neutral mass values which are used for automatic ion m/z
calculation. See ?screenSuspects
for more details.consensus()
newProject()
UI only showed part of the rows.
addFormulaScoring()
function now uses a different algorithm to calculate formula scores for compound candidates. The score is now based on the actual formula ranking in the provided formulas
object, and is fixed between zero (no match) and one (best match).convertMSFiles
correctly checks if input existsmaxProcAmount
(i.e. number of parallel processes) now defaults to the number of physical cores instead of the total number of CPU threads.
batchSize
to 8
for GenForm formula calculation.plot()
for featureGroups
can now highlight unique/shared features across replicates (suggested by V Albergamo)plotGraph()
concs
option for generateAnalysisInfo()
to set concentration datafeatureGroupsComparison
can be customized (useful for e.g. plotting)%>%
)topMost
argument for GenForm formula calculation.ref
to blank
. Similarly, the refs
argument to generateAnalysisInfo()
is now called blanks
.reportMD()
is renamed to reportHTML()
filter()
method for formulas
: minExplainedFragPeaks
is now called minExplainedPeaks
screenTargets
and its targets
parameter have been renamed to screenSuspects()
/ suspects
groups()
and as.data.table()
methods for featureGroups
: optionally consider feature areas instead of peak intensities.plotSilhouettes()
method for compoundsCluster
rGroups
argument to subset operator for featureGroups
to subset by replicate groups (equivalent to rGroups
argument to filter()
).GenForm
formula calculation with MSMode="both"
(the default): instead of repeating calculations with and without MS/MS data and combining the data, it now simply does either of the two depending on MS/MS data availability. The old behavior turned out to be redundant, hence, calculation is now a bit faster.GenForm
now perform precursor isolation to clean up MS1 data prior to formula calculation. During this step, any mass peaks that are unlikely to be part of the isotopic pattern of the feature are removed, as these would otherwise penalize the isotopic scoring. As a result, isotopic scoring is now dramatically improved. This filter step is part of new filter functionality for MSPeakLists
, see ?MSPeakLists
and ?generateFormulas
for more information.?formulas
).consensus()
(absMinAbundance
and relMinAbundance
)MetFrag
: for-ident database and new statistical scores are now supportedas.data.table()
/ as.data.frame()
for featureGroups
now optionally reports regression information, which may be useful for quantitative purposes. This replaces the (defunct) regression()
method and limited support from screenTargets()
.plotGraph()
method to visually inspect linked homologous series.newProject()
(e.g. loading of example data).reportMD()
: most time consuming plots are now cached. Hence, re-reporting should be significantly faster now.
convertMSFiles()
now (optionally) takes analysis information (anaInfo
) for file input.convertMSFiles()
now supports Bruker DataAnalysis as conversion algorithm (replaces now deprecated exportDAFiles()
function).MSFileFormats()
function to list supported input conversion formats.generateAnalysisInfo()
now recognizes more file formats. This is mainly useful so its output can be used with convertMSFiles()
.convertMSFiles()
now has the centroid
argument to more easily perform centroiding.newProject()
:
withMSMS
filter for MS peak lists.importFeatures()
generic functionscore
column of MetFrag results stays correct.reportPDF()
/reportMD()
now report only the top 5 candidate compounds by default (controlled by compoundsTopMost
argument).plotSpec()
now displays subscripted formulaefilter()
methods for features
and featureGroups
. Please carefully read the updated documentation for these methods! (i.e. ?`filter,features-method`
and ?`filter,featureGroups-method`
).
featureGroups
method was adjusted, notably to improve reliability of blank filtration. Again, please see ?`filter,featureGroups-method`
.mzDefectRange
argument)maxReplicateIntRSD
argument)absMinFeatures
and relMinFeatures
arguments).preAbsMinIntensity
and preRelMinIntensity
arguments)newScript()
has been updated and supports more filter types.repetitions
argument is not needed anymore for the new algorithm and has been removed.Inf
values now should be used to specify no maximum for range filters (was -1
).annotatedPeakList()
method for formulas
and compounds
. Also used by reportMD
for improved annotation peak tables.maxRtMSWidth
and precursorMzWindow
)generateComponentsNontarget
, generateComponentsRAMClustR
, generateCompoundsSirius
, generateFormulasGenForm
, generateFormulasSirius
, generateMSPeakListsDA
, generateMSPeakListsMzR
, importFeatureGroupsBrukerPA
maxRtMSWidth
argument to generateMSPeakListsDA
, generateMSPeakListsMzR
(now maxMSRtWindow
) now specifies a retention time window (+/- the feature retention time) instead of the total retention time width around a feature. Hence, current input values should be halved.
minSize
and relMinReplicates
(replaces ubiquitous
for CAMERA) arguments. Note that their defaults may filter out (feature groups from) components. See their documentation for more info.patRoon.path.metFragCL
to patRoon.path.MetFragCL
. The old name still works for backward compatibility.
?generateCompounds
.topMostFormulas
argument for SIRIUS compound generation.reportPDF()
/reportMD()
now report only the top 5 candidate formulae by default (controlled by formulasTopMost
argument).verifyDependencies()
function to let the user verify if external tools can be found.dirs
argument to convertMSFiles()
was slightly changed: if TRUE
(the default) the input can either be paths to analyses files or to directories containing the analyses files.featureGroups
method for plot()
.reportMD()
: Don’t plot Chord if <3 (non-empty) replicate groups are available.filter()
methods now support negation by negate
argument.reportMD()
: added table with annotated fragments for compounds/formulasconsensus()
updates
consensus()
methods now support extracting unique data. This also replaces the unique()
method that was defined for featureGroupsComparison
.comparison()
now automatically determines object names from algorithm (consistency with consensus()
method for other objects).plotVenn()
and plotUpSet()
methods to compare different compounds or formulas objects.filter()
method for components.MSMode="msms"
, now needs adduct
argument.adduct
class. This means that generateCompounds()
and generateFormulas()
now expect slightly differing arguments. Please see their manual pages.clearCache()
now supports removal of caches via regular expressions.topMost
and extraOpts
arguments for SIRIUS formula/compound generation.filter()
method for compounds now supports generic scoring filtering and filtering on elements of precursor and fragment formulae.
?generateCompounds
for more details (notably the Scorings section).plotSpec()
pruneMissingPrecursorMS
option in ?generateMSPeakLists
).retainPrecursorMSMS
function arguments, see ?MSPeakLists
and ?generateMSPeakLists
).algorithm()
and as.data.table()/as.data.frame()
methods. The latter replaces and enhances the makeTable()
(formulas
class) and groupTable()
(featureGroups
class) methods.R
to C++
: significantly reduces time required for grouping large amount of features.revertDAAnalyses()
function: brings back set of Bruker analyses to their unprocessed state.doFMF
behaviour for DataAnalysis feature finding.formula
and formulaConsensus
classes are now merged: there is no need to call consensus()
anymore after generateFormulas()
.calculateFeatures=FALSE
). This can greatly speed up calculation, especially with many analyses.
filter()
and as.data.table()
/as.data.frame
methods bring new functionalities related to filtering, extracting data and performing several processing steps commonly performed for organic matter (OM) characterization.frag_neutral_formula
column). This ensures correct comparison when a consensus is made.reportCSV()
now splits formulas for each feature group in separate CSV files (similar to compounds
reporting).reportPDF()
now actually includes formula annotations in annotated compound spectra when formulas are specified.file
argument for clearCache()
generateMSPeakListsX
where X is the algo).generateCompounds()
and plotting functionality now uses averaged group peak lists instead of peak list of most intense analysis.plotSpec()
method for MSPeakLists: plot (non-annotated) MS and MS/MS spectra.maxRtMSWidth
argument used for peak list generation.maxRtMSWidth
argument for mzR peak list generation had no effect.addAllDAEICs()
function.mzWidth
argument of addDAEIC()
to mzWindow
.convertMSFiles
: changed interface with more options, parallelization and ProteoWizard support.getXcmsSet()
is renamed to getXCMSSet()
findFeatures()
/ groupFeatures()
nintersects
default for plotUpSet so that all intersections are plotted by default.features
class objects now store number of isotopes found for each feature.fGroups <- fGroups[, groupNames(compounds)]
kableExtra
package) that may cause memory leakage when reportMD()
is called repeatedly.