NEWS.md
- reAverage = TRUE was not handled correctly for the delete() method for MSPeakListsSet
- report(): correctly handle removed suspect hits while reporting TP similarities (reported by Alessia Ore)
- checkFeatures() and checkComponents() to see whether a feature or component is marked to be removed (pull request #117 as suggested by Leon Saal)
- loadMSLibrary(): don’t always set Ion_mode of records to positive and guess missing Ion_mode data (issue #119)
- generateFormulasGenForm() topMost was not considered for cached results
- MSPeakLists
- filter() method of formulas/compounds could in rare cases not be applied correctly for consensus and/or sets results
- featuresSets method for groupFeatures with (updated) featuresSets algorithm specific groupFeatures methods (groupFeaturesOpenMS etc), so that the latter can be called directly in sets workflows (#128)
- groupQualityRange filter would always remove all feature groups (reported by Geert Franken)
- generateTPsLibrary()/generateTPsLibraryFormula() any TPs that are equal to the parent and are from a generation>1 are now removed
- generateCompoundsSIRIUS() documentation that formula candidates without structure assignment are omitted (suggested by Nienke Meekel)
- removeTPIsomers filter for transformationProductsStructure didn’t actually apply the removeDuplicates filter.
- updateScores=TRUE for the methods addFormulaScoring(), predictRespFactors() and predictTox() then NaN scores could be introduced if the formulaScore is zero.
- updateScores=TRUE for the method addFormulaScoring() then the score would be updated twice.
- generateTPsLibrary()/generateTPsLibraryFormula() specifying >1 generation did not result in additional TP searches if parents!=NULL
- report() would error with components generated by CAMERA (reported by Nienke Meekel)
- as.data.table() method for featureGroups: normConcToTox argument was ignored (not for featureGroupsScreening)
- generateAnalysisInfo(): try to equalize the output and input directory order
- data.frame, features and featureGroups class as: getTICs(), getBPCs(), plotTICs() and plotBPCs().
- ion_formula_mz column instead of taking it from SIRIUS data, as it rarely may not be available (issue #111)
- MS2QuantMeta slots

When updating to this release, it is important to remove any cached data, i.e. by running clearCache("all") or manually removing the cache.sqlite file from your project directory.
- predictRespFactors() could incorrectly cache results and perform concentration conversions twice for calibrants under some circumstances (reported by Drew Szabo)
- verifyDependencies() could throw errors when external dependencies were not found.
- predictRespFactors()/predictTox(): better handle objects without results
- MS2Quant is now stored in the MS2QuantMeta slots (suggested by Drew Szabo)
- calculateTox()/calculateConcs(): only consider relevant feature annotations
- calculateConcs(): avoid warnings when there are no feature groups
- as.data.table() methods for featureGroups/featureGroupsScreening:
  - features==TRUE and collapseSuspects=NULL
  - features==TRUE and/or collapseSuspects=NULL
  - replicate_groups with incorrect data was included with features==TRUE and average==TRUE
  - features==TRUE and average==TRUE
- collapseSuspects argument for as.data.table() method for suspect screening results

This release adds significant new functionality, several important changes and several bug fixes thanks to user feedback.
Users of previous patRoon versions should familiarize themselves with the important changes highlighted in the next section. Furthermore, it is important to remove any cached data, i.e. by running clearCache("all") or manually removing the cache.sqlite file from your project directory.
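A minimal sketch of the two cache-clearing routes mentioned above, assuming the project directory is known. The removeCacheFile() helper is hypothetical and written only for this example (it is not part of patRoon); clearCache("all") is the call named in these notes.

```r
# Hypothetical helper, not part of patRoon: delete the cache file by hand.
# Returns TRUE if a cache file was found and removed, FALSE otherwise.
removeCacheFile <- function(projectDir = ".") {
    cacheFile <- file.path(projectDir, "cache.sqlite")
    if (!file.exists(cacheFile))
        return(FALSE)
    file.remove(cacheFile)
}

# From an R session in the project directory, either let patRoon clear its cache:
#   patRoon::clearCache("all")
# or remove the file manually:
#   removeCacheFile(".")
```

Either route should be sufficient on its own; the manual route is mainly useful when patRoon is not loaded.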
This release adds new ways to install and update patRoon and its dependencies. The most important changes are:
- patRoon bundles: these are standalone installations of R, patRoon and its R package dependencies, and all other external dependencies such as MetFrag, OpenJDK, OpenMS etc. This is primarily useful for users not familiar with R or wanting to quickly try a new patRoon release.
- patRoon and its dependencies automatically, which prevents the need to manually install packages from different sources (BioConductor, GitHub etc). Furthermore, this package also installs patRoonExt, another auxiliary package that bundles most external dependencies (e.g. MetFrag, PubChemLite, OpenMS).
- patRoon_install script is now replaced by the above installation methods, and is therefore considered deprecated, will be removed in the future and should therefore not be used anymore.

For more information, please read the updated installation chapter in the handbook, and see the project pages of patRoonInst and patRoonExt.
IMPORTANT If you installed patRoon via the legacy installation script, please read the installation chapter to disable/remove this installation prior to updating patRoon!
The second milestone of this release is the integration of the MS2Tox and MS2Quant R packages, which support machine learning approaches to predict the toxicity and concentration of features. The integration adds the following functionality to patRoon:
- SMILES.
- as.data.table() function and reporting interface were updated to inspect the predicted toxicities/concentrations.

Please see the relevant section in the handbook, and the project pages of MS2Tox and MS2Quant for more details.
- loadMSLibrary(): improve compatibility with more .msp flavors (issue #72).
- newProject(): save/load parameters to reproduce subsequent project creations (issue #61)
- groupFeaturesOpenMS(): now supports handling large numbers of analyses on Windows (reported by Geert Franken, fix thanks to https://github.com/OpenMS/OpenMS/issues/6845).
- patRoon.checkCentroided to control whether analyses files are verified to be centroided (suggested by Geert Franken).
- overlap(): the which param can now also be a list to compare groups of replicate groups (similar to plotVenn())
- consensus() method for featureGroups: new verifyAnaInfo flag to optionally skip verifying that the analysis information is equal for all compared objects. This is mainly useful when the data is the same but in different formats.
- genReportSettingsFile(): baseFrom argument to update old report settings files.
- generateFormulasSIRIUS(): new getFingerprints and token arguments to download CSI:FingerID fingerprints for formula candidates. This was primarily implemented to support calculating toxicities/concentrations from formula annotations.
- report() now correctly handles SIRIUS compounds results and suspects without SMILES
- bslib 0.5.0 (reported by Alessia Ore)
- annotatedBy filter for MSPeakLists could remove precursor peaks in MS/MS data regardless if retainPrecursorMSMS=TRUE
- report(): The suspect(s) column for compound annotation results was always empty
- reAverage argument was ignored by the filter() method of MSPeakLists when checking if cached data is available (issue #87)
- reAverage=TRUE to the filter() method of MSPeakLists then the peak IDs were not regenerated (issue #87)
- plotSpectrum() method for formulas/compounds didn’t expand plot height for formula annotations with only one mass peak
- annotatedPeakList()/plotSpectrum() methods for compounds didn’t label mass peaks with compounds algorithm if formulas were provided but no formula candidate was present.
- generateFormulas() generic definition had wrong argument order
- getSIRIUSToken() resulted in errors if the password input was cancelled.

This release adds significant new functionality, several important changes and many bug fixes thanks to user feedback.
Users of previous patRoon versions should familiarize themselves with the important changes highlighted in the next section. Furthermore, it is important to remove any cached data, i.e. by running clearCache("all") or manually removing the cache.sqlite file from your project directory.
The most significant change in this release is the addition of redesigned reporting functionality. Some key functionality and changes:
- html interface was completely redesigned to provide a modern, responsive and easier to use interface, which is powered by the bslib and reactable R packages.
- SVG vector graphics, which are generally smaller in size, faster to create and can be zoomed in without loss of quality.
- YAML file.

An example can be seen from the report output of the tutorial.
The new reporting interface is used with the new report() method function. All the documentation was updated to reflect these changes. The now ‘legacy’ report interface (reportCSV(), reportPDF() and reportHTML()) is still available for backwards compatibility and may still be of interest as the new interface currently only supports the HTML format.
The new reporting functionality has obviously not yet undergone years of usage and feedback. Hence, please report any bugs and suggestions you may have!
Active logins are now necessary to use webservices such as CSI:FingerID (see e.g. https://boecker-lab.github.io/docs.sirius.github.io/account-and-license/). This release of patRoon adds support to make logging in easier and adds several compatibility fixes for the latest SIRIUS version. The new utility function getSIRIUSToken() can be used to obtain the necessary login token. The new token argument for generateCompoundsSIRIUS() can be used to automatically log in. The newProject() function was extended to use this new functionality.
The Docker images are now served by the GitLab server of the University of Amsterdam. To pull the latest images you can run the following command:

    docker pull uva-hva.gitlab.host:4567/r.helmus/patroon/patroonrs

The changes are reflected in the installation section of the handbook.
A new algorithm for generateTPs() was added: library_formula. This algorithm is similar to the library algorithm, but only works with chemical formulae. This is especially useful if only formula data is available for parents and/or TPs. The genFormulaTPLibrary() utility function can be used to automatically generate a formula library from given transformation rules. More information can be found in the updated handbook and reference manual (?generateTPsLibraryFormula).
Other important changes include:
- topMost and onlyPresent, are now combined in a parameter list. The parameter list is specified with the new EICParams argument to functions such as plotChroms() and report(). A list with default parameter values is generated with the getDefEICParams() function. More information can be found in the reference manual: ?EICParams.
- plotChroms() method for featureGroups was changed.
- makeSet() method for featureGroups (and related functions adducts() and selectIons()): the original set specific feature groups are now combined to create the final feature groups, instead of grouping features from all sets at once. This prevents rare cases where features with different adduct assignments in the same set would be grouped together (i.e. if their neutral mass would be the same). Note that this change probably will produce slightly different results. This change required the addition of a new slot annotationsChanged to featureGroupsSet for internal usage by the adducts()<- method.
- setAvgSpecificScores arguments of generateFormulas()/generateCompounds() to TRUE.
- neutralChemProps/neutralizeTPs arguments. Whether a structure was neutralized is marked by the new molNeutralized column.
- neutralizeTPs is set and a neutralization of a TP results in a duplicate structure (i.e. in case the algorithm also generated the neutral form of the TP) then the neutralized TP is removed.
- newProject(): added possibility to exclude analyses out of folder (issue #60, #63)
- as.data.table() for featureGroups
- regression=TRUE: add column with p values
- features=TRUE: add replicate group column
- plotChroms(): analysis, groupName and intMax arguments
- delete() method function was added to modify MS peak lists
- thrMS, thrMSMS, thrComb and maxCandidates arguments, which can be used to tweak calculations for features with many candidates, e.g., to limit calculation times.
- and keyword in the YAML configuration file. This is especially useful when combined with the or keyword.
- adduct_abundance column
- plotInt() method for components: index argument can now also be component name
- generateTPsCTS(): support new PFAS libraries (set "pfas_environmental" or "pfas_metabolism" as the transLibrary argument).
- generateTPsLibrary(): the matchParentsBy argument now also accepts "formula" and "name".
- retDir column that specifies the retention time direction of the TP compared to its parent (alternative to specifying log P values).
- matchGenerationsBy to the library (and library_formula) algorithm for generateTPs(), which controls how parents/TPs are matched when searching multiple transformation generations.
- maxExpGenerations argument to generateTPsBiotransformer to avoid excessive TP hierarchy expansions.
- generateComponentsTPs() the formulaDiff column now splits elemental losses and gains, similar as the plotGraph() method already did for TPs.
- delete() method function was added to modify transformationProducts
- plotInt() methods: plotArgs and linesArgs to pass additional arguments to plot()/lines(). The latter replaces the dots argument.
- plotGraph() methods
- width and height arguments.
- transformationProductsStructure now draw structures in SVG format to improve quality
- clearCache(): vacuum option to speed up clearing large cache files.
- as.data.table() for featureGroups with regression=TRUE: treat missing features as NA
- plotChord() method for featureGroups: significantly optimized some old code
- plotChroms() / EIC loading
- plotScores()
- plotSpectrum() for sets workflows better handles missing data from one or more sets when making a comparison, which avoids empty plots in such cases.
- annotatedPeakLists(), especially with sets workflows.
- specSimParams argument replaces the relMinMSMSIntensity and simMSMSMethod arguments.
- annotateSuspects(): log if the suspect formula/compound data could not be matched with feature annotations
- formulaDiff column in TP component results was changed to more easily identify elemental losses/gains.
- fromTPs slot was added to TP components and is TRUE if a transformationProducts object was used during componentization. This is primarily for internal use.
- plotGraph() methods: show empty plot instead of throwing an error if results are empty
- prefCalcChemProps=FALSE
- reportHTML()
- EICOnlyPresent argument to reportHTML() is effective again
- reportHTML() could show plots of wrong TP results
- newProject()
- newProject() for sets mode used c() instead of list() to specify positive+negative suspect lists to screenSuspects()
- rstudioapi::getSourceEditorContext() (issue #62)
- newProject() used wrong variable name for suspect list under some conditions (issue #69)
- analysis.csv already exists when needed
- norm_conc field for analysis information was ignored (reported by Geert Franken)
- checkFeatures()/checkComponents(): disabling a feature/featureGroup in a sorted table would lead to wrong selections
- selectIons() does not throw an error anymore if there is no suitable adduct/isotope information in the given components, which would result in incorrect behavior with sets mode if e.g. no annotations were found for one of the sets.
- predictCheckFeaturesSession() marked passing peaks to be removed instead of the other way around (issue #59)
- selectIons() didn’t properly handle empty components objects
- chromPeaks() from xcms was sometimes not found (issue #68)
- calculatePeakQualities() would throw errors for empty feature results (reported by Louise Malm)
- plotChroms() / EIC loading: group rectangle with topMost set didn’t consider retention times and intensities from other features
- traceSNRFiltering argument could not be set for findFeaturesOpenMS()
- plotChroms() now better supports plotting chromatograms for analysis without feature data (i.e. when onlyPresent=FALSE) in sets workflows. This now correctly works for cases where a feature is completely absent in a set.
- filter() method for MSPeakLists where precursor isolation (isolatePrec argument) also applied to MS/MS data (issue #56).
- generateCompoundsMetFrag() didn’t properly detect changes in local database files when considering cached data.
- MSPeakLists without any results could lead to errors.
- scoreTypes slot could contain scorings not actually used, e.g. if the scoreTypes argument to generateCompoundsMetFrag() contained scorings not actually present in the used database.
- addFormulaScoring(): updateScore argument was ignored and treated always as TRUE
- annotateSuspects() (issue #54). Suspects higherThanNext setting from estimated ID level 4a
- screenSuspects() would fail if the adduct column contains partially NA data.
- unset() for featureGroupsScreeningSet resulted in loss of group quality scores and internal standard assignments
- annotateSuspects()
- generateCompoundsLibrary() if the library did not contain peak formula annotations.
- fragments_formula was not split per set in the screening results.
- convertMSFiles() if the analysisInfo argument is set and outPath is set with a length >1 then the wrong output path could be used.

This release extends version 2.0 with new functionality, several important changes and bug fixes. The newProject() function was updated for the new functionality. Please see the updated Handbook and sections below for more information.
Users of previous patRoon versions should familiarize themselves with the important changes highlighted in the next section. Furthermore, it is highly recommended to remove any cached data, i.e. by running clearCache("all") or manually removing the cache.sqlite file from your project directory.
- generateTPs() function now supports an additional algorithm that interfaces with the Chemical Transformation Simulator (CTS). An important advantage of this algorithm is that it supports several abiotic transformation pathway libraries.
- plotGraph() generic function. Furthermore, this function can incorporate componentization results to easily display which TPs are present in the screening results.
- transformationProductsStructure, is now used to store results for algorithms that provide structural information (biotransformer, library and cts). This better harmonizes the functionality between algorithms (e.g. with the filter() method function).
- plotVenn(), plotUpSet() and consensus() methods are now available to compare and combine TP data.
- convertToMFDB()/generateComponentsTPs() don’t include any columns anymore that are specific to the transformation pathway.
- ?generateTPsBiotransformer()) for details.
- patRoon. A new method function, normInts() now supports various normalization methods, such as normalization by internal standards and the TIC. With internal standard normalization, the plotGraph() function can be used to interactively evaluate which internal standards were automatically assigned to each feature group.
- normInts() function now handles all normalization and stores normalized intensities/areas in the feature data.
- as.data.table(), plotInt() etc) now have a new normalized argument, which should be TRUE to use normalized data. The normFunc argument to these functions was removed since it is not necessary anymore.
- normalized=TRUE and normInts() was not called on the feature data, a simple automatic default normalization is done. This is primarily for backwards compatibility.
- as.data.table() can now report normalized values for averaged feature data (if (features && average && normalized) == TRUE)
- removeISTDs argument for filter() to remove feature groups that are assigned as internal standards.
- norm_conc) that influences normalization calculations.
  The generateAnalysisInfo() function can now initialize this data.
- ISTDs and ISTDAssignments slots and their accessor methods internalStandards() and internalStandardAssignments() to store/access the internal standard assignment data.
- libMatch).
- SMILES, InChI and formulas for e.g. suspect lists was significantly changed. More data is now verified, and several optimizations were implemented to better handle large suspect lists or MS libraries. Note that minor changes in neutralMass values may be observed. For more details please see the reference manual (e.g. ?screenSuspects).
- filter() method was defined for the transformationProducts class to filter generic properties.
- calcSims argument to the generateTPs functions: if TRUE then structural similarities will be calculated between parents/TPs.
- library algorithm now caches its results and supports multiple transformation generations (generations argument).
- reportHTML()
- TPs argument).
- generateFormulasSIRIUS()/generateCompoundsSIRIUS(): projectPath and dryRun arguments. These are mainly for internal use.
- getEICs() utility to obtain raw EIC data (suggested by Ricardo Cunha).
- biotransformer
- calcSims argument to TRUE (see above).
- parent_ columns (parent_SMILES, etc)
- steps argument was renamed to generations for consistency with other algorithms.
- library algorithm: naming of TPs is similarly done as other algorithms. The library TP names are now stored in the name_lib column.
- onlyLinked argument from the plotGraph() generic. This was done as the new plotGraph() methods don’t support this argument. Note that the argument was only removed from the generic, the original plotGraph() methods still support the argument.
- generateCompoundsSIRIUS(): removed unused errorRetries argument
- topMost filter applied during MS peak list averaging
- convertMFDB() now always collapses duplicates, not just for biotransformer results.
- biotransformer: retDir is now derived from the original parent, i.e. not its direct parent
- reportHTML() now properly subscripts negative element counts in formulas
- reportHTML() improve handling of missing or split compound identifiers when generating URL links
- compounds: avoid _unset suffixes in mergedBy column from data of sets workflows
- newProject(): loading analysis info from CSV now works again on Windows
- NA exit codes on Linux systems
- generateCompoundsSIRIUS(): topMost argument was used where topMostFormulas was supposed to be used
- as.data.table() method for featureAnnotations would throw an error for empty results with OM=TRUE
- removeBlanks feature groups filter would not handle analyses with multiple blanks.
- enviPick optional dependency and added instructions to install from GitHub, as it was removed from CRAN.
- newProject() would not add suspect annotation to the output script if the example suspect list or sets workflows were chosen.
- min_width was incorrect (PR #31, thanks to @coltonlloyd)
- installPatRoon() improvements in determining what is already installed
- featureGroups objects after calling screenSuspects() or unset()
- generateAnalysisInfo()) is now case insensitive (see issue #34 and #43)
- newProject(): reportCSV() call in generated script included non-existing MSPeakLists argument.
- adduct argument was specified (Corey Griffith)
- findFeatures, findFeaturesKPIC2) are now documented on separate pages.
- featureGroups are now documented in a separate page (?feature-plotting)
- findFeaturesKPIC2() and importFeaturesKPIC2() now have correct casing (was lower case ‘f’)
- convertMSFiles() were not properly verified
- analysis-information (issue #33)
- featureGroups was not updated when removing groups with delete() (except sets workflows)
- nontarget an optional dependency and install it from GitHub with CI and in the installation docs (see issue #48)
- newProject() ignored group/blank input for sets workflows
- mz/rt columns are not numeric
- checkFeatures(): don’t show multiple rows if a suspect was matched with multiple feature groups. This change removed the option to show specific suspect columns.
- checkFeatures() errored if Plot mode was ‘Top most replicates’ or ‘All’
- topMost plotting of EICs for sets data
- plotChroms(): peak area filling (showPeakAreas=TRUE) didn’t work if the peak height exceeded ylim
- screenSuspects() with sets workflows: don’t warn about set specific suspect data if all data is NA
- checkFeatures()/checkComponents() now cleanup unavailable selections when saving the session
- reportHTML(): Don’t try to report TP components if no data is available
- formulasSet method for plotSpectrum(): don’t try to plot a comparison plot for candidates without MS/MS data
- reportHTML() don’t try to plot a comparison plot for formula candidates without MS/MS data

This release adds a significant amount of new functionality and changes. Please see the updated Handbook and sections below for more information.
Users of previous patRoon versions should familiarize themselves with the important changes highlighted in the next section. Furthermore, it is highly recommended to remove any cached data, i.e. by running clearCache("all") or manually removing the cache.sqlite file from your project directory.
- exportedData to loadRawData.
- mzWindow and EICMzWindow arguments were renamed to mzExpWindow / EICMzExpWindow and now have a slightly different meaning (please see the reference manual).
- minFWHM/maxFWHM defaults lowered for findFeatures and feature optimization.
- useGGPlot2 argument) is removed (not often used and a maintenance burden).
- precursor argument to the plotSpectrum(), annotatedSpectrum() and plotScores() methods for formulas now expects the neutral formula instead of the ionized formula. This change was done for consistency with compound annotations and sets workflows.
- featThreshold.
- featThresholdAnn, only takes annotated features into account.
- featThreshold is now 0, for featThresholdAnn it is the same as the previous default for featThreshold.
- analysis column to analysis_from and added analyses column that lists all analyses from which the consensus was made.
- formulas and compounds classes now derive from the new featureAnnotations class. Most of the functionality common to formulas/compounds are defined for this class.
- maxFormulas/maxFragFormulas argument for as.data.table() were removed, as these don’t make much sense with the new format.
- elements filter now applies to neutral formulae for both formula and compound annotations (fragElements still applies to the ionized fragment formula).
- MSPeakLists. Since all algorithms now require peak lists, generateFormulas now has a mandatory MSPeakLists argument (similar to generateCompounds).
- ion_formula (ionized) and neutral_formula (neutral) columns. Similarly, the formula_mz column was renamed to ion_formula_mz.

The most important new functionality in patRoon 2.0 are transformation product (TP) screening workflows. This release adds functionality to predict TPs (with BioTransformer or metabolic logic) or search TPs in PubChem or custom databases. Furthermore, other data such as MS/MS similarity or feature classification data can be used to relate parent/TP features.
Other TP screening functionality includes TP prioritization and automatic generation of TP compound database for MetFrag annotation. The workflows follow the classical design of patRoon, where flexible workflows can be executed with a combination of established algorithms and new functionality. For more information, please see the dedicated chapter about TP screening in the Handbook.
Another major change in this release is the addition of sets workflows. These workflows are typically used to simultaneously process positive and negative ionization data. Advantages of sets workflows include simplification of data processing, combining positive and negative data to improve e.g. feature annotations and easily comparing features across polarities. A sets workflow requires minimal changes to a ‘classical workflow’, and most of the additional work needed to process both polarities is done automatically behind the scenes. For more information, please see the dedicated chapter about sets workflows in the Handbook.
The following new feature detection/grouping algorithms were integrated: SIRIUS, KPIC2 and SAFD. Furthermore, integration with MetaClean was implemented for the calculation of peak qualities and machine learning based classification of pass/fail peaks. In addition, the peak qualities are used to calculate peak scores, which can be used for quick assessment and prioritization.
Interactive curation of feature data with checkChromatograms() was replaced with checkFeatures(), which is much faster, is better suited to larger datasets, is customizable and has an improved user interface. Furthermore, this tool can be used for training/assessing MetaClean models. Similarly, checkComponents() is a function that allows interactive curation of component data.
The delete() generic function allows straightforward deletion of parts of the workflow data, such as features, components and annotations. Furthermore, this function makes it easy to implement customized filters.
The algorithms of OpenMS (MetaboliteAdductDecharger) and cliqueMS were integrated for additional ways to detect adducts/isotopes through componentization. Furthermore, the new selectIons() method uses these annotations to prioritize features (e.g. by only retaining those with preferable adducts). In addition, this function stores the adduct annotations for the remaining feature groups, which can then be automatically used for e.g. formula and compound annotation.
- newProject
- as.data.table()
- normFunc argument)
- averageFunc argument)
- FCParams argument)
- qualities argument)
- plotVolcano() method function to plot fold changes.
- topMostByRGroup/EICTopMostByRGroup arguments for plotting/reporting EIC data of only the top most intense replicate(s).
- reportHTML() now only plots the EIC of the most intense feature of each replicate group (i.e. EICTopMostByRGroup=TRUE and EICTopMost=1).
- XCMS3
- loadRawData argument for feature grouping and comparison()
- ... argument for findFeaturesXCMS3
- preGroupParam to specify grouping parameters used prior to RT alignment (suggested by Ricardo Cunha)
- XCMS feature (group) objects are synchronized as much as possible when feature data is changed.
- show() and filter() methods.
- OpenMS feature finding: useFFMIntensities argument to speed up intensity loading (experimental).
- reportHTML() now reports general feature information in a separate tab.
- results argument to [ (subset) and filter() to quickly synchronize feature groups between objects (e.g. to quickly remove feature groups without annotation results).
- plotSpectrum() to automatically calculate the space necessary for formula annotation texts and candidate structures was improved. Annotation texts are now automatically resized if there is insufficient space, and the maximum size and resolution for candidate structures can be controlled with the maxMolSize/molRes parameters.
- filter() method for MSPeakLists: minMSMSPeaks filter to only retain MSMS peak lists with a minimum number of peaks.
- filter() method for MSPeakLists: annotatedBy filter to only keep peaks with formula/compound annotations.
- screenSuspects() method now supports the amend argument, which allows combining results of different screenSuspects() calls (see the Handbook for details).
- specclust, which generates components based on hierarchically clustering feature groups with high MS/MS similarities.
- groupFeatures: renamed the feat argument to obj.
- reportHTML(): EICs are shared between tabs to avoid duplicated plotting
- features object embedded in featureGroups objects is now synchronized, and any features not present in any group are removed accordingly. This reduces memory usage and indirectly causes reportCSV() to only report features still present.
- plotInt(): now has xnames and showLegend arguments to adjust plotting.
- [ (subset) and filter() methods for MSPeakLists now only re-average peak lists if the new reAverage argument is set to TRUE (default FALSE).
  This change was mainly done as (1) the effects are usually minor and (2) re-averaging invalidates any formula/compound annotations done prior to filtering.
- filter() method for MSPeakLists: the withMSMS filter is now applied after all other filters.
- MetFrag: the raw unprocessed annotation formulas are now additionally stored in the fragInfo tables (formula_MF column).
- MetFrag: the precursor ion m/z is now taken from peak list data instead of the feature group to improve annotation.
- MetFrag: the useSmiles parameter is now set to true as this seems to improve results sometimes.
- as.data.table() method for formulas: if average=TRUE then all column data that cannot be reasonably averaged are excluded.
- annotatedPeakList(): also add annotation columns for missing results (for consistency).
- minMaxNormalization argument to the consensus() method for compounds was removed (unused).
- filter() for formulas/compounds: if algorithm consensus results are filtered with scoreLimits, and a score term exists multiple times for a candidate, only one of the terms needs to fall within the specified limits for the candidate to be kept (was all).
- plotSpectrum() for compounds: plotStruct is now defaulted to FALSE.
- MSPeakLists data now store a unique identifier for each mass peak in the ID column. These IDs are used by e.g. formula/compound annotations, and stored in the PLID column in their fragInfo data.
This replaces the PLIndex column in fragInfo data, which was only row based, and therefore invalidated in case peak lists were filtered afterwards.GenFormAdducts() and MetFragAdducts() now additionally return adducts in generic format and use cached data for efficiency.err argument to as.character() to control if an error or NA should returned if conversion fails.as.adduct() now removes any whitespace and performs stricter format checks to make conversion more robust.MetaboliteAdductDecharger) and cliqueMS.calculateIonFormula() and calculateNeutralFormula() now Hill sort their resultas.data.table(): Suspect screening specific columns are now prefixed with susp_.suspFormRank and suspCompRank suspect annotation data columns were renamed to formRank/suspCompRank (the previous change made prefixing unnecessary).logPath argument for annotateSuspects() to specify the file path for log files are disable logging completely.fastcluster for hierarchical clustering.rt to ret for consistency.show(): show unique feature group counts.filter(): allow negative rtIncrement values.nontarget: replaced extraOpts argument with ....nontarget: store links as character string indices instead of numeric indices.RAMClustR: moved position of ionization argument to improve consistency.componentsReduced class) when filtering/subsetting components, was removed. This system was quite unintuitive and imposed unnecessary limitations. Instead, functions that cannot work after component data is changed (e.g. those specific to intensity clustering) will throw an error if needed.intclust components are now derived from a general componentsClust class, which is shared with specclust components. 
The common functionality for both algorithms is implemented for this class.show() methods now print class inheritance treeprogressr package is not used anymore, thus, it is not necessary to set up progress bars with future based multiprocessing.newProject: Moved order of componentization step (now before annotation & suspect screening).xlim/ylim was used with plotChroms then peaks were not always correctly filledretMin argument to plot() method for featureGroupsComparison wasn’t properly used/defaulted.plotSpectrum() if xlim is set and this yields no data then an empty plot is shown.plotSpectrum() automatic ylim determination was incorrect if only one peak is shown.scoreLimits filter for formulas could ignore results not obtained with MS/MS data.as.data.table(compounds, fragments=TRUE) returned empty results for candidates without fragment annotations.topX arguments for the MSPeakLists method for filter() would re-order peak lists, thereby invaliding any annotations.SIRIUS ‘adduct fragments’.generateMSPeakListsDAFMF() potentially used wrong DA compound data in case features were filtered.numericIDLevel() now properly handles NA values.importFeatureGroupsBrukerTASQ(): Improved handling of absent analyses in imported results files.[M-H]-), resulting in ~1.5 mDa deviations[2M+H]+) and multiple charges (e.g. [M+2H]2+)RAMClustR: ensure that columns are the right type if all values are NA.CAMERA: correctly handle cases when minSize filter results in zero components.plotGraph(): improve error handling with empty objects.newProject(): correctly handle DIA with Bruker MS peak lists.peakgroups alignment method was used (fixes issue #22)mapply warning was shown with newProject()
newProject(): don’t show Remove button in analyses select screen when the script option is selected, as this will not work properly.
IPO: add default limits for OpenMS traceTermOutliers
IPO optimization fix: integer parameters are properly rounded
generateFeatureOptPSet("xcms3", method="matchedFilter") would return a parameter set with step instead of binSize (issue #23)
newProject() would generate an ID levels configuration file even when no suspect list was selected.
NULL values that may occasionally be returned by rcdk::get.mcs
reportHTML() annotation table was paged.
analysisInfo validity may result in an error (reported by Tiago Sobreira)
convertMSFiles() error with dirs=TRUE (reported by Tiago Sobreira)
installPatRoon()
screenSuspects() did not take onlyHits into account for caching
screenSuspects(): The original suspect name is stored in the name_orig column
XCMS3 feature group optimization: binSize and minFraction values were rounded while they shouldn’t (issue #27)
This release focuses on a significantly changed suspect screening interface, which brings several utilities to assist suspect annotation, prioritization, mixing of suspect and full NTA workflows and other general improvements.
IMPORTANT: The suspect screening interface has changed significantly. Please read the documentation (?screenSuspects and the handbook) for more details. If you want to quickly update your code without using any new functionality:
Change your existing code, e.g.
scr <- screenSuspects(fGroups, suspectList, ...)
fGroupsScr <- groupFeaturesScreening(fGroups, scr)
to
fGroupsScr <- screenSuspects(fGroups, suspectList, ..., onlyHits = TRUE)
Major changes
onlyHits=TRUE). This allows straightforward mixing of suspect and full non-target workflows.
screenInfo() or as.data.table() methods.
suspects argument to [, e.g. fGroupsScr[, suspects = "carbamazepine"]
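The updated interface described so far can be sketched as follows. This is a minimal, hedged example rather than a reference workflow: `fGroups` (feature groups) and `suspectList` (a suspect table) are assumed to exist already, the wrapper name `screenAndInspect` is hypothetical, and only calls named in these notes are used.

```r
# Hypothetical wrapper illustrating the updated suspect screening interface.
# 'fGroups' and 'suspectList' are assumed inputs prepared elsewhere.
screenAndInspect <- function(fGroups, suspectList, suspect = "carbamazepine") {
    # Screening now returns feature groups directly; onlyHits = TRUE keeps
    # only feature groups with a suspect hit (the old two-step result).
    fGroupsScr <- patRoon::screenSuspects(fGroups, suspectList, onlyHits = TRUE)

    # Screening results can be inspected with screenInfo()
    info <- patRoon::screenInfo(fGroupsScr)

    # Subset on a suspect name with the 'suspects' argument to `[`
    fGroupsOne <- fGroupsScr[, suspects = suspect]

    list(screened = fGroupsScr, info = info, subset = fGroupsOne)
}
```

Calling `screenAndInspect(fGroups, suspectList)` requires patRoon to be installed; defining the function does not.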
annotateSuspects(), allows combining the annotation workflow data (peak lists, formulas, compounds) to perform a detailed annotation for the suspects found during the workflow. This method calculates properties such as
filter() method for suspect screening results, which allows you to easily prioritize data, for instance, by selecting minimum annotation ranks and similarities, identification levels and automatically choosing the best match in case multiple suspects are assigned to one feature (and vice versa).
as.data.table() method and reporting functionality for suspect screening results to quickly inspect their annotation data.
?screenSuspects and ?annotateSuspects for more information.
screenInfo()).
m/z), these can be included in the suspect list to improve suspect annotation.
features objects was removed. The same and much more functionality can be obtained by the workflow for feature groups.
reportCSV() function was simplified and uses as.data.table() to generate the CSV data. This should give more consistent results.
individualMoNAScore MetFrag scoring is now enabled by default.
Other changes
reportHTML() now allows toggling visibility for the columns shown in the feature annotation table.
plotVenn() method for featureGroups now allows comparing combinations of multiple replicate groups with each other. See ?plotVenn for more information.
SIRIUS binary on macOS did not work properly
GenForm resulted in an error (https://github.com/rickhelmus/patRoon/issues/18)
plotEIC(), groups() and plotSpec() methods were renamed to plotChroms(), groupTable() and plotSpectrum(). This was done to avoid name clashes with XCMS and CAMERA. The old functions still work (with a warning), but please update your scripts as these will be removed in the future.
patRoon now supports an additional method to perform parallelization for tools such as MetFrag, SIRIUS etc. The main purpose of this method is to allow you to perform such calculations on external computer clusters. Please see the updated parallelization section in the handbook for more details.
logPath and maxProcAmount arguments to functions such as generateFormulas, generateCompounds etc were removed. These should now solely be configured through package options (see ?patRoon).
patRoon.maxProcAmount package option was renamed to patRoon.MP.maxProcs.
SIRIUS
calculateFeatures=TRUE would try to calculate formulas for features even if not present (e.g. after being removed by subsetting or filtering steps).
SIRBatchSize argument to formula and compound generation functions was renamed to splitBatches and its meaning has slightly changed. Please see the reference manual (e.g. ?generateFormulas) for more details.
generateCompoundsMetfrag was renamed to generateCompoundsMetFrag.
withOpt() to temporarily change (patRoon) package options.
printPackageOpts(): display current package options of patRoon.
OpenMS: potentially large temporary files are removed when possible to avoid clogging up disk space (especially relevant on some Linux systems where /tmp is small).
XCMS are not attached by default, which significantly speeds up loading patRoon (e.g. with library()).
compoundViewer() function was marked as defunct, as it hasn’t been working for some time and its functionality is largely replaced by reportHTML().
generateComponentsNontarget(): update homolog statistics for merged series.
checkChromatograms(): fix error when fGroups has only one replicate group
convertMSFiles(): If algorithm="pwiz" and vendor centroiding is used then any extra filters are now correctly put after the peakPicking filter.
getXCMSnExp() is now properly exported and documented.
annoTypeCount score for annotated compounds with PubChemLite is now not normalized by default anymore when reporting results.
reportHTML() now correctly handles relative paths while opening the final report in a browser.
componentsNT: include algorithm data returned by nontarget::homol.search in homol slot (suggested by Vittorio Albergamo)
convertMSFiles() fixes (issue #14)
cwt option is now available for conversion with ProteoWizard
features objects
generateCompoundsMetFrag(): compound names could sometimes be interpreted as dates (reported by Corey Griffith)
SIRIUS annotation didn’t use set adduct but used default instead
SIRIUS results are better handled if chosen adduct is not [M+H]+ or [M+H]+
data.table objects properly from cache.
plotGraph() didn’t properly handle components without linked series (reported by Vittorio Albergamo)
sn column) (suggested by Ricardo Cunha)
exportedData/verbose to getXCMSSet() functions to avoid ambiguities
generateComponentsNontarget(): allow wider m/z deviation for proper linkage of different series (controlled by absMzDevLink argument).
addAllDAEICs() sometimes used wrong names for EICs
reportPDF() may report formula annotated spectra of results not present in input featureGroups
data.table data from cache now calls data.table::setalloccol() to ensure proper behavior if data.table::set() is called on cached data.
compounds with useGGPlot2=TRUE would try to plot formulas for non-annotated peaks (resulting in many extra diagonal lines)
reportPDF() were not properly placed in a grid (as specified by EICGrid argument)
reportHTML()
retMin=TRUE
newProject() didn’t show polarity selection if only a compound identification algorithm was selected.
groupFeaturesXCMS3() didn’t properly cache results.
MSPeakLists: results for averaged peak lists are now in the same order as the input feature groups
SIRIUS support
SIRIUS (configurable with new SIRBatchSize function argument). This dramatically improves overall calculation times (thanks to Markus Fleischauer for pointing out this possibility!).
generateFormulasSirius() and generateCompoundsSirius() are now properly capitalized to generateFormulasSIRIUS() and generateCompoundsSIRIUS()
SIRIUS 4.4.
SIRIUS output is directly shown on the console.
SIRIUS can be specified with the cores function arguments.
SIRIUS
groupNames(), analyses() and similar methods sometimes returned NULL instead of an empty character vector for empty objects.
plotHeatMap() with interactive=TRUE: switch from now removed d3heatmap package to heatmaply
reportHTML() didn’t split PubChem URLs when multiple identifiers were reported.
PWizBatchSize argument for convertMSFiles()
extraOptsRT/extraOptsGroup arguments for OpenMS feature grouping to allow custom command line options.
importFeatureGroupsBrukerTASQ
plot() method for featureGroups now allows drawing legends when colourBy="fGroups" and sets colourBy="none" by default, both for consistency with plotEIC().newProject() now uses XCMS3 algorithms instead of the older XCMS interface.xcms (not xcms3) could not be subset with zero analyses (which resulted in errors by e.g. unique() and reportHTML()). Reported by Corey Griffith.formulas/compounds objectsnewProject() dialogaddTrivialNames option as it never worked very well.reportHTML(): only components with reported feature groups are now reported.m/z values. Instead, suspect lists can contain SMILES, InChI or neutral mass values which are used for automatic ion m/z calculation. See ?screenSuspects for more details.consensus()
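The automatic ion m/z calculation from neutral masses mentioned above can be illustrated with a small base R sketch. This is not patRoon's internal code, which handles adducts more generally: the helper name `neutralMassToMz` is hypothetical, and only singly charged protonated and deprotonated ions are covered.

```r
# Proton mass in Da (hydrogen atom mass minus one electron mass)
protonMass <- 1.007276

# Hypothetical helper: m/z of a singly charged ion from a neutral
# monoisotopic mass, for [M+H]+ (positive) or [M-H]- (negative)
neutralMassToMz <- function(neutralMass, positive = TRUE) {
    if (positive)
        neutralMass + protonMass   # [M+H]+
    else
        neutralMass - protonMass   # [M-H]-
}

# Carbamazepine (C15H12N2O), monoisotopic mass of about 236.0950 Da
mzPos <- neutralMassToMz(236.0950)
mzNeg <- neutralMassToMz(236.0950, positive = FALSE)
```

Using the proton mass rather than the hydrogen atom mass accounts for the missing (or extra) electron of the ion.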
newProject() UI only showed partial amount of rows.
addFormulaScoring() function now uses a different algorithm to calculate formula scores for compound candidates. The score is now based on the actual formula ranking in the provided formulas object, and is fixed between zero (no match) and one (best match).
convertMSFiles correctly checks if input exists
maxProcAmount (i.e. number of parallel processes) now defaults to the number of physical cores instead of the total number of CPU threads.
batchSize to 8 for GenForm formula calculation.
plot() for featureGroups can now highlight unique/shared features across replicates (suggested by V Albergamo)
plotGraph()
concs option for generateAnalysisInfo() to set concentration data
featureGroupsComparison can be customized (useful for e.g. plotting)
%>%)
topMost argument for GenForm formula calculation.
ref to blank. Similarly, the refs argument to generateAnalysisInfo() is now called blanks.
reportMD() is renamed to reportHTML()
filter() method for formulas: minExplainedFragPeaks is now called minExplainedPeaks
screenTargets and its targets parameter have been renamed to screenSuspects() / suspects
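The renames above can be applied to existing scripts mechanically. The following is a minimal sketch in base R and not part of patRoon: the helper name `updateScriptLines` is hypothetical, and it only performs whole-word text replacement, so the result should still be reviewed by hand. Renamed arguments that are common words (e.g. `targets` to `suspects`) are deliberately left out.

```r
# Old -> new names, taken from the renames listed above
renames <- c(
    reportMD = "reportHTML",
    screenTargets = "screenSuspects",
    minExplainedFragPeaks = "minExplainedPeaks"
)

# Hypothetical helper: replace whole-word occurrences of old names
# in a character vector of script lines
updateScriptLines <- function(lines, map = renames) {
    for (old in names(map))
        lines <- gsub(paste0("\\b", old, "\\b"), map[[old]], lines, perl = TRUE)
    lines
}

updated <- updateScriptLines(c(
    "scr <- screenTargets(fGroups, targets)",
    "reportMD(fGroups)"
))
```

A script file could be processed the same way with `readLines()` and `writeLines()`.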
groups() and as.data.table() methods for featureGroups: optionally consider feature areas instead of peak intensities.plotSilhouettes() method for compoundsCluster
rGroups argument to subset operator for featureGroups to subset by replicate groups (equivalent to rGroups argument to filter()).GenForm formula calculation with MSMode="both" (the default): instead of repeating calculations with and without MS/MS data and combining the data, it now simply does either of the two depending on MS/MS data availability. The old behavior turned out to be redundant, hence, calculation is now a bit faster.GenForm now perform precursor isolation to cleanup MS1 data prior to formula calculation. During this step any mass peaks that are unlikely part of the isotopic pattern of the feature are removed, which otherwise would penalize the isotopic scoring. The result is that isotopic scoring is dramatically improved now. This filter step is part of new filter functionality for MSPeakLists, see ?MSPeakLists and ?generateFormulas for more information.?formulas).consensus() (absMinAbundance and relMinAbundance)MetFrag: for-ident database and new statistical scores are now supportedas.data.table() / as.data.frame() for featureGroups now optionally reports regression information, which may be useful for quantitative purposes. This replaces the (defunct) regression() method and limited support from screenTargets().plotGraph() method to visually inspect linked homologous series.newProject() (e.g. loading of example data).reportMD(): most time consuming plots are now cached. Hence, re-reporting should be signficiantly faster now.convertMSFiles() now (optionally) takes analysis information (anaInfo) for file input.convertMSFiles() now supports Bruker DataAnalysis as conversion algorithm (replaces now deprecated exportDAFiles() function).MSFileFormats() function to list supported input conversion formats.generateAnalysisInfo() now recognizes more file formats. This is mainly useful so its output can be used with convertMSFiles().convertMSFiles() now has the centroid argument to more easily perform centroiding.newProject():
withMSMS filter for MS peak lists.importFeatures() generic functionscore column of MetFrag results stays correct.reportPDF()/reportMD() now report only 5 top most candidate compounds by default (controlled by compoundsTopMost argument).plotSpec() now displays subscripted formulaefilter() methods for features and featureGroups. Please carefully read the updated documentation for these methods! (i.e. ?`filter,features-method` and ?`filter,featureGroups-method`).
featureGroups method was adjusted, notably to improve reliability of blank filtration. Again, please see ?`filter,featureGroups-method`.mzDefectRange argument)maxReplicateIntRSD argument)absMinFeatures and relMinFeatures arguments).preAbsMinIntensity and preRelMinIntensity arguments)newScript() has been updated and supports more filter types.repetitions argument is not needed anymore for the new algorithm and has been removed.Inf values now should be used to specify no maximum for range filters (was -1).annotatedPeakList() method for formulas and compounds. Also used by reportMD for improved annotation peak tables.maxRtMSWidth and precursorMzWindow)generateComponentsNontarget, generateComponentsRAMClustR, generateCompoundsSirius, generateFormulasGenForm, generateFormulasSirius, generateMSPeakListsDA, generateMSPeakListsMzR, importFeatureGroupsBrukerPA
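The switch from -1 to Inf for open-ended range filters, noted above, may require updating stored filter settings. A minimal base R sketch, assuming ranges are kept as two-element numeric vectors of the form c(min, max); the helper name `toNewRange` is hypothetical and not part of patRoon.

```r
# Hypothetical helper: convert an old-style range, where a maximum of -1
# meant "no maximum", to the new convention that uses Inf
toNewRange <- function(range) {
    stopifnot(is.numeric(range), length(range) == 2)
    if (range[2] == -1)
        range[2] <- Inf
    range
}

lowerOnly <- toNewRange(c(120, -1))   # only a lower limit
bounded <- toNewRange(c(0, 300))      # unchanged
```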
maxRtMSWidth argument to generateMSPeakListsDA, generateMSPeakListsMzR (now maxMSRtWindow) now specifies a retention time window (+/- the feature retention time) instead of the total retention width around a feature. Hence, current input values should be halved.
minSize and relMinReplicates (replaces ubiquitous for CAMERA) arguments. Note that their defaults may filter out (feature groups from) components. See their documentation for more info.
patRoon.path.metFragCL to patRoon.path.MetFragCL. The old name still works for backward compatibility.
?generateCompounds.
topMostFormulas argument for SIRIUS compound generation.
reportPDF()/reportMD() now report only 5 top most candidate formulae by default (controlled by formulasTopMost argument).
verifyDependencies() function to let the user verify if external tools can be found.
dirs argument to convertMSFiles() was slightly changed: if TRUE (the default) the input can either be paths to analyses files or to directories containing the analyses files.
featureGroups method for plot().
reportMD(): Don’t plot Chord if <3 (non-empty) replicate groups are available.
filter() methods now support negation by negate argument.
reportMD(): added table with annotated fragments for compounds/formulas
consensus() updates
consensus() methods now support extracting unique data. This also replaces the unique() method that was defined for featureGroupsComparison.comparison() now automatically determines object names from algorithm (consistency with consensus() method for other objects).plotVenn() and plotUpSet() methods to compare different compounds or formulas objects.filter() method for components.MSMode="msms", now needs adduct argument.adduct class. This means that generateCompounds() and generateFormulas() now expect slightly differing arguments. Please see their manual pages.clearCache() now supports removal of caches via regular expressions.topMost and extraOpts arguments for SIRIUS formula/compound generation.filter() method for compounds now support generic scoring filtering and on elements of precursor and fragment formulae.?generateCompounds for more details (notably the Scorings section).plotSpec()
pruneMissingPrecursorMS option in ?generateMSPeakLists).retainPrecursorMSMS function arguments, see ?MSPeakLists and ?generateMSPeakLists).algorithm() and as.data.table()/as.data.frame() methods. The latter replaces and enhances the makeTable() (formulas class) and groupTable() (featureGroups class) methods.R to C++: significantly reduces time required for grouping large amount of features.revertDAAnalyses() function: brings back set of Bruker analyses to their unprocessed state.doFMF behaviour for DataAnalysis feature finding.formula and formulaConsensus classes are now merged: there is no need to call consensus() anymore after generateFormulas().calculateFeatures=FALSE). This can greatly speed up calulcation, especially with many analyses.filter() and as.data.table()/as.data.frame methods bring new functionalities related to filtering, extracting data and performing several processing steps commonly performed for organic matter (OM) characterization.frag_neutral_formula column). This ensures correct comparison when a consensus is made.reportCSV() now splits formulas for each feature group in separate CSV files (similar to compounds reporting).reportPDF() now actually includes formula annotations in annotated compound spectra when formulas are specified.file argument for clearCache()
generateMSPeakListsX where X is the algo).generateCompounds() and plotting functionality now uses averaged group peak lists instead of peak list of most intense analysis.plotSpec() method for MSPeakLists: plot (non-annotated) MS and MS/MS spectra.maxRtMSWidth argument used for peak list generation.maxRtMSWidth argument for mzR peak list generation had no effect.addAllDAEICs() function.mzWidth argument of addDAEIC() to mzWindow.convertMSFiles: changed interface with more options, parallelization and ProteoWizard support.getXcmsSet() is renamed to getXCMSSet()
findFeatures() / groupFeatures()
nintersects default for plotUpSet so that all intersections are plotted by default.
features class objects now store number of isotopes found for each feature.
fGroups <- fGroups[, groupNames(compounds)]
kableExtra package) that may cause memory leakage when reportMD() is called repeatedly.