Conversion of MS analysis files between several open and closed data formats.
getMSConversionTypes(algorithm, direction)
getMSConversionFormats(algorithm, direction, type = NULL)
convertMSFilesPWiz(
inFiles,
outFiles,
formatTo = "mzML",
centroid = TRUE,
IMS = FALSE,
minIntensity = 0,
filters = NULL,
extraOpts = NULL,
PWizBatchSize = 1
)
convertMSFilesOpenMS(inFiles, outFiles, formatTo = "mzML", extraOpts = NULL)
convertMSFilesBruker(inFiles, outFiles, formatTo = "mzML", centroid = TRUE)
convertMSFilesIMSCollapse(
inFiles,
outFiles,
typeFrom,
formatTo = "mzML",
mzRange = NULL,
mobilityRange = NULL,
smoothWindow = 0,
halfWindow = 2,
maxGap = 0.005,
clusterMethod = "distance_mean",
mzWindow = defaultLim("mz", "medium"),
minIntensityIMS = 0,
includeMSMS = FALSE,
...
)
convertMSFilesTIMSCONVERT(
inFiles,
outFiles,
formatTo = "mzML",
centroid = TRUE,
centroidRaw = FALSE,
IMS = FALSE,
extraOpts = NULL,
virtualenv = "patRoon-TIMSCONVERT"
)
convertMSFilesPaths(
files,
formatFrom,
formatTo = "mzML",
outPath = NULL,
dirs = TRUE,
overwrite = FALSE,
algorithm = "pwiz",
...
)
convertMSFiles(
anaInfo,
typeFrom = "raw",
typeTo = "centroid",
formatFrom,
formatTo = "mzML",
overwrite = FALSE,
algorithm = "pwiz",
centroidVendor = TRUE,
...
)
Either "pwiz" (ProteoWizard), "openms", "bruker" (Bruker DataAnalysis),
"imscollapse" or "timsconvert".
A character specifying the direction of conversion. Either "input" or "output".
The type of the input or output files. See getMSConversionTypes for the supported
types.
A character vector with input and output files, respectively. Lengths and order should
be the same.
Set to TRUE to perform centroiding.
For convertMSFilesPWiz: the value may be "vendor" to perform centroiding with the vendor algorithm or
"cwt" to use ProteoWizard's wavelet algorithm.
How to handle IMS data.
For convertMSFilesPWiz: if TRUE then IMS data is exported and spectra for each IMS frame are combined
into a single spectrum (using the --combineIonMobilitySpectra option), which is the format supported by
patRoon. Set to NA to collapse the IMS data by scan summing, which mimics 'regular' HRMS data. Set to
FALSE for non-IMS data. NOTE: do not set IMS=FALSE if the files contain IMS data. This will
result in very large files where MS spectra are not combined by frame, which cannot be properly read by
patRoon.
For convertMSFilesTIMSCONVERT: set to TRUE to keep IMS data or FALSE to exclude IMS data to
mimic 'regular' LC-MS data.
The minimum intensity of the mass peaks to be kept. Applying an intensity threshold is especially beneficial to reduce export file size when there are many zero or very low intensity mass peaks. NOTE: this currently does not work well with IMS data.
A character vector specifying one or more filters to msconvert. The elements of the
specified vector are directly passed to the --filter option (see the msconvert
documentation).
A character vector specifying any extra command line parameters passed to msconvert
or FileConverter. Set to NULL to ignore. For the available options, see the
FileConverter and msconvert documentation.
The number of analyses to process by a single call to msconvert. Usually a value of
one is most efficient. Set to zero to process all analyses at once in a single call.
A vector of length two specifying the m/z and mobility range to be exported, respectively.
Set to NULL to export the full range.
Centroiding parameters: see getDefAvgPListParams for details.
NOTE: As described there, maxGap may need to be increased for Agilent instruments (e.g.
0.01).
The clustering method and window (see clustering parameters) used to find and combine MS/MS spectra of precursors with close m/z.
The minimum intensity for MS peaks in raw data.
Set to TRUE to include MS/MS spectra in the output. For IMS workflows where IMS data is
only collapsed to produce compatible data files for feature detection, MS/MS data are not needed and can be
excluded to reduce computational times and file sizes. Setting includeMSMS=TRUE is primarily intended to
perform 'classical LC-MS workflows' with IMS data.
For convertMSFilesIMSCollapse: further arguments passed to
mzR::writeMSData.
For convertMSFilesPaths and convertMSFiles: further arguments passed to algorithm specific conversion
functions.
Only applicable if IMS=FALSE. Sets the mode parameter of TIMSCONVERT:
raw if centroidRaw=TRUE or centroid if centroidRaw=FALSE. See
https://gtluu.github.io/timsconvert/local.html#notes-on-mode-parameter for more details.
The virtual Python environment in which TIMSCONVERT is installed. This is passed to
reticulate::use_virtualenv, which will ensure that the
TIMSCONVERT command line utility can be found by patRoon. Set to NULL to skip this step.
The files argument should be a character vector with input files. If files
contains directories and dirs=TRUE then files from these directories are also considered.
The input or output format. See getMSConversionFormats for the supported formats.
A character vector specifying directories that should be used for the output. Will be recycled if
necessary. If NULL, output directories will be kept the same as the input directories.
Should existing destination files be overwritten (TRUE) or not (FALSE)?
An analysis info table that is used to retrieve the input files. The
paths set by path_centroid, path_profile and path_ims are used to determine the output
directories. This function automatically determines if and how centroiding and IMS conversions should be applied.
Only for algorithm="pwiz": whether centroiding should be performed with vendor
algorithms.
getMSConversionTypes returns a character with all supported input or output conversion types
for an algorithm.
getMSConversionFormats returns a character with all supported input or output conversion
formats for an algorithm, optionally filtered by the given type.
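The two query functions can be used to inspect what each algorithm supports before starting a conversion. The following is a minimal sketch, assuming an installed patRoon version that exports these functions; the "centroid" type is taken from the default typeTo of convertMSFiles, and the calls are skipped entirely when patRoon is not available.

```r
# Algorithm names as listed for the 'algorithm' argument
algorithms <- c("pwiz", "openms", "bruker", "imscollapse", "timsconvert")

# Guarded so the snippet is a no-op when patRoon is not installed
if (requireNamespace("patRoon", quietly = TRUE)) {
    for (algo in algorithms) {
        # Supported input conversion types for this algorithm
        types <- patRoon::getMSConversionTypes(algo, "input")
        cat(algo, "input types:", paste(types, collapse = ", "), "\n")
    }

    # Output formats of ProteoWizard, restricted to centroided data
    print(patRoon::getMSConversionFormats("pwiz", "output", type = "centroid"))
}
```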
convertMSFilesPWiz converts and pre-treats HRMS data with the msconvert tool from
ProteoWizard.
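A minimal sketch of a convertMSFilesPWiz call is given below. The paths under raw/ and mzml/ are hypothetical placeholders, the intensity threshold is an arbitrary illustration, and the conversion itself only runs if patRoon is installed and the input files exist (ProteoWizard must also be installed for it to succeed).

```r
# Hypothetical input files; replace with real vendor files
inFiles <- c("raw/sample-A.d", "raw/sample-B.d")

# One output path per input, in the same order, with an mzML extension
outFiles <- file.path("mzml", paste0(tools::file_path_sans_ext(basename(inFiles)), ".mzML"))
stopifnot(length(inFiles) == length(outFiles))

if (requireNamespace("patRoon", quietly = TRUE) && all(file.exists(inFiles))) {
    dir.create("mzml", showWarnings = FALSE)
    patRoon::convertMSFilesPWiz(
        inFiles, outFiles,
        formatTo = "mzML",
        centroid = "vendor", # centroid with the vendor algorithm
        IMS = FALSE,         # only appropriate for non-IMS data
        minIntensity = 10    # arbitrary example threshold
    )
}
```

Building outFiles from inFiles in one vectorized step keeps both vectors the same length and order, as the arguments require.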
convertMSFilesOpenMS converts HRMS data with the FileConverter tool of
OpenMS.
convertMSFilesBruker converts and pre-treats Bruker HRMS data with Bruker DataAnalysis. Note that
TIMS data is currently not supported.
convertMSFilesIMSCollapse is used to convert IMS data to data that mimics 'regular' HRMS data by
collapsing the IMS dimension. The raw data interface of patRoon first sums up all spectra within each IMS
frame, performs centroiding and finally exports the resulting data with the
mzR::writeMSData function. Several thresholds can be set to speed up the conversion
process and reduce noise, but care should be taken that no mass peaks of interest are lost.
convertMSFilesTIMSCONVERT converts and pre-treats TIMS data with
TIMSCONVERT. The installTIMSCONVERT function can be used
to automatically install TIMSCONVERT.
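As a hedged illustration, the sketch below converts TIMS analyses to centroided data without the IMS dimension, so that the output mimics 'regular' LC-MS data. The tims/ paths are hypothetical, and the call assumes that TIMSCONVERT is already installed in the default "patRoon-TIMSCONVERT" virtual environment; it is skipped if patRoon or the input files are missing.

```r
# Hypothetical Bruker TIMS analyses; replace with real .d directories
inFiles <- c("tims/run1.d", "tims/run2.d")

# Write the output next to the input, swapping the .d extension for .mzML
outFiles <- sub("\\.d$", ".mzML", inFiles)

if (requireNamespace("patRoon", quietly = TRUE) && all(file.exists(inFiles))) {
    patRoon::convertMSFilesTIMSCONVERT(
        inFiles, outFiles,
        formatTo = "mzML",
        centroid = TRUE,
        centroidRaw = FALSE, # only relevant because IMS = FALSE
        IMS = FALSE          # exclude IMS data
    )
}
```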
convertMSFilesPaths is a wrapper function that simplifies the use of algorithm specific MS conversion
functions, such as convertMSFilesPWiz and convertMSFilesTIMSCONVERT.
convertMSFiles is a wrapper function that simplifies the use of convertMSFilesPaths.
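The sketch below shows how convertMSFilesPaths might be used to convert every supported file in a directory. The raw and mzml directories are hypothetical, and the format name "thermo" is an assumption that is checked against getMSConversionFormats before converting. The extra centroid argument is forwarded through ... to convertMSFilesPWiz.

```r
files <- "raw"     # hypothetical directory with vendor files
outPath <- "mzml"  # hypothetical output directory
wanted <- "thermo" # assumed input format name, verified below

if (requireNamespace("patRoon", quietly = TRUE) && dir.exists(files)) {
    supported <- patRoon::getMSConversionFormats("pwiz", "input")
    if (wanted %in% supported) {
        dir.create(outPath, showWarnings = FALSE)
        patRoon::convertMSFilesPaths(
            files,
            formatFrom = wanted,
            formatTo = "mzML",
            outPath = outPath,
            dirs = TRUE,       # also look inside the given directories
            overwrite = FALSE, # keep existing output files
            algorithm = "pwiz",
            centroid = TRUE    # passed on to convertMSFilesPWiz
        )
    } else {
        message("Format '", wanted, "' is not supported; choose one of: ",
                paste(supported, collapse = ", "))
    }
}
```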
convertMSFilesPWiz, convertMSFilesOpenMS and convertMSFilesTIMSCONVERT use multiprocessing to parallelize
computations. Please see the parallelization section in the handbook for
more details and the patRoon options documentation for the available
configuration options.
The raw data interface of patRoon is used by convertMSFilesIMSCollapse to
process HRMS (or IMS-HRMS) data. Please see its documentation for more information on the supported
formats and available configuration options.
Rost HL, Sachsenberg T, Aiche S, Bielow C, Weisser H, Aicheler F, Andreotti S, Ehrlich H, Gutenbrunner P, Kenar E, Liang X, Nahnsen S, Nilse L, Pfeuffer J, Rosenberger G, Rurik M, Schmitt U, Veit J, Walzer M, Wojnar D, Wolski WE, Schilling O, Choudhary JS, Malmstrom L, Aebersold R, Reinert K, Kohlbacher O (2016).
“OpenMS: a flexible open-source software platform for mass spectrometry data analysis.”
Nature Methods, 13(9), 741–748.
doi:10.1038/nmeth.3959.
Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S, Gatto L, Fischer B, Pratt B, Egertson J, Hoff K, Kessner D, Tasman N, Shulman N, Frewen B, Baker TA, Brusniak M, Paulse C, Creasy D, Flashner L, Kani K, Moulding C, Seymour SL, Nuwaysir LM, Lefebvre B, Kuhlmann F, Roark J, Rainer P, Detlev S, Hemenway T, Huhmer A, Langridge J, Connolly B, Chadick T, Holly K, Eckels J, Deutsch EW, Moritz RL, Katz JE, Agus DB, MacCoss M, Tabb DL, Mallick P (2012).
“A cross-platform toolkit for mass spectrometry and proteomics.”
Nature Biotechnology, 30(10), 918–920.
doi:10.1038/nbt.2377.
Luu GT, Freitas MA, Lizama-Chamu I, McCaughey CS, Sanchez LM, Wang M (2022).
“TIMSCONVERT: a workflow to convert trapped ion mobility data to open data formats.”
Bioinformatics, 38(16), 4046–4047.
ISSN 1367-4811, doi:10.1093/bioinformatics/btac419, http://dx.doi.org/10.1093/bioinformatics/btac419.
Keller A, Eng J, Zhang N, Li X, Aebersold R (2005).
“A uniform proteomics MS/MS analysis platform utilizing open XML file formats.”
Mol Syst Biol.
Kessner D, Chambers M, Burke R, Agus D, Mallick P (2008).
“ProteoWizard: open source software for rapid proteomics tools development.”
Bioinformatics, 24(21), 2534–2536.
doi:10.1093/bioinformatics/btn323.
Martens L, Chambers M, Sturm M, Kessner D, Levander F, Shofstahl J, Tang WH, Rompp A, Neumann S, Pizarro AD, Montecchi-Palazzi L, Tasman N, Coleman M, Reisinger F, Souda P, Hermjakob H, Binz P, Deutsch EW (2010).
“mzML - a Community Standard for Mass Spectrometry Data.”
Mol Cell Proteomics.
doi:10.1074/mcp.R110.000133.
Pedrioli PGA, Eng JK, Hubley R, Vogelzang M, Deutsch EW, Raught B, Pratt B, Nilsson E, Angeletti RH, Apweiler R, Cheung K, Costello CE, Hermjakob H, Huang S, Julian RK, Kapp E, McComb ME, Oliver SG, Omenn G, Paton NW, Simpson R, Smith R, Taylor CF, Zhu W, Aebersold R (2004).
“A common open representation of mass spectrometry data and its application to proteomics research.”
Nat Biotechnol, 22(11), 1459–1466.
doi:10.1038/nbt1031.