Properties for the sample analyses used in the workflow and utilities to automatically generate this information.
generateAnalysisInfo(
  paths,
  groups = "",
  blanks = "",
  concs = NULL,
  norm_concs = NULL,
  formats = MSFileFormats()
)
generateAnalysisInfoFromEnviMass(path)
paths
A character vector containing one or more file paths that should be used for finding the analyses.
groups, blanks
An (optional) character vector containing replicate groups and blanks, respectively (will be
recycled). If groups is an empty character string (""), the analysis name will be set as the
replicate group.
concs
An optional numeric vector containing concentration values for each analysis. Can be NA if
unknown. If the length of concs is less than the number of analyses, the remainder will be set
to NA. Set to NULL to not include concentration data.
norm_concs
An optional numeric vector containing concentrations used for feature normalization (see the
Feature intensity normalization section in the featureGroups documentation). NA values are
allowed for analyses that should not be normalized (e.g. because no internal standard is
present). If the length of norm_concs is less than the number of analyses, the remainder will be
set to NA. Set to NULL to not include normalization concentration data.
formats
A character vector of analysis file types to consider. Analyses not present in these formats will be
ignored. For valid values see MSFileFormats.
path
The path of the enviMass project.
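The recycling and NA padding described for groups, blanks and concs can be sketched in base R. This is only an illustration of the documented behaviour, not patRoon's implementation: fill_arguments is a hypothetical helper and the analysis names are made up.

```r
# Hypothetical helper (not part of patRoon): mimics the documented handling of
# 'groups'/'blanks' (recycled) and 'concs' (padded with NA).
fill_arguments <- function(analyses, groups = "", blanks = "", concs = NULL) {
    n <- length(analyses)
    # groups = "" means: use the analysis name as replicate group
    groups <- if (identical(groups, "")) analyses else rep_len(groups, n)
    out <- data.frame(analysis = analyses, group = groups,
                      blank = rep_len(blanks, n), stringsAsFactors = FALSE)
    if (!is.null(concs)) {
        # Pad with NA when fewer concentrations than analyses are given
        out$conc <- c(concs, rep(NA_real_, max(0, n - length(concs))))[seq_len(n)]
    }
    out
}

# Made-up analysis names; two concentrations for three analyses
info <- fill_arguments(c("blank-1", "std-1", "std-2"),
                       blanks = "blank-1", concs = c(0, 1))
print(info)
```

Passing concs = NULL (the default) leaves the conc column out entirely, which corresponds to not including concentration data.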
In patRoon a sample analysis, or simply analysis, refers to a single MS analysis file (sometimes
also called sample or file). The analysis information summarizes several properties for the
analyses, and is used in various steps throughout the workflow, such as findFeatures, averaging
intensities of feature groups and blank subtraction. This information should be in a data.frame,
with the following columns:
path
the full path to the directory of the analysis.
analysis
the file name without extension. Must be unique, even if the path is different.
group
name of replicate group. A replicate group is used to group analyses together that are
replicates of each other. Thus, all analyses belonging to the same replicate group should have
the same value in the group column, and this value should differ from that of any other
replicate group. Used for e.g. averaging and filter.
blank
all analyses within this replicate group are used by the featureGroups method of filter for
blank subtraction. Multiple replicate groups can be specified by separating them with commas.
conc
a numeric value specifying the 'concentration' for the analysis. This can actually be any kind of
numeric value, such as exposure time, dilution factor or anything else that may be used to form a
linear relationship.
norm_conc
a numeric value specifying the normalization concentration for the analysis. See the
Feature intensity normalization section in the featureGroups documentation for more details.
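As an illustration of the column layout above, the analysis information can also be assembled by hand as a regular data.frame. The paths, analysis names and values below are hypothetical and only serve to show the expected shape.

```r
# Hypothetical paths, names and values, for illustration only
anaInfo <- data.frame(
    path = c("/data/run1", "/data/run1", "/data/run1", "/data/run1"),
    analysis = c("blank-1", "blank-2", "sample-1", "sample-2"),
    group = c("blank", "blank", "sample", "sample"),
    blank = c("", "", "blank", "blank"),   # "" = no blank subtraction
    conc = c(NA, NA, 1, 2),                # NA = unknown
    norm_conc = c(NA, NA, 1, 1),           # NA = do not normalize
    stringsAsFactors = FALSE
)

# Sanity checks that follow from the column descriptions
stopifnot(!anyDuplicated(anaInfo$analysis))  # analysis names must be unique
stopifnot(all(nzchar(anaInfo$group)))        # group must be non-empty
print(anaInfo)
```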
Most workflow steps work with the mzXML and mzML file formats. However, some algorithms only
support one format (e.g. findFeaturesOpenMS, findFeaturesEnviPick) or a proprietary format
(findFeaturesBruker). To mix such algorithms in the same workflow, the analyses should be present
in all required formats within the same directory as specified by the path column.
Each analysis should only be specified once in the analysis information, even if multiple file
formats are available. The path and analysis columns are internally used by patRoon to
automatically find the path of analysis files with the required format.
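The idea of one entry per analysis, with the file for a given format located from the path and analysis columns, can be sketched in base R. This is a simplified illustration under assumptions, not patRoon's internal lookup: resolve_file is a hypothetical helper, and the empty placeholder files exist only so that the example runs.

```r
# Create empty placeholder files in a temporary directory (illustration only)
demo_dir <- file.path(tempdir(), "anainfo-formats")
dir.create(demo_dir, showWarnings = FALSE)
files <- file.path(demo_dir, c("sample-1.mzML", "sample-1.mzXML", "sample-2.mzML"))
invisible(file.create(files))

# List all files per format, then collapse to one entry per analysis
found <- list.files(demo_dir, pattern = "\\.(mzML|mzXML)$")
per_format <- data.frame(
    analysis = tools::file_path_sans_ext(found),
    format = tools::file_ext(found),
    stringsAsFactors = FALSE
)
analyses <- unique(per_format$analysis)

# Hypothetical helper: build the file path for a required format from the
# 'path' and 'analysis' values, returning NA if that format is not available
resolve_file <- function(path, analysis, ext) {
    f <- file.path(path, paste0(analysis, ".", ext))
    if (file.exists(f)) f else NA_character_
}

print(analyses)
print(resolve_file(demo_dir, "sample-1", "mzXML"))
print(resolve_file(demo_dir, "sample-2", "mzXML"))
```

Here sample-1 is listed once although two files exist for it, while the lookup for sample-2 in mzXML format yields NA because only an mzML file is present.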
The group column is mandatory and needs to be non-empty for each analysis. The blank column
should also be present; however, it may be empty ("") for analyses where no blank subtraction
should occur.
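A minimal sketch of how comma-separated entries in the blank column could be split and checked against the group column, using base R only. The analysis and group names are hypothetical, and the consistency check is an illustration rather than something patRoon is stated to perform.

```r
# Hypothetical analysis information with two separate blank replicate groups
anaInfo <- data.frame(
    analysis = c("blank-A", "blank-B", "sample-1", "sample-2"),
    group = c("blankA", "blankB", "sample", "sample"),
    blank = c("", "", "blankA,blankB", "blankA"),
    stringsAsFactors = FALSE
)

# Split comma-separated entries; "" yields an empty vector (no blank subtraction)
blank_groups <- strsplit(anaInfo$blank, ",", fixed = TRUE)
names(blank_groups) <- anaInfo$analysis

# Blank entries that do not correspond to any replicate group
unknown <- setdiff(unlist(blank_groups), anaInfo$group)

print(blank_groups)
print(unknown)
```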
The conc column is only required when regression information is to be obtained with the
as.data.table method. Similarly, the norm_conc column is only necessary for the normInts method.
generateAnalysisInfo is a utility function that automatically generates a data.frame with
analysis information. It scans the directories specified by the paths argument for analyses, and
uses this to automatically fill in the analysis and path columns. Furthermore, this function also
correctly handles analyses that are available in multiple formats.
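A hedged usage sketch of generateAnalysisInfo, based on the signature shown at the top of this page. The directory and file names are hypothetical, and the example assumes that the function only inspects file names, so empty placeholder files are sufficient. The call is skipped when patRoon is not installed.

```r
# Empty placeholder files with hypothetical names (illustration only)
demo_dir <- file.path(tempdir(), "anainfo-generate")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
    demo_dir, c("blank-1.mzML", "sample-1.mzML", "sample-2.mzML")
)))

if (requireNamespace("patRoon", quietly = TRUE)) {
    # groups is left at its default (""), so each analysis name becomes its own
    # replicate group; blanks is recycled for all analyses
    anaInfo <- patRoon::generateAnalysisInfo(paths = demo_dir, blanks = "blank-1")
    print(anaInfo)
} else {
    message("patRoon is not installed; skipping the generateAnalysisInfo() call")
}
```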
generateAnalysisInfoFromEnviMass loads analysis information from an enviMass project. Note: this
functionality has only been tested with older versions of enviMass.