8.2 Feature intensity normalization
Feature intensities are often compared between sample analyses, for instance, to evaluate trends between sample points. However, matrix effects, varying detector sensitivity and differences in the analysed sample amount may complicate such comparisons. For this reason, it may be desirable to normalize the feature intensities.
The normInts() function is used to normalize feature intensities (peak heights and areas). Two different types are supported:
- Feature normalization: normalization occurs by intensities within the same sample analysis
- Group normalization: normalization occurs by intensities among features within the same group
Both normalization types can be combined.
8.2.1 Feature normalization
Feature normalization itself supports the following normalization methods:
Method | Usage | Description |
---|---|---|
TIC | normInts(featNorm = "tic", ...) | Normalizes by the combined intensity of all features, also known as the Total Ion Current (TIC). |
Internal Standard | normInts(featNorm = "istd", ...) | Uses internal standards (IS) to normalize feature intensities. |
Concentration | normInts(featNorm = "conc", ...) | Normalizes feature intensities of a sample analysis by its normalization concentration (explained below). |
None | normInts(featNorm = "none", ...) | Performs no feature normalization. Set this if you only want to perform group normalization (discussed in the next section). |
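To make the TIC method more concrete, the following base R sketch divides every intensity by the summed intensity of its analysis. The matrix and its values are hypothetical, and the sketch only illustrates the idea; it is not taken from the patRoon implementation, which may additionally account for the normalization concentration.

```r
# Hypothetical peak intensities: rows are features, columns are sample analyses.
ints <- matrix(c(100, 300, 600,
                 200, 600, 1200),
               nrow = 3,
               dimnames = list(c("feat1", "feat2", "feat3"),
                               c("analysis-1", "analysis-2")))

# TIC per analysis: the combined intensity of all its features.
tic <- colSums(ints)

# Divide every intensity by the TIC of the analysis it belongs to.
intsTIC <- sweep(ints, 2, tic, "/")
intsTIC
```

In this example the second analysis has twice the signal of the first for every feature, so both columns become identical after normalization.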
8.2.1.1 Normalization concentration
All methods (except "none") are influenced by the normalization concentration, which is a property set for each sample analysis. For IS normalization, this should equal the concentration of the IS present in the sample. For the other methods, the normalization concentration represents the injected sample amount. The normalization concentration is defined in the norm_conc column of the analysis information. For example:
# obtain analysis information as usual, but add normalization concentrations.
# The blanks are set to NA, and will therefore not be normalized.
generateAnalysisInfo(paths = patRoonData::exampleDataPath(),
                     groups = c(rep("solvent", 3), rep("standard", 3)),
                     blanks = "solvent",
                     norm_concs = c(NA, NA, NA, 2, 2, 1))
#> path analysis group blank norm_conc
#> 1 /usr/local/lib/R/site-library/patRoonData/extdata/pos solvent-pos-1 solvent solvent NA
#> 2 /usr/local/lib/R/site-library/patRoonData/extdata/pos solvent-pos-2 solvent solvent NA
#> 3 /usr/local/lib/R/site-library/patRoonData/extdata/pos solvent-pos-3 solvent solvent NA
#> 4 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-1 standard solvent 2
#> 5 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-2 standard solvent 2
#> 6 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-3 standard solvent 1
The normalization concentration does not need to be an absolute value. In the end, what matters are the relative numbers between the sample analyses. For example, if the concentrations for two analyses are c(1, 2) or c(1.5, 3.0), the normalization result is the same. Setting the concentration to NA (or 0) will skip normalization for an analysis. If the normalization concentration is absent from the analysis information, it defaults to 1.
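The base R sketch below illustrates why only the ratios between normalization concentrations matter. The helper normByConc() is hypothetical and not part of patRoon; it assumes that concentrations are first expressed relative to the largest one and that analyses with an NA or 0 concentration are left untouched. The actual scaling used by normInts() may differ.

```r
# Hypothetical helper (not a patRoon function): scale intensities by
# concentrations expressed relative to the largest concentration.
normByConc <- function(ints, concs) {
    relConcs <- concs / max(concs, na.rm = TRUE)
    skip <- is.na(relConcs) | relConcs == 0 # NA or 0: do not normalize
    out <- ints
    out[!skip] <- ints[!skip] / relConcs[!skip]
    out
}

ints <- c(500, 1000) # hypothetical intensities of one feature in two analyses

a <- normByConc(ints, c(1, 2))
b <- normByConc(ints, c(1.5, 3.0))
all.equal(a, b) # same 1:2 ratio, so the same result

# The third analysis has no concentration and keeps its original intensity.
withNA <- normByConc(c(500, 1000, 800), c(1, 2, NA))
withNA
```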
8.2.1.2 Internal standard normalization
For IS normalization, an internal standard list that describes the properties of the internal standards should be specified. The format of this list is the same as that of a suspect list. Example lists can be found in the patRoonData package:
#> name formula rt
#> 1 1H-benzotriazole-D4 C6[2]H4HN3 268.1
#> 2 Atenolol-D7 C14[2]H7H15N2O3 213.5
#> 3 Atrazine-D5 C8[2]H5H9ClN5 336.5
#> 4 Bezafibrate-D6 C19[2]H6H14ClNO4 351.7
#> 5 Climbazole-D4 C15[2]H4H13ClN2O2 359.1
As can be seen from above, labelled isotopes can be specified with square brackets, e.g. [2]H for deuterium.
The next step is to perform the normalization with normInts():
fGroupsNorm <- normInts(fGroups, featNorm = "istd", standards = patRoonData::ISTDListPos, adduct = "[M+H]+",
                        ISTDRTWindow = 20, ISTDMZWindow = 200, minISTDs = 2)
This will do the following:
- Perform a suspect screening to find the specified IS (standards argument).
- Remove the IS candidates which are absent in one or more of the analyses to be normalized.
- Select IS candidates for each feature group, based on close retention time (ISTDRTWindow argument), m/z (ISTDMZWindow argument) and a minimum number (minISTDs). If the number of IS candidates within the specified retention time and m/z windows is below minISTDs, the close(st) candidate(s) outside these windows are additionally chosen.
- Normalize the features with the combined IS intensities.
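The selection step can be sketched in base R as follows. The selectISTDs() function and the candidate table are hypothetical and only mimic the logic described above; in particular, 'closest' is interpreted here as the smallest retention time difference, which is an assumption and may not match how patRoon ranks candidates.

```r
# Hypothetical IS candidates that were found by the suspect screening.
istds <- data.frame(name = c("IS-A", "IS-B", "IS-C", "IS-D"),
                    rt = c(200, 330, 345, 420),
                    mz = c(150, 221, 284, 600),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not a patRoon function) that selects IS candidates
# for a single feature group.
selectISTDs <- function(featRT, featMZ, istds, rtWindow, mzWindow, minISTDs) {
    rtDiff <- abs(istds$rt - featRT)
    inWindow <- rtDiff <= rtWindow & abs(istds$mz - featMZ) <= mzWindow
    selected <- istds$name[inWindow]
    if (length(selected) < minISTDs) {
        # Too few candidates: add the nearest ones outside the windows.
        # Assumption: 'nearest' means smallest retention time difference.
        outside <- istds[!inWindow, ]
        outside <- outside[order(abs(outside$rt - featRT)), ]
        selected <- c(selected, head(outside$name, minISTDs - length(selected)))
    }
    selected
}

# Two candidates fall inside both windows.
sel1 <- selectISTDs(featRT = 335, featMZ = 250, istds,
                    rtWindow = 20, mzWindow = 200, minISTDs = 2)
# Only one candidate falls inside the windows, so the nearest outsider is added.
sel2 <- selectISTDs(featRT = 410, featMZ = 590, istds,
                    rtWindow = 20, mzWindow = 200, minISTDs = 2)
sel1
sel2
```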
To evaluate the assignments for a particular feature group, the internalStandardAssignments() and plotGraph() functions can be used:
fg <- names(fGroupsNorm)[2]
internalStandardAssignments(fGroupsNorm)[[fg]] # IS assignments for 2nd feature group
#> [1] "M221_R336_292" "M284_R323_569" "M213_R340_263"
8.2.2 Group normalization
Normalizing feature intensities among group members is easily performed by setting groupNorm=TRUE:
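As a rough illustration of what group normalization does, the base R sketch below scales the intensities of each feature group by the highest intensity within that group. Both the data and the choice of the maximum as the scaling value are assumptions made for this example; they are not taken from the patRoon implementation.

```r
# Hypothetical intensities: rows are sample analyses, columns are feature groups.
ints <- matrix(c(50, 100, 25,
                 400, 200, 800),
               nrow = 3,
               dimnames = list(c("analysis-1", "analysis-2", "analysis-3"),
                               c("groupA", "groupB")))

# Assumption: every feature is divided by the highest intensity in its group.
groupMax <- apply(ints, 2, max)
intsGroup <- sweep(ints, 2, groupMax, "/")
intsGroup
```

With this scaling, the values of each feature group end up between zero and one, which makes trends of groups with very different absolute intensities easier to compare.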
8.2.3 Using normalized intensities
The normalized intensity (peak height/area) values can easily be obtained with as.data.table():
#> group ret mz standard-pos-1 standard-pos-2 standard-pos-3 ISTD_assigned
#> <char> <num> <num> <num> <num> <num> <char>
#> 1: M109_R192_20 191.8717 109.0759 2.328459 2.1068991 0.9688233 M280_R212_561,M274_R214_532
#> 2: M111_R330_23 330.4078 111.0439 0.476554 0.4156571 0.2109971 M221_R336_292,M284_R323_569,M213_R340_263
#> 3: M114_R269_25 268.6906 114.0912 1.006808 1.1271519 0.5722654 M300_R262_608,M275_R294_537
#> 4: M116_R317_29 316.7334 116.0527 3.804086 3.8240928 2.1151499 M284_R323_569,M198_R310_215,M285_R301_570,M221_R336_292
#> 5: M120_R268_30 268.4078 120.0554 3.376374 3.0432604 1.4157580 M300_R262_608,M275_R294_537
# can be combined with other parameters
as.data.table(fGroupsNorm, normalized = TRUE, average = TRUE, areas = TRUE)[1:5]
#> group ret mz standard ISTD_assigned
#> <char> <num> <num> <num> <char>
#> 1: M109_R192_20 191.8717 109.0759 3.2597655 M280_R212_561,M274_R214_532
#> 2: M111_R330_23 330.4078 111.0439 0.2753524 M221_R336_292,M284_R323_569,M213_R340_263
#> 3: M114_R269_25 268.6906 114.0912 0.8325869 M300_R262_608,M275_R294_537
#> 4: M116_R317_29 316.7334 116.0527 2.6500817 M284_R323_569,M198_R310_215,M285_R301_570,M221_R336_292
#> 5: M120_R268_30 268.4078 120.0554 1.8965138 M300_R262_608,M275_R294_537
# feature values (no need to set normalized=TRUE)
as.data.table(fGroupsNorm, features = TRUE)[1:5, .(group, analysis, intensity_rel, area_rel)]
#> group analysis intensity_rel area_rel
#> <char> <char> <num> <num>
#> 1: M109_R192_20 standard-pos-1 2.3284588 3.9777827
#> 2: M109_R192_20 standard-pos-2 2.1068991 4.0198440
#> 3: M109_R192_20 standard-pos-3 0.9688233 1.7816697
#> 4: M111_R330_23 standard-pos-1 0.4765540 0.3352008
#> 5: M111_R330_23 standard-pos-2 0.4156571 0.3251259
Several other patRoon functions also accept the normalized argument to use normalized data, such as plotInt(), plotVolcano() and generateComponents() with intensity clustering (all discussed elsewhere in this handbook).
8.2.4 Default normalization
If normalized data is requested (normalized=TRUE, see previous section) and normInts() was not called on the feature group data, a default normalization will occur. This is nothing more than a group normalization (normInts(groupNorm=TRUE, ...)), and was mainly implemented to ensure backwards compatibility with previous patRoon versions.