8.2 Feature intensity normalization

Feature intensities are often compared between sample analyses, for instance, to evaluate trends between sample points. However, matrix effects, varying detector sensitivity and differences in the analysed sample amount may complicate such a comparison. For this reason, it is often desirable to normalize the feature intensities.

The normInts() function is used to normalize feature intensities (peak heights and areas). Two different types are supported:

  1. Feature normalization: intensities are normalized within the same sample analysis
  2. Group normalization: intensities are normalized among the features within the same feature group

Both normalization types can be combined.

8.2.1 Feature normalization

Feature normalization itself supports the following normalization methods:

| Method            | Usage                              | Description |
|-------------------|------------------------------------|-------------|
| TIC               | normInts(featNorm = "tic", ...)    | Normalizes by the combined intensity of all features, also known as the Total Ion Current (TIC). |
| Internal Standard | normInts(featNorm = "istd", ...)   | Uses internal standards (IS) to normalize feature intensities. |
| Concentration     | normInts(featNorm = "conc", ...)   | Normalizes feature intensities of a sample analysis by its normalization concentration (explained below). |
| None              | normInts(featNorm = "none", ...)   | Performs no feature normalization. Set this if you only want to perform group normalization (discussed in the next section). |

8.2.1.1 Normalization concentration

All methods (except "none") are influenced by the normalization concentration, which is a property set for each sample analysis. For IS normalization, this should equal the concentration of the IS present in the sample. For the other methods, the normalization concentration reflects the injected sample amount. The normalization concentration is defined in the norm_conc column of the analysis information. For example:

# obtain analysis information as usual, but add normalization concentrations.
# The blanks are set to NA, and will therefore not be normalized.
generateAnalysisInfo(paths = patRoonData::exampleDataPath(),
                     groups = c(rep("solvent", 3), rep("standard", 3)),
                     blanks = "solvent",
                     norm_concs = c(NA, NA, NA, 2, 2, 1))
#>                                                    path       analysis    group   blank norm_conc
#> 1 /usr/local/lib/R/site-library/patRoonData/extdata/pos  solvent-pos-1  solvent solvent        NA
#> 2 /usr/local/lib/R/site-library/patRoonData/extdata/pos  solvent-pos-2  solvent solvent        NA
#> 3 /usr/local/lib/R/site-library/patRoonData/extdata/pos  solvent-pos-3  solvent solvent        NA
#> 4 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-1 standard solvent         2
#> 5 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-2 standard solvent         2
#> 6 /usr/local/lib/R/site-library/patRoonData/extdata/pos standard-pos-3 standard solvent         1

The normalization concentration does not need to be an absolute value. What matters are the relative numbers between the sample analyses. For example, if the concentrations for two analyses are c(1, 2) or c(1.5, 3.0), the normalization is the same. Setting the concentration to NA (or 0) skips normalization for an analysis. If the normalization concentration is absent from the analysis information, it defaults to 1.
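
The point about relative numbers can be illustrated with a few lines of base R. This is only a minimal sketch: it assumes that concentration normalization amounts to dividing an intensity by the normalization concentration, and both the normByConc() helper and the intensity values are hypothetical and not part of patRoon.

```r
# Hypothetical helper (NOT a patRoon function). Assumption: normalization is
# a plain division of the intensity by the normalization concentration.
normByConc <- function(ints, normConcs) ints / normConcs

ints <- c(1000, 4000) # made-up intensities of one feature in two analyses

a <- normByConc(ints, c(1, 2))
b <- normByConc(ints, c(1.5, 3.0))

# The absolute values differ between both sets of concentrations...
a
b

# ...but the ratio between the two analyses is identical
a[2] / a[1]
b[2] / b[1]
```

Under this assumption, only the ratio between the concentrations affects how the analyses compare to each other.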

8.2.1.2 Internal standard normalization

For IS normalization, an internal standard list should be specified with the properties of the internal standards. The format of this list is the same as that of a suspect list. Example lists can be found in the patRoonData package:

patRoonData::ISTDListPos[1:5, ]
#>                  name           formula    rt
#> 1 1H-benzotriazole-D4        C6[2]H4HN3 268.1
#> 2         Atenolol-D7   C14[2]H7H15N2O3 213.5
#> 3         Atrazine-D5     C8[2]H5H9ClN5 336.5
#> 4      Bezafibrate-D6  C19[2]H6H14ClNO4 351.7
#> 5       Climbazole-D4 C15[2]H4H13ClN2O2 359.1

As can be seen from above, labelled isotopes can be specified with square brackets, e.g. [2]H for deuterium.

The next step is to perform the normalization with normInts():

fGroupsNorm <- normInts(fGroups, featNorm = "istd", standards = patRoonData::ISTDListPos, adduct = "[M+H]+",
                        ISTDRTWindow = 20, ISTDMZWindow = 200, minISTDs = 2)

This will do the following:

  • Perform a suspect screening to find the specified IS (standards argument).
  • Remove the IS candidates which are absent in one or more of the analyses to be normalized.
  • Select IS candidates for each feature group, based on close retention time (ISTDRTWindow argument), m/z (ISTDMZWindow argument) and a minimum number (minISTDs). If the number of IS candidates within specified retention time and m/z windows is below minISTDs, the close(st) candidate(s) outside these windows are additionally chosen.
  • Normalization of features is performed with the combined IS intensities.
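
The candidate selection described in the third step can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by patRoon: the selectISTDs() helper and the candidate table are hypothetical, the windows are treated as maximum absolute deviations, and candidates outside the windows are ranked by retention time distance only (how patRoon ranks them is not described here).

```r
# Hypothetical helper (NOT part of patRoon): select IS candidates for one feature.
# Assumptions: windows are +/- tolerances, and candidates outside the windows
# are ranked by their retention time distance to the feature.
selectISTDs <- function(featRT, featMZ, cands, rtWindow, mzWindow, minISTDs) {
    rtDev <- abs(cands$rt - featRT)
    mzDev <- abs(cands$mz - featMZ)
    inWindow <- rtDev <= rtWindow & mzDev <= mzWindow
    selected <- cands$name[inWindow]
    if (length(selected) < minISTDs) {
        # too few candidates: add the closest one(s) outside the windows
        outside <- which(!inWindow)
        outside <- outside[order(rtDev[outside])]
        selected <- c(selected, head(cands$name[outside], minISTDs - length(selected)))
    }
    selected
}

# made-up IS candidates (retention times in seconds)
cands <- data.frame(name = c("IS-A", "IS-B", "IS-C", "IS-D"),
                    rt = c(262, 294, 336, 120),
                    mz = c(300, 275, 221, 150),
                    stringsAsFactors = FALSE)

# only IS-A falls inside both windows, so IS-B is added to reach minISTDs = 2
twoISTDs <- selectISTDs(featRT = 268, featMZ = 120, cands = cands,
                        rtWindow = 20, mzWindow = 200, minISTDs = 2)
# with minISTDs = 1 the candidate inside the windows suffices
oneISTD <- selectISTDs(featRT = 268, featMZ = 120, cands = cands,
                       rtWindow = 20, mzWindow = 200, minISTDs = 1)
```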

To evaluate the assignments for a particular feature group, the internalStandardAssignments() and plotGraph() functions can be used:

fg <- names(fGroupsNorm)[2]
internalStandardAssignments(fGroupsNorm)[[fg]] # IS assignments for 2nd feature group
#> [1] "M221_R336_292" "M284_R323_569" "M213_R340_263"
plotGraph(fGroupsNorm) # interactively explore assignments

8.2.1.3 Other methods

Like IS normalization, the other feature normalization methods are also performed with normInts():

fGroupsNorm <- normInts(fGroups, featNorm = "tic") # TIC normalization
fGroupsNorm <- normInts(fGroups, featNorm = "conc") # Concentration normalization
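
To make the arithmetic behind these two methods more concrete, the following base R sketch applies them to a small made-up intensity matrix. It is an illustration only and does not call patRoon: it assumes that the TIC is the sum of all feature intensities in an analysis and that concentration normalization is a division by the normalization concentration. The way patRoon combines intensities, and how the normalization concentration enters the TIC method, may differ.

```r
# made-up intensities: rows are feature groups, columns are analyses
ints <- matrix(c(100, 300, 600,
                 200, 600, 1200),
               nrow = 3,
               dimnames = list(c("fg1", "fg2", "fg3"), c("ana1", "ana2")))

# TIC-style (assumption: TIC = summed feature intensities per analysis):
# divide each column by its total
tic <- colSums(ints)
intsTIC <- sweep(ints, 2, tic, "/")

# concentration-style (assumption: plain division): divide each column by
# the normalization concentration of that analysis
normConcs <- c(ana1 = 1, ana2 = 2)
intsConc <- sweep(ints, 2, normConcs, "/")

intsTIC
intsConc
```

In this example the second analysis contains twice the sample amount, and both approaches bring the two analyses to the same scale.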

8.2.2 Group normalization

Normalizing feature intensities among group members is easily performed by setting groupNorm=TRUE:

# only perform group normalization
fGroupsNorm <- normInts(fGroups, featNorm = "none", groupNorm = TRUE)
# first perform TIC feature normalization and then group normalization
fGroupsNorm <- normInts(fGroups, featNorm = "tic", groupNorm = TRUE)
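
The difference between the two calls can be illustrated with a base R sketch on made-up data. It does not use patRoon and rests on two assumptions: feature normalization divides by the summed intensity per analysis, and group normalization divides the intensities of a feature group by the largest intensity in that group. The combining function that patRoon actually applies may differ.

```r
# made-up intensities: rows are feature groups, columns are analyses
ints <- matrix(c(100, 300, 600,
                 200, 600, 1200),
               nrow = 3,
               dimnames = list(c("fg1", "fg2", "fg3"), c("ana1", "ana2")))

# Assumption: group normalization divides each row by its maximum
groupNormalize <- function(m) sweep(m, 1, apply(m, 1, max), "/")

# group normalization only: the second analysis dominates every group
intsGroup <- groupNormalize(ints)

# TIC-style feature normalization first (assumption: column sums),
# followed by group normalization
intsTIC <- sweep(ints, 2, colSums(ints), "/")
intsBoth <- groupNormalize(intsTIC)

intsGroup
intsBoth
```

Here the differences between the analyses disappear once feature normalization is applied first, because the second analysis merely contains twice the amount of every feature.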

8.2.3 Using normalized intensities

The normalized intensity (peak height/area) values can easily be obtained with as.data.table():

as.data.table(fGroupsNorm, normalized = TRUE)[1:5]
#>           group      ret       mz standard-pos-1 standard-pos-2 standard-pos-3                                           ISTD_assigned
#>          <char>    <num>    <num>          <num>          <num>          <num>                                                  <char>
#> 1: M109_R192_20 191.8717 109.0759       2.328459      2.1068991      0.9688233                             M280_R212_561,M274_R214_532
#> 2: M111_R330_23 330.4078 111.0439       0.476554      0.4156571      0.2109971               M221_R336_292,M284_R323_569,M213_R340_263
#> 3: M114_R269_25 268.6906 114.0912       1.006808      1.1271519      0.5722654                             M300_R262_608,M275_R294_537
#> 4: M116_R317_29 316.7334 116.0527       3.804086      3.8240928      2.1151499 M284_R323_569,M198_R310_215,M285_R301_570,M221_R336_292
#> 5: M120_R268_30 268.4078 120.0554       3.376374      3.0432604      1.4157580                             M300_R262_608,M275_R294_537
# can be combined with other parameters
as.data.table(fGroupsNorm, normalized = TRUE, average = TRUE, areas = TRUE)[1:5]
#>           group      ret       mz  standard                                           ISTD_assigned
#>          <char>    <num>    <num>     <num>                                                  <char>
#> 1: M109_R192_20 191.8717 109.0759 3.2597655                             M280_R212_561,M274_R214_532
#> 2: M111_R330_23 330.4078 111.0439 0.2753524               M221_R336_292,M284_R323_569,M213_R340_263
#> 3: M114_R269_25 268.6906 114.0912 0.8325869                             M300_R262_608,M275_R294_537
#> 4: M116_R317_29 316.7334 116.0527 2.6500817 M284_R323_569,M198_R310_215,M285_R301_570,M221_R336_292
#> 5: M120_R268_30 268.4078 120.0554 1.8965138                             M300_R262_608,M275_R294_537
# feature values (no need to set normalized=TRUE)
as.data.table(fGroupsNorm, features = TRUE)[1:5, .(group, analysis, intensity_rel, area_rel)]
#>           group       analysis intensity_rel  area_rel
#>          <char>         <char>         <num>     <num>
#> 1: M109_R192_20 standard-pos-1     2.3284588 3.9777827
#> 2: M109_R192_20 standard-pos-2     2.1068991 4.0198440
#> 3: M109_R192_20 standard-pos-3     0.9688233 1.7816697
#> 4: M111_R330_23 standard-pos-1     0.4765540 0.3352008
#> 5: M111_R330_23 standard-pos-2     0.4156571 0.3251259

Several other patRoon functions also accept the normalized argument to use normalized data, such as plotInt(), plotVolcano() and generateComponents() with intensity clustering, which are discussed elsewhere in this document.

8.2.4 Default normalization

If normalized data is requested (normalized=TRUE, see previous section) and normInts() was not called on the feature group data, a default normalization will occur. This is nothing more than a group normalization (normInts(groupNorm=TRUE, ...)), and was mainly implemented to ensure backwards compatibility with previous patRoon versions.