Hierarchical clustering of compounds

Performs hierarchical clustering of structure candidates based on chemical similarity and summarizes each cluster's overall structure by its maximum common substructure (MCS).

makeHCluster(obj, method = "complete", ...)

# S4 method for class 'compounds'
makeHCluster(
  obj,
  method,
  fpType = "extended",
  fpSimMethod = "tanimoto",
  maxTreeHeight = 1,
  deepSplit = TRUE,
  minModuleSize = 1
)

Arguments

obj: The compounds object to be clustered.
method: The clustering method passed to hclust.
...: Further arguments passed to methods.
fpType: The type of structural fingerprint that should be calculated. See the type argument of the get.fingerprint function of rcdk.
fpSimMethod: The method for calculating similarities (note: similarity, not dissimilarity). See the method argument of the fp.sim.matrix function of the fingerprint package.
maxTreeHeight, deepSplit, minModuleSize: Arguments used by cutreeDynamicTree.

Value

makeHCluster returns a compoundsCluster object.

Details

Compound annotation often yields many candidate chemical structures for each feature group, so it can be useful to obtain an overview of their general structural properties. One strategy is to perform hierarchical clustering based on their chemical (dis)similarity, for instance, using the Tanimoto score. The resulting clusters can then be characterized by evaluating their maximum common substructure (MCS).
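
To make the Tanimoto score concrete, the following is a minimal base R sketch of how a similarity matrix could be derived from binary fingerprints and converted to a dissimilarity. The fingerprints, the candidate names and the tanimoto helper are hypothetical and for illustration only; makeHCluster itself obtains fingerprints and similarities through rcdk and the fingerprint package.

```r
# Toy binary fingerprints: rows are hypothetical candidates, columns are bits.
fps <- rbind(
  candA = c(1, 1, 0, 1, 0, 0),
  candB = c(1, 1, 0, 0, 0, 0),
  candC = c(0, 0, 1, 0, 1, 1)
)

# Tanimoto similarity: number of bits set in both fingerprints divided by
# the number of bits set in at least one of them.
# (Hypothetical helper; assumes each fingerprint has at least one bit set.)
tanimoto <- function(a, b) sum(a & b) / sum(a | b)

n <- nrow(fps)
sim <- matrix(0, n, n, dimnames = list(rownames(fps), rownames(fps)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- tanimoto(fps[i, ], fps[j, ])
  }
}

# Hierarchical clustering operates on dissimilarities, so convert.
dissim <- 1 - sim
print(round(sim, 2))
```

Here candA and candB share two of the three bits set in either fingerprint, so they are far more similar to each other than to candC, which shares no bits with them.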

makeHCluster performs hierarchical clustering of all structure candidates for each feature group within a compounds object. The resulting dendrograms are automatically cut using the cutreeDynamicTree function from the dynamicTreeCut package. The returned compoundsCluster object can then be used, for instance, for plotting dendrograms and MCS structures and manually re-cutting specific clusters.
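
The clustering and cutting steps can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: the dissimilarity matrix is made up, and a static cutree cut at a fixed height stands in for the dynamic cut performed by cutreeDynamicTree.

```r
# Hypothetical dissimilarity matrix (1 - similarity) for four candidates.
candNames <- paste0("cand", 1:4)
dissim <- matrix(
  c(0.00, 0.20, 0.90, 0.80,
    0.20, 0.00, 0.85, 0.90,
    0.90, 0.85, 0.00, 0.10,
    0.80, 0.90, 0.10, 0.00),
  nrow = 4, byrow = TRUE,
  dimnames = list(candNames, candNames)
)

# Hierarchical clustering with the linkage method given by 'method'.
hc <- hclust(as.dist(dissim), method = "complete")

# Static cut at a fixed height (assumption: used here only as a simple
# stand-in for the dynamic tree cut applied by makeHCluster).
clusters <- cutree(hc, h = 0.5)
print(clusters)

# plot(hc) would draw the corresponding dendrogram.
```

With these values, cand1 and cand2 end up in one cluster and cand3 and cand4 in another, because the two pairs only merge at a height above the chosen cut.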

Source

The methodology applied here has been largely derived from chemclust.R from the metfRag package and the package vignette of rcdk.

References

Guha R (2007). “Chemical Informatics Functionality in R.” Journal of Statistical Software, 18(6).
