R/generics.R, R/compounds-cluster.R
compounds-cluster.Rd
Perform hierarchical clustering of structure candidates based on chemical similarity and obtain overall structural information based on the maximum common substructure (MCS).
makeHCluster(obj, method = "complete", ...)
# S4 method for class 'compounds'
makeHCluster(
  obj,
  method,
  fpType = "extended",
  fpSimMethod = "tanimoto",
  maxTreeHeight = 1,
  deepSplit = TRUE,
  minModuleSize = 1
)
obj: The compounds object to be clustered.
method: The clustering method passed to hclust.
...: Further arguments specified to methods.
fpType: The type of structural fingerprint that should be calculated. See the type argument of the get.fingerprint function of rcdk.
fpSimMethod: The method for calculating similarities (i.e. not dissimilarity!). See the method argument of the fp.sim.matrix function of the fingerprint package.
maxTreeHeight, deepSplit, minModuleSize: Arguments used by cutreeDynamicTree.
makeHCluster returns a compoundsCluster object.
Often many possible chemical structure candidates are found for each feature group when performing compound annotation. Therefore, it may be useful to obtain an overview of their general structural properties. One strategy is to perform hierarchical clustering based on their chemical (dis)similarity, for instance, using the Tanimoto score. The resulting clusters can then be characterized by evaluating their maximum common substructure (MCS).
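The general strategy described above can be sketched in base R. This is only an illustration and not the package's implementation: the toy bit vectors, the tanimoto() helper and the cut height of 0.5 are assumptions made for this example.

```r
# Toy binary fingerprints (hypothetical data): rows are candidates,
# columns are fingerprint bits.
fps <- rbind(
  A = c(1, 1, 1, 0, 0, 0),
  B = c(1, 1, 0, 0, 0, 0),
  C = c(0, 0, 0, 1, 1, 1),
  D = c(0, 0, 0, 1, 1, 0)
)

# Tanimoto similarity: number of shared on-bits divided by the number of
# bits set in either fingerprint (illustrative helper, not a package function).
tanimoto <- function(x, y) sum(x & y) / sum(x | y)

# Pairwise similarity matrix.
n <- nrow(fps)
sim <- matrix(1, n, n, dimnames = list(rownames(fps), rownames(fps)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- tanimoto(fps[i, ], fps[j, ])
  }
}

# hclust() works on dissimilarities, so convert similarity to distance.
hc <- hclust(as.dist(1 - sim), method = "complete")

# Cut the dendrogram at a fixed height (arbitrary choice for this sketch).
clusters <- cutree(hc, h = 0.5)
print(clusters)
```

Here A and B share two of three set bits (similarity 2/3), as do C and D, while the two pairs share no bits, so cutting at 0.5 separates them into two clusters.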
makeHCluster performs hierarchical clustering of all structure candidates for each feature group within a compounds object. The resulting dendrograms are automatically cut using the cutreeDynamicTree function from the dynamicTreeCut package. The returned compoundsCluster object can then be used, for instance, for plotting dendrograms and MCS structures and for manually re-cutting specific clusters.
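As a rough sketch of how the arguments documented here might map onto the underlying packages, the steps for a single set of candidates could look as follows. This is not the actual makeHCluster code: the SMILES strings are hypothetical, the conversion of similarity to distance as 1 - similarity is an assumption, and the block is skipped when rcdk (which needs Java), fingerprint or dynamicTreeCut is not installed.

```r
pkgs <- c("rcdk", "fingerprint", "dynamicTreeCut")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  # Hypothetical candidate structures, chosen only for illustration.
  smiles <- c("CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1", "CCc1ccccc1")
  mols <- rcdk::parse.smiles(smiles)

  # fpType corresponds to the 'type' argument of get.fingerprint().
  fps <- lapply(mols, rcdk::get.fingerprint, type = "extended")

  # fpSimMethod corresponds to the 'method' argument of fp.sim.matrix().
  sim <- fingerprint::fp.sim.matrix(fps, method = "tanimoto")

  # Assumption: distance = 1 - similarity; 'method' is passed to hclust().
  hc <- hclust(as.dist(1 - sim), method = "complete")

  # Cut the dendrogram automatically with the remaining three arguments.
  dynClusters <- dynamicTreeCut::cutreeDynamicTree(
    hc,
    maxTreeHeight = 1,
    deepSplit = TRUE,
    minModuleSize = 1
  )
  print(table(dynClusters))
}
```

The result holds one cluster label per candidate. Within the package, the equivalent work is presumably repeated for every feature group and the outcome stored in the compoundsCluster object.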
The methodology applied here has been largely derived from chemclust.R from the metfRag package and the package vignette of rcdk.
rcdk1
compoundsCluster