Add hierarchical clustering results to overall data — addHierarchicalClusterVariables • cypro

Adds hierarchical clustering results in the form of grouping variables to the object's overall data, making them available for the across-argument.

addHierarchicalClusterVariables(
  object,
  variable_set,
  phase = NULL,
  method_dist = NULL,
  method_aggl = NULL,
  k = NULL,
  h = NULL,
  verbose = NULL
)

Arguments

object	A valid cypro object.
variable_set	Character value. Denotes the variable set of interest. Use `getVariableSetNames()` to obtain all names of currently stored variable sets in your object.
phase	Character or numeric. If character, the ordinal value referring to the phase of interest (e.g. 'first', 'second' etc.) or 'all'. If numeric, the number referring to the phase. If set to NULL, takes the phase denoted as default with `adjustDefault()`. Ignored if the experiment design contains only one phase.
method_dist	Character vector (or value; see details for more). Denotes the distance method(s) of interest (e.g. 'euclidean' or 'manhattan'). Use `validDistanceMethods()` to obtain all valid input options.
method_aggl	Character vector (or value; see details for more). Denotes the agglomeration method(s) of interest according to which the existing distance matrices are agglomerated to hierarchical trees. Use `validAgglomerationMethods()` to obtain all valid input options.
k	Numeric vector. Denotes the exact number of clusters into which the tree created according to the distance- and agglomeration method is supposed to be cut.
h	Numeric vector. Denotes the heights at which the hierarchical tree created according to the distance- and agglomeration method is supposed to be cut.
verbose	Logical. If set to TRUE informative messages regarding the computational progress will be printed. (Warning messages will always be printed.)
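
The difference between cutting by `k` and cutting by `h` can be sketched with base R alone. This is only an illustration of the underlying idea, not cypro's implementation: it assumes the built-in `USArrests` data as a stand-in for a variable set and uses `stats::dist()`, `stats::hclust()` and `stats::cutree()` directly.

```r
# Base R only; USArrests ships with R and stands in for a variable set.
mtx <- scale(USArrests)

# One distance method plus one agglomeration method yields one tree.
tree <- stats::hclust(stats::dist(mtx, method = "euclidean"), method = "complete")

# k: request an exact number of clusters.
by_k <- stats::cutree(tree, k = 4)

# h: cut at a given height; the number of clusters follows from the tree.
by_h <- stats::cutree(tree, h = 3)

length(unique(by_k))  # number of groups equals the requested k
table(by_h)           # group count depends on the tree's merge heights
```

With `k` the number of groups is fixed in advance, whereas with `h` it depends on where the chosen height intersects the tree.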

Value

An updated cypro object that contains the data added.

Details

This function is the last step of the hierarchical clustering pipeline. It iterates over all combinations of method_dist, method_aggl, k and h and adds the respective clustering variables to the object's overall data, named according to the following syntax: hcl_method_dist_method_aggl_k/h_k/h_(variable_set). This naming concept results in somewhat bulky but unambiguous clustering names. You can always rename grouping variables with renameClusterDf().
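
The iteration over all combinations can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: `USArrests` stands in for the data, `"arrests"` is a hypothetical variable set name, and the column names follow one reading of the syntax above, namely `hcl_<method_dist>_<method_aggl>_<k or h>_<value>_(<variable_set>)`.

```r
# Base R sketch; not cypro's implementation.
mtx <- scale(USArrests)
variable_set <- "arrests"  # hypothetical variable set name

method_dist <- c("euclidean", "manhattan")
method_aggl <- c("complete", "average")
k <- c(2, 4)
h <- 3

# Empty data.frame with one row per observation; one column is added per combination.
cluster_df <- data.frame(row.names = rownames(mtx))

for (md in method_dist) {
  d <- stats::dist(mtx, method = md)
  for (ma in method_aggl) {
    tree <- stats::hclust(d, method = ma)
    for (kv in k) {
      # Assumed naming: cut by number of clusters.
      nm <- paste0("hcl_", md, "_", ma, "_k_", kv, "_(", variable_set, ")")
      cluster_df[[nm]] <- factor(stats::cutree(tree, k = kv))
    }
    for (hv in h) {
      # Assumed naming: cut by height.
      nm <- paste0("hcl_", md, "_", ma, "_h_", hv, "_(", variable_set, ")")
      cluster_df[[nm]] <- factor(stats::cutree(tree, h = hv))
    }
  }
}

names(cluster_df)
```

Two distance methods, two agglomeration methods and three cut values (two for `k`, one for `h`) give twelve grouping variables here, which illustrates why the names become bulky but stay unambiguous.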

Use getGroupingVariableNames() afterwards to obtain all grouping variables.