Add kmeans clustering results to overall data — addKmeansClusterVariables • cypro

Adds the clustering results of computeKmeansCluster() to the object's overall data in the form of grouping variables, making them available for the `across` argument.

addKmeansClusterVariables(
  object,
  variable_set,
  k,
  phase = NULL,
  method_kmeans = NULL,
  verbose = NULL
)

Arguments

object	A valid cypro object.
variable_set	Character value. Denotes the variable set of interest. Use `getVariableSetNames()` to obtain all names of currently stored variable sets in your object.
k	Numeric vector. All k-values of interest.
phase	Character or numeric. If character, the ordinal value referring to the phase of interest (e.g. 'first', 'second' etc.) or 'all'. If numeric, the number referring to the phase. If set to NULL, the phase set as default with `adjustDefault()` is used. Ignored if the experiment design contains only one phase.
method_kmeans	Character vector (or value, see Details for more). Denotes the algorithms of interest. Defaults to 'Hartigan-Wong'. Use `validKmeansMethods()` to obtain all valid input options.
verbose	Logical. If set to TRUE, informative messages regarding the computational progress are printed. (Warning messages are always printed.)

Value

An updated cypro object that contains the data added.

Details

The last step of the kmeans clustering pipeline. This function iterates over all combinations of method_kmeans and k and adds the respective clustering variables to the object's overall data. Each variable is named according to the following syntax, where the parts in angle brackets are replaced by the respective input values: kmeans_<method_kmeans>_k_<k>_(<variable_set>). This naming concept results in somewhat bulky but unambiguous clustering names. You can always rename grouping variables with renameClusterDf().

Use getGroupingVariableNames() afterwards to obtain all grouping variables.
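
The iteration and naming scheme described in the Details can be sketched in base R. This is only an illustration and not cypro's internal implementation: it uses `stats::kmeans()` directly, the `iris` measurements as a stand-in for a variable set, and a hypothetical variable set name `example_set`. The exact name format is inferred from the syntax given above and may differ in detail from what cypro produces.

```r
# Illustrative sketch in base R; NOT cypro's internal implementation.
set.seed(1)

# Hypothetical stand-in for the numeric data of one variable set.
data_mtr <- as.matrix(iris[, 1:4])

variable_set <- "example_set"           # hypothetical variable set name
k_values <- c(2, 3)                     # all k-values of interest
methods <- c("Hartigan-Wong", "Lloyd")  # algorithms accepted by stats::kmeans()

# Every combination of algorithm and k.
combos <- expand.grid(method = methods, k = k_values, stringsAsFactors = FALSE)

# One row per observation; clustering variables are appended as columns.
cluster_df <- data.frame(row_id = seq_len(nrow(data_mtr)))

for (i in seq_len(nrow(combos))) {
  method <- combos$method[i]
  k <- combos$k[i]

  res <- stats::kmeans(
    x = data_mtr,
    centers = k,
    algorithm = method,
    iter.max = 50,
    nstart = 5
  )

  # Assumed name format: kmeans_<method_kmeans>_k_<k>_(<variable_set>)
  var_name <- paste0("kmeans_", method, "_k_", k, "_(", variable_set, ")")

  # Store cluster membership as a grouping (factor) variable.
  cluster_df[[var_name]] <- factor(res$cluster)
}

names(cluster_df)
```

Two algorithms and two k-values yield four clustering variables, one per combination, which is why the names carry both the algorithm and k.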