PSPP: QUICK CLUSTER

15.13 QUICK CLUSTER

QUICK CLUSTER var_list
      [/CRITERIA=CLUSTERS(k) [MXITER(max_iter)]]
      [/MISSING={EXCLUDE,INCLUDE} {LISTWISE,PAIRWISE}]

The QUICK CLUSTER command performs k-means clustering on the dataset. This is useful when you wish to allocate cases into clusters of similar values and you already know the number of clusters.

The minimum specification is ‘QUICK CLUSTER’ followed by the names of the variables that contain the cluster data. Normally you will also want to specify /CRITERIA=CLUSTERS(k), where k is the number of clusters. If CLUSTERS is omitted, k defaults to 2.

The command uses an iterative algorithm to assign each case to a cluster. It continues iterating until the solution converges or until max_iter iterations have been performed, whichever comes first. The default value of max_iter is 2.
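
As an illustration of the minimum specification and the CRITERIA subcommand described above, the following syntax is a minimal sketch. The variable names x and y and the data values are made up for the example. The value 20 passed to MXITER is an arbitrary choice, not a recommendation.

```
* Hypothetical data with three loose groups of cases.
DATA LIST LIST NOTABLE /x y.
BEGIN DATA.
1 2
2 1
10 11
11 10
20 21
21 20
END DATA.

* Minimum specification, so k takes its default value.
QUICK CLUSTER x y.

* Request three clusters and allow up to 20 iterations.
QUICK CLUSTER x y
  /CRITERIA=CLUSTERS(3) MXITER(20).
```

The second QUICK CLUSTER command only raises the iteration limit. If the algorithm converges earlier, it stops before reaching that limit.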

The MISSING subcommand determines the handling of missing values. If INCLUDE is set, user-missing values are taken at face value and not treated as missing. If EXCLUDE is set, which is the default, user-missing values are excluded along with system-missing values.

If LISTWISE is set, the entire case is excluded from the analysis whenever any of the clustering variables contains a missing value. If PAIRWISE is set, a case is excluded only if all of the clustering variables contain missing values; otherwise it is clustered on the basis of its non-missing values. The default is LISTWISE.
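
The sketch below shows how the MISSING settings might be combined, following the syntax diagram at the top of this section. The variables x and y, the data values, and the choice of -99 as a user-missing code are all assumptions made for the example. A lone period in the data stands for a system-missing value.

```
* Hypothetical data with one user-missing and one system-missing value.
DATA LIST LIST NOTABLE /x y.
BEGIN DATA.
1 2
2 1
-99 2
9 8
8 9
9 .
END DATA.

* Declare -99 as a user-missing value for x.
MISSING VALUES x (-99).

* The default behavior, written out explicitly.
QUICK CLUSTER x y
  /CRITERIA=CLUSTERS(2)
  /MISSING=EXCLUDE LISTWISE.

* Keep cases that have at least one non-missing clustering variable.
QUICK CLUSTER x y
  /CRITERIA=CLUSTERS(2)
  /MISSING=EXCLUDE PAIRWISE.

* Treat -99 as an ordinary value instead of a missing one.
QUICK CLUSTER x y
  /CRITERIA=CLUSTERS(2)
  /MISSING=INCLUDE LISTWISE.
```

Under the rules described above, the first command drops the third and sixth cases. The second clusters them on their one remaining value. The third keeps the third case with x taken as -99 and still drops the sixth case, whose y is system-missing.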