Partitions data into K clusters, using the K-means algorithm.


mdl = kmeansFit(X, clusters);
mdl = kmeansFit(X, clusters, ctl);


X NxP matrix, the training data.
clusters Scalar, the number of clusters, or a KxP matrix containing the initial centroids.
ctl Optional input, kmeansControl structure with the following members:
  ctl.initMethod Scalar specifying the algorithm used to create the initial centroids. Options include:
    0 k-means++ (default).
    1 parallel k-means++.
    2 K randomly selected observations.
  ctl.nStarts Scalar, the number of times to run the K-means algorithm with new starting centroids. Note: this input is ignored if the clusters input is a matrix of starting centroids.
  ctl.seed Scalar, the seed for the random number generator that creates the initial centroids. Note: this input is ignored if the clusters input is a matrix of starting centroids.
  ctl.tolerance Scalar, the convergence tolerance for the K-means algorithm.
  ctl.maxIters Scalar, the maximum number of iterations each of the nStarts runs is allowed before it is stopped.
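
To make the roles of these control members concrete, the following is a minimal sketch in plain Python (not GAUSS, and not the library's implementation). It seeds with k-means++, runs Lloyd-style iterations, and keeps the start with the lowest sum of squares. The function names, the default values, and the stopping rule (largest squared centroid movement at or below the tolerance) are all assumptions made for illustration; the exact convergence criterion used by kmeansFit is not stated here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_points(X, centroids):
    """Index of the nearest centroid for every observation."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
            for x in X]


def kmeanspp_init(X, k, rng):
    """k-means++ seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest chosen centroid."""
    centroids = [list(rng.choice(X))]
    while len(centroids) < k:
        d2 = [min(sq_dist(x, c) for c in centroids) for x in X]
        if sum(d2) == 0:
            # every point coincides with a centroid: fall back to uniform
            centroids.append(list(rng.choice(X)))
        else:
            centroids.append(list(rng.choices(X, weights=d2, k=1)[0]))
    return centroids


def lloyd(X, centroids, tolerance, max_iters):
    """Alternate assignment and centroid update until the centroids move
    no more than `tolerance` (assumed criterion) or `max_iters` is hit."""
    iters = 0
    for iters in range(1, max_iters + 1):
        assign = assign_points(X, centroids)
        new = []
        for j, old in enumerate(centroids):
            members = [x for x, a in zip(X, assign) if a == j]
            # keep the old centroid if a cluster ends up empty
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else old)
        shift = max(sq_dist(c, n) for c, n in zip(centroids, new))
        centroids = new
        if shift <= tolerance:
            break
    assign = assign_points(X, centroids)
    ss = sum(sq_dist(x, centroids[a]) for x, a in zip(X, assign))
    return centroids, assign, ss, iters


def kmeans_sketch(X, k, n_starts=3, seed=0, tolerance=1e-8, max_iters=100):
    """Run `n_starts` seeded starts and return the one with the lowest
    sum of squares as (centroids, assignments, ss, iterations)."""
    rng = random.Random(seed)  # one seeded generator shared by all starts
    best = None
    for _ in range(n_starts):
        result = lloyd(X, kmeanspp_init(X, k, rng), tolerance, max_iters)
        if best is None or result[2] < best[2]:
            best = result
    return best
```

In this sketch, fixing the seed makes the sequence of starting centroids reproducible, and raising the number of starts lowers the chance of returning a poor local minimum at the cost of proportionally more work.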


A kmeansModel structure with the following components:
  mdl.centroids KxP matrix, containing the centroids from the start with the lowest within-cluster sum of squares.
  mdl.assignments Nx1 matrix, containing the centroid assignment for each observation (row) of the input matrix.
  mdl.clusterSS Scalar, the sum of squared differences between each observation and its assigned centroid.
  mdl.elapsedIters Scalar, the number of iterations taken by the start with the lowest clusterSS.
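
The relationship between the centroids, the assignments, and clusterSS can be sketched as follows. This is an illustrative plain-Python example, not GAUSS code; the function name `assign_and_score` is hypothetical, and the labels are 0-based only because Python indexes from zero (the numbering used in mdl.assignments may differ).

```python
def assign_and_score(X, centroids):
    """Assign each observation to its nearest centroid and accumulate the
    sum of squared distances to the assigned centroids."""
    assignments = []
    cluster_ss = 0.0
    for x in X:
        # squared Euclidean distance from x to every centroid
        d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]
        j = min(range(len(centroids)), key=d2.__getitem__)
        assignments.append(j)   # 0-based index of the nearest centroid
        cluster_ss += d2[j]     # contribution of this observation
    return assignments, cluster_ss


# Three observations on a line, two centroids.
labels, ss = assign_and_score([[0, 0], [2, 0], [10, 0]], [[1, 0], [10, 0]])
```

Here the first two observations are each one unit from the first centroid and the third coincides with the second centroid, so the total is 1 + 1 + 0 = 2.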


Parallel k-means++ initialization: B. Bahmani, B. Moseley, A. Vattani, R. Kumar, S. Vassilvitskii. Scalable K-means++. Proceedings of the VLDB Endowment, 2012.

See also

kmeansFit, kmeansControlCreate
