The Tecnoprom Core

Advanced and not-so-advanced technology


K Means Clustering Algorithm

Introduction

K Means is an unsupervised learning algorithm. It partitions a cloud of points into K groups (clusters), where K is chosen beforehand.

Algorithm

K Means divides the space into regions the same way a Voronoi diagram does: each point belongs to the region of its nearest mean. The algorithm is simple:

Initialise the K means to random points in the sample
Repeat until convergence:
    For each point in the data set:
        Assign the point to its nearest K mean
    For each K mean:
        Move the K mean to the average location of its assigned points
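
The steps above can be sketched in plain Python, without any visualisation. This is only a minimal illustration, not the k_means.py program: the function name k_means, the max_iterations and seed parameters, and the choice to treat "convergence" as "no point changed its assigned mean" are all assumptions made here. A mean that ends up with no assigned points is simply left where it is.

```python
import random


def k_means(points, k, max_iterations=100, seed=None):
    """Cluster a list of (x, y) tuples; returns (means, assignments).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Initialise the K means to K distinct random points in the sample.
    means = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assign each point to its nearest mean (squared Euclidean distance).
        new_assignments = [
            min(
                range(k),
                key=lambda j: (p[0] - means[j][0]) ** 2 + (p[1] - means[j][1]) ** 2,
            )
            for p in points
        ]
        # Assumed convergence test: stop when no assignment has changed.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Move each mean to the average location of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # a mean with no points keeps its old position
                means[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return means, assignments


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
    means, assignments = k_means(sample, k=2, seed=0)
    print(means)
    print(assignments)
```

The result depends on the random initialisation, so on less well separated data different seeds can give different clusterings.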

Example code

The following Python program implements K Means clustering in a 2D environment. It uses PyGame for visualisation. First it creates several random clouds of points and then it tries to fit several K means to them. Note that not every cloud of points lends itself to a K-means classification, and not every K-means classification is actually useful.
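
As a rough idea of the first step, random clouds of points could be generated along these lines. This sketch is independent of the program listed below and leaves out the PyGame drawing. The function name make_clouds, its parameters, the 640x480 area and the use of a Gaussian spread around each cloud centre are assumptions chosen for illustration.

```python
import random


def make_clouds(n_clouds, points_per_cloud, width=640, height=480,
                spread=30.0, seed=None):
    """Return a flat list of (x, y) points grouped around random centres.

    Hypothetical helper: the sizes and the Gaussian spread are assumptions.
    """
    rng = random.Random(seed)
    points = []
    for _cloud in range(n_clouds):
        # Pick a random centre for this cloud inside the drawing area.
        cx = rng.uniform(0, width)
        cy = rng.uniform(0, height)
        for _point in range(points_per_cloud):
            # Scatter points around the centre with a normal distribution.
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


if __name__ == "__main__":
    cloud_points = make_clouds(n_clouds=3, points_per_cloud=50, seed=1)
    print(len(cloud_points))
```

Because the centres are random, two clouds can overlap, which is one way to end up with data that K Means cannot separate cleanly.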

k_means.py