k-means is one of the simplest unsupervised learning algorithms for solving the well-known clustering problem. A given data set is partitioned into a certain number of clusters (assume k clusters) fixed a priori. The idea is to define k centers, one for each cluster. These centers should be placed carefully, because different starting locations lead to different results. A good choice is to place them as far away from each other as possible. The next step is to take each point in the data set and associate it with the nearest center. When no point is pending, the first step is complete and an initial grouping is done.

At this point we need to recalculate k new centroids as the barycenters of the clusters resulting from the previous step. Once we have these k new centroids, a new binding has to be done between the same data points and the nearest new center. This creates a loop. As the loop runs, the k centers change their location step by step until no more changes occur; in other words, the centers no longer move. Finally, this algorithm aims at minimizing an objective function known as the squared error function.
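
The squared error function mentioned above can be written out explicitly. As a sketch, assuming Euclidean distance and writing $C_i$ for the set of points currently assigned to center $v_i$ (notation introduced here for illustration, not taken from the original text):

$$
J(V) = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - v_i \rVert^2
$$

Under the same assumptions, the barycenter used when recalculating the centroids is the mean of the cluster's points:

$$
v_i = \frac{1}{|C_i|} \sum_{x \in C_i} x
$$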

## Steps

__Algorithmic steps for k-means clustering__

Let X = {x_{1}, x_{2}, x_{3}, …, x_{n}} be the set of data points and V = {v_{1}, v_{2}, …, v_{c}} be the set of centers.

1) Randomly select *‘c’* cluster centers.

2) Calculate the distance between each data point and cluster centers.

3) Assign each data point to the cluster center whose distance from it is the minimum over all cluster centers.

4) Recalculate each cluster center as the mean (barycenter) of the data points assigned to it.

5) Recalculate the distance between each data point and the newly obtained cluster centers.

6) If no data point was reassigned then stop, otherwise repeat from step 3).
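
The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `kmeans` and `squared_distance`, the `max_iter` and `seed` parameters, the use of squared Euclidean distance, the choice of initial centers from among the data points, and the decision to keep the old center when a cluster ends up empty are all assumptions not stated in the text.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, c, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into `c` groups.

    Returns (centers, labels), where labels[j] is the index of the
    center assigned to points[j].
    """
    rng = random.Random(seed)
    # Step 1: randomly pick c distinct data points as initial centers
    # (an assumption; the text only says "randomly select").
    centers = [tuple(p) for p in rng.sample(points, c)]
    labels = [None] * len(points)

    for _ in range(max_iter):
        # Steps 2, 3 and 5: compute distances and assign each point
        # to its nearest center.
        new_labels = [
            min(range(c), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        # Step 6: stop when no point was reassigned.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each center to the mean of its assigned points.
        for i in range(c):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centers, labels


# Example: two visibly separated groups of 2-D points.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
        (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(data, c=2)
```

Because the initial centers are chosen at random, which group receives label 0 depends on the seed; on harder data, different seeds can also lead to different final clusterings, which is why the placement of the initial centers matters.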
