What is the clustering method k-means used for?
k-means groups a set of unlabeled inputs into k clusters. The number of clusters k is chosen in advance; the goal is then to partition the inputs into sets S1, …, Sk in a
way that minimizes the total sum of squared distances from each point to the mean of its assigned cluster.
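The quantity being minimized can be computed directly for any candidate partition. The sketch below is only an illustration: the function names and the toy points are assumptions, not part of the notes. It evaluates the total squared distance to the cluster means for a sensible partition and for a poor one.

```python
def sq_dist(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cluster_mean(points):
    # Coordinate-wise mean of a non-empty list of points.
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def total_squared_error(clusters):
    # Sum over clusters of squared distances from each point to its cluster mean.
    total = 0.0
    for cluster in clusters:
        mu = cluster_mean(cluster)
        total += sum(sq_dist(p, mu) for p in cluster)
    return total


# Hypothetical toy data: two well-separated groups of four points each.
good_partition = [
    [(0, 0), (0, 1), (1, 0), (1, 1)],
    [(10, 10), (10, 11), (11, 10), (11, 11)],
]
# The same points, split so that each cluster mixes both groups.
bad_partition = [
    [(0, 0), (0, 1), (10, 10), (10, 11)],
    [(1, 0), (1, 1), (11, 10), (11, 11)],
]

good_error = total_squared_error(good_partition)
bad_error = total_squared_error(bad_partition)
print(good_error, bad_error)
```

The partition that keeps nearby points together gives a much smaller total, which is the kind of partition k-means is looking for.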
How to choose k?
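Plot the sum of squared errors as a function of k and look for the point where the graph “bends” (the “elbow”): beyond it, adding more clusters reduces the error only slightly.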
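A minimal sketch of this procedure, under stated assumptions: it uses a plain iterative k-means (assign each point to the nearest center, then recompute the means) with a deterministic farthest-first initialization chosen here only to make the run reproducible. The helper names and the toy data are hypothetical, and the errors are printed rather than plotted.

```python
def sq_dist(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    # Assumed initialization (not from the notes): start at the first point,
    # then repeatedly add the point farthest from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_sse(points, k, max_iterations=100):
    # Run k-means and return the final sum of squared errors.
    centers = farthest_first_centers(points, k)
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return sum(sq_dist(p, centers[i]) for i, members in enumerate(clusters) for p in members)


# Hypothetical toy data: three well-separated groups of four points each.
offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_origins = [(0, 0), (10, 0), (5, 9)]
points = [(ox + dx, oy + dy) for ox, oy in blob_origins for dx, dy in offsets]

sse_by_k = {k: kmeans_sse(points, k) for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

On this toy data the error drops steeply up to k = 3 and changes little afterwards, so the bend suggests k = 3. On real data the bend is often less pronounced, and reading it off remains a judgment call.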