Clustering Flashcards

(15 cards)

1
Q

What type of learning is K-means?

A

Unsupervised learning: it finds structure in data that has no labels.

2
Q

What is the goal of K-means clustering?

A

To group data points into k clusters so that points in the same cluster are more similar to each other than to points in other clusters.

3
Q

What does k represent in K-means?

A

The number of clusters you want the algorithm to find.

4
Q

How does K-means start the clustering process?

A

It randomly selects k data points as initial cluster centres.
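
A minimal sketch of this starting step, using only the standard library. The function name `init_centroids`, the `seed` parameter, and the sample points are assumptions made for illustration.

```python
import random

def init_centroids(points, k, seed=0):
    """Pick k distinct data points to act as the starting cluster centres."""
    rng = random.Random(seed)     # fixed seed so the draw is repeatable
    return rng.sample(points, k)  # sampling without replacement

# Hypothetical 2-D data points.
points = [(1.0, 2.0), (1.5, 1.8), (8.0, 8.0), (9.0, 9.5), (5.0, 5.0)]
centroids = init_centroids(points, k=2)
print(centroids)
```

A different seed can give different starting centres, and so possibly a different final clustering.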

5
Q

What is a centroid?

A

The centre of a cluster, calculated as the mean of the data points assigned to it.
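
As a worked example, the centroid can be computed one coordinate at a time. The helper name `centroid` and the three sample points are illustrative assumptions.

```python
def centroid(cluster):
    """Mean of the points in a cluster, taken one coordinate at a time."""
    n = len(cluster)
    dims = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / n for d in range(dims))

cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
print(centroid(cluster))  # (3.0, 4.0)
```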

6
Q

How does K-means assign points to clusters?

A

It assigns each point to the cluster with the nearest centroid, typically measured by Euclidean distance.
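
A short sketch of the assignment step, assuming Euclidean distance (the card does not name a distance measure). The function name `nearest_centroid` and the sample values are made up for illustration.

```python
import math

def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

centroids = [(1.0, 1.0), (8.0, 8.0)]
points = [(1.5, 2.0), (7.0, 9.0), (2.0, 0.5)]
labels = [nearest_centroid(p, centroids) for p in points]
print(labels)  # [0, 1, 0]
```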

7
Q

When does the K-means algorithm stop?

A

When the centroids no longer change between iterations, or when a maximum number of iterations is reached.
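
The stopping rule can be sketched as a loop that alternates the assignment and update steps. This is a simplified illustration rather than a production implementation: the starting centroids are passed in by hand, and the `max_iter` cap and the handling of empty clusters are assumptions.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    for _ in range(max_iter):
        # Assignment step: put each point with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid (an assumption of this sketch).
        updated = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centroids)
        ]
        if updated == centroids:  # nothing moved, so the algorithm has converged
            break
        centroids = updated
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
final_centroids, final_clusters = kmeans(points, centroids=[(1.0, 1.0), (2.0, 1.0)])
print(final_clusters)
```

With these starting centres, the point (2.0, 1.0) begins in the second cluster and moves to the first once the centroids have been recalculated.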

8
Q

Why do points move between clusters?

A

Because the centroids are recalculated after each assignment step, so the distances change and some points end up closer to a different centroid.

9
Q

Can K-means predict cluster membership for new data?

A

Yes, by checking which centroid the new point is closest to.

10
Q

Why should K-means predictions not be fully relied on?

A

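Because the clusters are fitted to the data seen so far. When new data is added and the clusters are refitted, the cluster means shift, so the same point can be assigned to a different cluster.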
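
A small illustration of this effect, using made-up one-line data. The helper names `mean` and `predict` are assumptions, and only a single centroid update is shown rather than a full refit.

```python
import math

def mean(cluster):
    """Centroid of a cluster: the per-coordinate mean of its points."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def predict(point, centroids):
    """Index of the nearest centroid."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

cluster_a = [(0.0, 0.0), (2.0, 0.0)]   # centroid (1, 0)
cluster_b = [(8.0, 0.0), (10.0, 0.0)]  # centroid (9, 0)
query = (4.8, 0.0)

before = predict(query, [mean(cluster_a), mean(cluster_b)])

# A new observation arrives. It is nearest to cluster B, so B's mean shifts to (8, 0).
cluster_b.append((6.0, 0.0))
after = predict(query, [mean(cluster_a), mean(cluster_b)])

print(before, after)  # 0 1
```

The query point has not changed, but its predicted cluster has, because the centroid of cluster B moved towards it.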

11
Q

What is the elbow method?

A

A way to choose k: plot WSS against k and pick the point where the curve starts to flatten.
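
A rough sketch of the elbow method on hypothetical data with three obvious groups. To keep the run repeatable, this sketch seeds the centroids with the first k points instead of a random draw; that choice, the function name `kmeans_wss`, and the data are all assumptions.

```python
import math

def kmeans_wss(points, k, max_iter=100):
    """Run a basic k-means and return the final within-cluster sum of squares."""
    centroids = list(points[:k])  # assumption: first k points as starting centres
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        updated = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    # Sum the squared distance of every point to its own cluster centroid.
    return sum(math.dist(p, centroids[j]) ** 2
               for j, cl in enumerate(clusters) for p in cl)

# Hypothetical data: three groups near (1, 1), (5, 5) and (9, 1).
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
          (1.2, 0.8), (5.1, 4.9), (9.2, 1.1),
          (0.8, 1.2), (4.9, 5.2), (8.9, 0.9)]

wss_by_k = {k: kmeans_wss(points, k) for k in range(1, 5)}
for k, value in wss_by_k.items():
    print(k, round(value, 3))
```

On this data WSS falls sharply up to k = 3 and only slightly afterwards, so the curve bends at k = 3.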

12
Q

What is WSS?

A

Within-cluster sum of squares: the total squared distance from each point to its own cluster centroid. Lower values mean tighter clusters.
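
As a worked example, WSS can be computed directly from a set of clusters. The function name `wss` and the two groupings of the same four points are illustrative assumptions.

```python
def wss(clusters):
    """Sum, over all clusters, of squared distances from each point to its cluster mean."""
    total = 0.0
    for cluster in clusters:
        mean = [sum(c) / len(cluster) for c in zip(*cluster)]
        total += sum((x - m) ** 2 for p in cluster for x, m in zip(p, mean))
    return total

# The same four points grouped two different ways.
tight = [[(1.0, 1.0), (1.0, 3.0)], [(8.0, 8.0), (10.0, 8.0)]]
loose = [[(1.0, 1.0), (8.0, 8.0)], [(1.0, 3.0), (10.0, 8.0)]]
print(wss(tight))  # 4.0
print(wss(loose))  # 102.0
```

The grouping that keeps nearby points together has a much lower WSS.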

13
Q

Give one workplace use for clustering in FP&A.

A

Grouping customers or spend patterns to find behaviour trends.

14
Q

Why is clustering useful with multidimensional data?

A

Because it groups points across many different measures at once.

15
Q

Why did you use Python instead of R?

A

IT could not approve R, whereas Python was already available and works well for clustering.
