Soft Clustering Flashcards

(6 cards)

1
Q

What is soft clustering, and what are its characteristics?

A

Data points can belong to multiple clusters simultaneously, with a probability for each cluster representing the point's degree of membership

Useful for overlapping or ambiguous data
Provides more nuanced results
More computationally intensive and harder to interpret than hard clustering

2
Q

What is soft k-means?

A

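A variant of k-means in which the distance between each point and each cluster's centroid is weighted by U

U is a continuous value between 0 and 1 giving the probability that the point belongs to that cluster; for each point, the U values sum to 1 across clusters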

3
Q

How is U calculated to determine the probability that a point belongs to a cluster?

A

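U is typically computed as a softmax over the negative distances from the point to each centroid, scaled by β
β acts as an inverse temperature
Small β (high temperature): assignments are very soft, every cluster gets some weight
Large β (low temperature): approaches hard k-means, assignments become nearly 0/1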
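
The effect of β can be seen in a small sketch. It assumes the common softmax form, with U proportional to exp(-β × squared distance). The card does not state the exact formula, so the function name `soft_assignments` and the choice of squared Euclidean distance are illustrative assumptions.

```python
import math

def soft_assignments(point, centroids, beta):
    """Return one membership weight (U) per centroid for a single point.

    Assumption: U is a softmax over -beta * squared Euclidean distance.
    """
    # Squared Euclidean distance from the point to each centroid.
    sq_dists = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    # Shift by the smallest distance before exponentiating for numerical
    # stability; the normalized weights are unchanged by this shift.
    d_min = min(sq_dists)
    scores = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(scores)
    return [s / total for s in scores]

point = (1.0, 0.0)
centroids = [(0.0, 0.0), (3.0, 0.0)]

soft = soft_assignments(point, centroids, beta=0.1)   # high temperature
hard = soft_assignments(point, centroids, beta=10.0)  # low temperature
```

With the small β, both clusters keep a substantial share of the weight. With the large β, nearly all of the weight goes to the nearer centroid, as in hard k-means.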

4
Q

How is the new centroid calculated in soft k-means?

A

Sum all the points multiplied by their weights (U) for that cluster and divide by the total weight, i.e. take the weighted mean of the points
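
A minimal sketch of this update, treating the new centroid as the weighted mean of the points, with each point's U value for the cluster as its weight. The function name `update_centroid` and the sample values are illustrative assumptions.

```python
def update_centroid(points, weights):
    """Weighted mean of the points, using each point's U as its weight."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    )

# Two points strongly assigned to this cluster and one weakly assigned.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
weights = [0.9, 0.9, 0.2]

centroid = update_centroid(points, weights)
```

The weakly assigned point at x = 10 still pulls the centroid toward it, but far less than it would in an unweighted mean.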

5
Q

What is the Gaussian mixture model?

A

A probabilistic model that assumes the data is generated from a mixture of Gaussian distributions
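
As a rough illustration, the sketch below evaluates a one-dimensional mixture of two Gaussians and the posterior probability that a value came from each component. The mixing weights, means, and standard deviations are made-up values, and the function names are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of the mixture: weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, means, stds))

def responsibilities(x, weights, means, stds):
    """Posterior probability that x was generated by each component."""
    parts = [w * gaussian_pdf(x, m, s)
             for w, m, s in zip(weights, means, stds)]
    total = sum(parts)
    return [p / total for p in parts]

# Hypothetical two-component mixture with equal mixing weights.
weights = [0.5, 0.5]
means = [0.0, 4.0]
stds = [1.0, 1.0]

midway = responsibilities(2.0, weights, means, stds)
near_first = responsibilities(0.0, weights, means, stds)
```

A value midway between the two means is shared equally between the components, while a value at the first mean is attributed almost entirely to the first component. These posterior probabilities play the same role as the soft assignments in soft k-means.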

6
Q
A