Supervised learning uses labeled data; unsupervised learning uses unlabeled data.
Helps explore and understand the underlying structure of a dataset,
which makes it easier to prepare the data for further analysis and model building.
Feature scaling ensures that all features contribute equally, which improves the performance and reliability of the clustering algorithm and prevents bias toward features with larger scales.
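A minimal sketch of why scale matters for distance-based clustering, assuming z-score standardization and Euclidean distance. The toy data and the `standardize` helper are hypothetical and use only the Python standard library. Before scaling, the income column dominates every distance; after scaling, a large age gap and a large income gap count about the same.

```python
import math
import statistics

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the column std."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical toy data: (age in years, income in dollars)
people = [
    [25, 40_000],
    [60, 41_000],
    [26, 90_000],
    [58, 88_000],
]

# Raw distances: the income column swamps the age column.
raw_d_age = math.dist(people[0], people[1])     # pair differing mostly in age
raw_d_income = math.dist(people[0], people[2])  # pair differing mostly in income

# Standardized distances: both kinds of difference are on a comparable footing.
scaled = standardize(people)
scaled_d_age = math.dist(scaled[0], scaled[1])
scaled_d_income = math.dist(scaled[0], scaled[2])

print(f"raw:    age gap {raw_d_age:.1f}  vs  income gap {raw_d_income:.1f}")
print(f"scaled: age gap {scaled_d_age:.2f}  vs  income gap {scaled_d_income:.2f}")
```

Libraries such as scikit-learn offer the same transformation as `StandardScaler`; the hand-written version here is only meant to show the arithmetic.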
No
Pitfalls include noise, outliers, and choosing an incorrect number of clusters.
Use an elbow plot and the silhouette score to choose the number of clusters.
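A rough sketch of both checks, assuming a plain k-means (Lloyd's algorithm) written with only the standard library. The synthetic three-blob data, the farthest-first initialization, and all helper names are illustrative assumptions, not a reference implementation. The elbow is the k after which inertia stops dropping sharply; the silhouette score is highest when clusters are tight and well separated.

```python
import math
import random

def farthest_first_init(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j])) for p in points]

def kmeans(points, k, n_iter=100):
    centers = farthest_first_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def inertia(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b), where a is the mean distance to the point's
    own cluster and b is the mean distance to the nearest other cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic data: three well-separated 2-D blobs of 30 points each.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(30)
]

inertias, silhouettes = {}, {}
for k in range(1, 7):
    labels, centers = kmeans(points, k)
    inertias[k] = inertia(points, labels, centers)
    if k >= 2:  # the silhouette score needs at least two clusters
        silhouettes[k] = silhouette(points, labels)

best_k = max(silhouettes, key=silhouettes.get)
for k in inertias:
    sil = f"{silhouettes[k]:.3f}" if k in silhouettes else "n/a"
    print(f"k={k}  inertia={inertias[k]:9.1f}  silhouette={sil}")
print("k with the highest silhouette score:", best_k)
```

On real data the elbow is often less sharp than on this toy set, so the two checks are best read together rather than trusted individually.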
Anomalies (outliers) are data points that deviate from the normal pattern in a dataset.
Unsupervised learning helps detect them through clustering-based, density-based, and distance-based methods.
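As one illustration of the distance-based approach, the sketch below scores each point by the distance to its k-th nearest neighbor, so points in sparse regions get high scores. The data, the two planted outliers, and the choice of k=3 are assumptions made for the example; no labels are used at any step.

```python
import math
import random

def knn_distance_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbor.
    Larger scores mean the point sits in a sparser region."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Synthetic data: 60 points around the origin plus two planted outliers.
rng = random.Random(0)
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
planted = [(8.0, 8.0), (-9.0, 7.0)]
points = inliers + planted  # the planted outliers are at indices 60 and 61

scores = knn_distance_scores(points, k=3)

# Rank points from most to least isolated and inspect the top of the list.
ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
top_two = sorted(ranked[:2])
print("indices of the two most isolated points:", top_two)
```

How many top-ranked points to treat as anomalies, or what score threshold to apply, is a judgment call that depends on the dataset. Density-based methods such as DBSCAN or Local Outlier Factor follow a similar idea but compare each point's neighborhood density to that of its neighbors.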