What does machine learning mean?
Statistical methods to enable machines to learn and improve with data
What is the main difference between machine learning and traditional programming?
Inputs of classical programming:
- dataset and algorithm
Inputs of machine learning:
- dataset and “output” (the expected answers/labels)
Output of machine learning:
- algorithm (the learned model)
Name some machine learning applications in medical imaging (4)
Disease / object detection
Segmentation
Registration
Image generation
Describe the overall workflow of a typical ML algorithm in the training and inference (prediction) phase
Consider classification task:
Training Phase:
(iterative learning to find the best model)
1. input images w/ labels (benign / malignant)
2. Feature extraction
3. Feature vectors
4. machine learning algorithm
Prediction Phase:
(applying the model on new data)
1. new images
2. feature extraction
3. feature vectors
4. predicted labels
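The two phases above can be sketched in Python. This is a minimal sketch, not the course's method: synthetic 8x8 "images" stand in for real data, "feature extraction" is just mean/std intensity, and a nearest-centroid rule stands in for the ML algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(images):
    # steps 2-3: raw image -> feature vector (here: mean and std intensity)
    return np.stack([[img.mean(), img.std()] for img in images])

# Training phase: images with labels (0 = benign, 1 = malignant)
train_images = [rng.normal(loc=lbl, size=(8, 8)) for lbl in [0, 1] * 20]
train_labels = np.array([0, 1] * 20)
X = extract_features(train_images)
# step 4: "learning" here is just computing one centroid per class
centroids = np.stack([X[train_labels == c].mean(axis=0) for c in (0, 1)])

# Prediction phase: new images -> features -> predicted labels
new_images = [rng.normal(loc=1.0, size=(8, 8)) for _ in range(3)]
Xnew = extract_features(new_images)
predicted = np.argmin(np.linalg.norm(Xnew[:, None] - centroids, axis=2), axis=1)
print(predicted)  # -> [1 1 1]
```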
a) What is a feature?
b) What does feature extraction mean in medical image analysis?
a) a measurable attribute of the data
ex. intensity, shape, texture
b) process of generating such attributes from images
- converts raw image data into interpretable and actionable info for machine learning
a) Define texture features. Give some examples.
b) Do texture features provide relative position information?
a) spatial distribution of grey levels over the pixels in an image
- measured via first-order statistics derived from the first-order (grey-level) histogram
ex.
- mean
- variance
- sd
- skewness (measure of asymmetry)
- kurtosis (measure of tailedness of distribution - how often outliers occur)
- measure of smoothness
- uniformity
- entropy
b) no info about the relative position of various grey levels within the image
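The first-order statistics above can be computed directly from a normalized grey-level histogram. A sketch on a small hypothetical 4x4 patch (grey levels 0-3 for brevity); note that shuffling the pixels would leave every statistic unchanged, which is exactly the "no relative position" point in (b).

```python
import numpy as np

patch = np.array([[0, 0, 1, 1],
                  [0, 1, 1, 2],
                  [1, 1, 2, 2],
                  [1, 2, 2, 3]])

levels = np.arange(4)
hist = np.bincount(patch.ravel(), minlength=4)
p = hist / hist.sum()                                # normalized histogram p(i)

mean = (levels * p).sum()
var = ((levels - mean) ** 2 * p).sum()
sd = np.sqrt(var)
skew = ((levels - mean) ** 3 * p).sum() / sd ** 3    # asymmetry
kurt = ((levels - mean) ** 4 * p).sum() / sd ** 4    # tailedness
smooth = 1 - 1 / (1 + var)                           # measure of smoothness
uniformity = (p ** 2).sum()                          # aka energy
entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()      # randomness of grey levels
```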
a) What is feature normalization? Why is it needed?
b) Name some of the common feature normalization techniques
a) process of transforming each numeric input variable so its values lie on a comparable scale
- allows models to treat features more equitably during learning
- prevents bias in proximity measures
- prevents distortion of penalty effects and model interpretation
b)
Z-score normalization
(subtracts the mean and divides by the sd so the feature has mean 0 and variance 1)
min-max normalization
(changes dynamic range)
linear scaling to unit range
(squash range AND keep linear structure)
softMax scaling
(squash range but does not keep linearity )
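The techniques above, sketched on one feature vector with an outlier. The sigmoid form used for softmax scaling is the common textbook definition, assumed here; min-max / linear scaling to unit range is shown once since both rescale linearly (to [0, 1] in this case).

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 100.0])    # one feature across 5 samples

# z-score: subtract mean, divide by sd -> mean 0, variance 1
z = (x - x.mean()) / x.std()

# min-max / linear scaling to unit range: squashes to [0, 1], keeps linearity
unit = (x - x.min()) / (x.max() - x.min())

# softmax (sigmoid) scaling: squashes to (0, 1) but is nonlinear,
# so the outlier at 100 is compressed far more than the inner points
soft = 1.0 / (1.0 + np.exp(-(x - x.mean()) / x.std()))
```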
What are the three realms of ML?
What is the main difference between supervised and unsupervised learning?
Three realms: supervised, unsupervised, and reinforcement learning
Unsupervised: learn w/o labels
Supervised: learn w/ labels on training data
Name some applications of unsupervised learning in medical image analysis
ex. clustering of unlabeled images or pixels (e.g., grouping tissue types), anomaly detection, dimensionality reduction of extracted features
a) What does clustering mean?
b) What are the clustering essentials
a) aggregate samples (unlabeled data) into groups
- membership to a group determined by similarity metric or distance
b)
1. Proximity measure
- similarity / dissimilarity (distance) between samples
2. Criterion (objective) function
- scores cluster quality; the clustering algorithm optimizes it
a) What does proximity measure do?
b) What does the criterion function do?
c) what are the common distance definitions?
a)
computes a numeric value to reflect how close two objects are in feature space
- small dissimilarity / large similarity –> points belong to same cluster
b)
(aka objective function)
evaluates quality of clusters by aggregating pairwise proximity or clusters statistics into a single score
- clustering algorithms want to optimize this function
c)
Euclidean distance
- straight-line distance in feature space
d(xa,xb) = [ sum_k (xa^k - xb^k)^2 ]^(1/2)
Manhattan distance
- sum of absolute differences
d(xa,xb) = sum_k | xa^k - xb^k |
similarity:
sim(x1, x2) = 1 / [dist(x1,x2)]
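The two distance definitions and the inverse-distance similarity, as a sketch:

```python
import numpy as np

def euclidean(xa, xb):
    # straight-line distance: square root of summed squared differences
    return np.sqrt(((xa - xb) ** 2).sum())

def manhattan(xa, xb):
    # city-block distance: sum of absolute differences
    return np.abs(xa - xb).sum()

a, b = np.array([1.0, 2.0]), np.array([4.0, 6.0])
print(euclidean(a, b))      # 5.0 (a 3-4-5 triangle)
print(manhattan(a, b))      # 7.0
print(1 / euclidean(a, b))  # similarity as inverse distance: 0.2
```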
a) How can a cluster be evaluated
b) What is measured for assessing the compactness of a cluster?
c) What is measured to evaluate cluster separation?
a) compactness and separation
b) Compactness: intra-cluster cohesion
- how near the cluster data points are to the cluster centroid
- sum of squared error
c) Separation: inter-cluster separation
- how separated different cluster centroids are w.r.t each other
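Both measures can be sketched on two hypothetical 2-D clusters; small SSE plus large inter-centroid distance indicates a good clustering.

```python
import numpy as np

# two hypothetical clusters in 2-D feature space
clusters = [np.array([[0., 0.], [0., 2.], [2., 0.], [2., 2.]]),
            np.array([[10., 10.], [10., 12.], [12., 10.], [12., 12.]])]
centroids = [c.mean(axis=0) for c in clusters]

# compactness: sum of squared errors (each point to its own centroid)
sse = sum(((c - m) ** 2).sum() for c, m in zip(clusters, centroids))

# separation: distance between the cluster centroids
sep = np.linalg.norm(centroids[0] - centroids[1])
print(sse, sep)
```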
a) How does k-means algorithm work?
b) What is the main objective of this algorithm?
a)
K-means clustering:
- partitional: groups data into K clusters (K is user defined)
- centroid-based: cluster has a center
b) Minimize the total variance within each cluster by updating cluster centroids
i. initialize K centroids (e.g., pick K random data points)
ii. calculate each data point's distance to every centroid
iii. group points by minimum distance (assign each to the nearest centroid)
iv. move each centroid to the mean of its assigned points; repeat ii-iv until assignments stop changing
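Steps i-iv can be sketched in a few lines of numpy. This is a minimal sketch only: real implementations use random or k-means++ initialization and a proper convergence check, while here the first K points seed the centroids for determinism.

```python
import numpy as np

def kmeans(X, k, n_iter=10):
    # i. initialize centroids (here: the first k points, for determinism)
    centroids = X[:k].copy()
    for _ in range(n_iter):
        # ii. distance from every point to every centroid
        d = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        # iii. group points by minimum distance
        labels = d.argmin(axis=1)
        # iv. move each centroid to the mean of its points (minimizes SSE)
        centroids = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [9., 9.], [9., 10.], [10., 9.]])
labels, centroids = kmeans(X, 2)
print(labels)  # -> [0 0 0 1 1 1]
```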
How can you optimize a k-means algorithm? What are the variables to change to improve the performance?
- vary K (the number of clusters), the centroid initialization, the number of iterations, and the distance metric
How do you define the optimum value for the number of clusters - describe the approach
Elbow method: run k-means over a range of K values, plot the total within-cluster SSE against K, and pick the K at the "elbow" where adding more clusters no longer reduces SSE substantially
a) What is the concept of hierarchical clustering?
b) What does linkage mean?
c) What are the main categories of hierarchical clustering?
d) How do you interpret dendrograms?
a)
Divide the dataset into a sequence of nested partitions (visualized as a dendrogram, a tree of clusters)
- works with any distance matrix
- sensitive to outliers
- computationally expensive on large datasets
b)
dissimilarity between the pairs of observations (user defines linkage criterion)
- “samples that belong to the child cluster also belong to the parent cluster” –> small clusters are part of a big cluster
c)
single linkage:
- dist between closest points
centroid linkage:
- dist between cluster centroids
complete linkage:
- dist between furthest points
average linkage:
- average dist between all pairs of points
d)
observations that fuse at:
bottom –> similar to each other
top –> diff from each other
Similarity of 2 observations is based on their location on the vertical axis:
- the height at which the branches containing the two observations first fuse (lower = more similar)
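A tiny agglomerative sketch with single linkage makes the dendrogram heights concrete: repeatedly merge the two closest clusters and record the fusion height, which is exactly what the dendrogram's vertical axis shows (early low fusions at the bottom = similar observations, the final high fusion at the top = dissimilar groups). Hypothetical 1-D points for readability.

```python
import numpy as np

points = np.array([[0.0], [0.4], [5.0], [5.3]])   # 1-D for readability
clusters = [[i] for i in range(len(points))]
fusions = []   # each entry: (cluster_a, cluster_b, fusion height)

while len(clusters) > 1:
    # single linkage: cluster distance = closest pair of member points
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = min(abs(points[a, 0] - points[b, 0])
                    for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    d, i, j = best
    fusions.append((clusters[i], clusters[j], d))
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] \
               + [clusters[i] + clusters[j]]

print(fusions)   # fusion heights rise: 0.3, 0.4, then 4.6 at the top
```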
What is the primary goal of the k-means algorithm during its iterative process?
Minimize the total variance within each cluster by updating cluster centroids