Unsupervised Learning and Probabilistic Inference Flashcards

(30 cards)

1
Q

In ML, what do we start with?

A

Observations (dataset) of a system or phenomenon

2
Q

What is the aim when given a dataset in ML?

A

To discover patterns that lead to general knowledge we can use

3
Q

What is a key limitation of learning from data?

A

A model can only learn what is present in the data

4
Q

What else limits a machine learning model besides the data?

A

The choice of model

5
Q

What matters more than the quantity of data?

A

The quality of data

6
Q

What questions should we ask about data quality?

A

Is the data representative of the underlying phenomenon?

Are the features sensible?

Have we included all the features we need?

Do we have observations from all regions of the input space?

7
Q

What do rows in the data matrix represent?

A

Observations

8
Q

What do columns in the data matrix represent?

A

Features
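
As a small illustration of the rows-as-observations, columns-as-features convention, here is a minimal sketch using NumPy. The numbers and variable names are made up for illustration and are not part of the deck.

```python
import numpy as np

# Hypothetical toy data matrix: 4 observations (rows), 3 features (columns).
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.3, 3.3, 6.0],
    [5.8, 2.7, 5.1],
])

n_observations, n_features = X.shape

first_observation = X[0]    # one row: every feature of a single observation
second_feature = X[:, 1]    # one column: a single feature across all observations
```

Indexing a row gives everything measured for one observation; indexing a column gives one measurement across the whole dataset.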

9
Q

What is the aim of unsupervised learning?

A

Not to make predictions, but to understand and/or gain insight into the phenomenon itself

10
Q

What does unsupervised learning try to find?

A

Internal relationships and patterns

11
Q

How are observations analysed in unsupervised learning?

A

By comparing observations to each other

12
Q

What key question does unsupervised learning ask about data?

A

Do we have different high-level types of observations (clusters)?

13
Q

What is another key goal of unsupervised learning besides clustering?

A

Dimensionality reduction

14
Q

What is dimensionality reduction?

A

Representing the data matrix in a reduced form (fewer features) while preserving the important information

15
Q

Why do we do dimensionality reduction?

A

To make the dataset more efficient and easier to analyse while losing as little important information as possible
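
The cards do not name a specific method, so the sketch below uses principal component analysis (PCA) via the singular value decomposition as one common example of dimensionality reduction. The function name `pca_reduce` and the toy data are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (observations x features) onto its top principal directions."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    X_reduced = X_centered @ components.T
    # Fraction of total variance captured by each kept direction.
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return X_reduced, explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical data: 3 features, where the third is almost a copy of the first.
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=100)])

X_reduced, explained = pca_reduce(X, n_components=2)
```

Because the third feature is nearly redundant, two components keep almost all of the variance: the data matrix shrinks from three columns to two with very little information lost.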

16
Q

What is K-Means clustering used for?

A

To group observations into K clusters based on similarity

17
Q

What are the steps of K-Means clustering?

A

  1. Choose the number of clusters (K)
  2. Select K initial centroids (starting points)
  3. Assign each point to the nearest centroid
  4. Update each centroid to the mean of its assigned points
  5. Repeat steps 3 and 4 until the centroids stop changing significantly

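
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `k_means`, the choice of random observations as initial centroids, Euclidean distance, and the tolerance `tol` are all assumptions.

```python
import numpy as np

def k_means(X, k, n_iters=100, tol=1e-6, seed=0):
    """Minimal K-Means sketch. X has shape (n_observations, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick K distinct observations as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Step 3: distance from every point to every centroid, shape (n, k),
        # then assign each point to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        # Step 5: repeat until the centroids barely move.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = k_means(X, k=2)
```

On this toy data the two groups are recovered as separate clusters. Which group receives label 0 depends on the random initialisation, so only the grouping itself is meaningful.
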
18
Q

When does K-Means clustering stop?

A

When convergence is reached (centroids stop changing significantly)

18
Q

How do you perform one full iteration of K-Means clustering?

A

Assign each data point to the nearest centroid

Recalculate each centroid as the mean of its assigned points

18
Q

What calculation is used when updating a centroid?

A

The mean of all points assigned to that cluster
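
Written as a formula, the update might look like the following. The symbols are assumptions introduced here: $C_k$ is the set of points currently assigned to cluster $k$, and $\mu_k$ is its centroid.

```latex
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
% Worked example with three assigned points (1,2), (3,4), (5,0):
% \mu_k = \tfrac{1}{3}\big( (1,2) + (3,4) + (5,0) \big) = (3, 2)
```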

18
Q

How do you assign a point to a centroid in K-Means?

A

Choose the centroid with the smallest distance to the point
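
As a formula, and assuming Euclidean distance (the card only says "distance"), the assignment of point $x_i$ can be written as below. The symbols $c_i$ for the assigned cluster and $\mu_k$ for the centroid of cluster $k$ are notation introduced here.

```latex
c_i = \arg\min_{k \in \{1, \dots, K\}} \lVert x_i - \mu_k \rVert_2
% Worked example: x_i = (2,1), \mu_1 = (0,0), \mu_2 = (4,4)
% \lVert x_i - \mu_1 \rVert_2 = \sqrt{4 + 1} = \sqrt{5} \approx 2.24
% \lVert x_i - \mu_2 \rVert_2 = \sqrt{4 + 9} = \sqrt{13} \approx 3.61
% so x_i is assigned to cluster 1.
```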

19
Q

What is supervised learning?

A

Learning from observations paired with targets, comparing observations to distinguish them in terms of the target

20
Q

What is unsupervised learning, in contrast?

A

Learning without labels, aiming to get an insight into the dataset

20
Q

What is the aim of supervised learning?

A

To distinguish between inputs in a way that helps distinguish between their targets

21
Q

What is the key difference between supervised and unsupervised learning?

A

Supervised learning uses targets; unsupervised learning does not.

22
Q

What is probability used for in ML?

A

To deal with uncertainty.

23
Q

How is probability interpreted?

A

As a degree of belief.

24
Q

What defines a probabilistic model?

A

The joint distribution.

25
Q

What probability rules are important?

A

Sum rule, product rule, and conditional probability.

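
For reference, the three rules can be written out for two discrete variables. The variable names $x$ and $y$ are placeholders chosen here.

```latex
% Sum rule: recover a marginal from the joint distribution
p(x) = \sum_{y} p(x, y)

% Product rule: factor the joint into a conditional and a marginal
p(x, y) = p(y \mid x)\, p(x)

% Conditional probability: the product rule rearranged, valid when p(x) > 0
p(y \mid x) = \frac{p(x, y)}{p(x)}
```
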
26
Q

What is marginalisation?

A

Propagating belief or uncertainty by summing over variables.
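
A minimal sketch of marginalisation on a small discrete joint distribution follows. The weather and activity variables and their probabilities are invented purely for illustration.

```python
import numpy as np

# Hypothetical joint distribution p(weather, activity).
# Rows: weather (sunny, rainy). Columns: activity (walk, read).
joint = np.array([
    [0.4, 0.1],
    [0.1, 0.4],
])

# Marginalise out activity: sum over columns to get p(weather).
p_weather = joint.sum(axis=1)

# Marginalise out weather: sum over rows to get p(activity).
p_activity = joint.sum(axis=0)

# Conditional probability p(activity | weather = sunny) = joint / marginal.
p_activity_given_sunny = joint[0] / p_weather[0]
```

Summing the joint table along one axis removes that variable and leaves the distribution over the other. Dividing a row of the joint by its marginal gives the conditional distribution for that row.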