Supervised learning: concepts and applications Flashcards

(34 cards)

1
Q

What is machine learning

A

a set of methods that can automatically detect patterns in data

2
Q

What does machine learning do with these patterns

A

they are used to predict future data or to perform other kinds of decision making under uncertainty

3
Q

What is a key premise of machine learning

A

the learning problem

4
Q

what is the learning problem

A

learning from data is used in situations where we don’t have an analytic solution, but we do have data from which we can construct an empirical solution

5
Q

what does the learning problem use

A

a machine learning method

6
Q

What kind of inputs does a model have

A

feature, attribute, predictor, independent variable

7
Q

What kind of outputs does a model have

A

response, dependent variable, label

8
Q

When is regression used in supervised learning

A

where Y is continuous (quantitative)

9
Q

Where is classification used in supervised learning

A

covers situations where Y is categorical

10
Q

How do you minimise the least squares error

A

the gradient descent method

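Card 10 in code: a minimal pure-Python sketch of gradient descent minimising the least squares error for a straight-line fit. The data, learning rate, and step count are illustrative assumptions, not from the cards.

```python
# Minimal gradient descent for a straight-line least-squares fit.
# The data, learning rate and step count below are made-up illustrations.
def fit_line_gd(xs, ys, lr=0.01, steps=5000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errs, xs))
        grad_b = (2 / n) * sum(errs)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]     # data generated from the line y = 2x + 1
w, b = fit_line_gd(xs, ys)
print(round(w, 2), round(b, 2))  # recovers slope ~2 and intercept ~1
```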
11
Q

what are the main two types of supervised learning

A

regression and classification

12
Q

Give an example of regression

A

predicting house prices based on size, location and number of bedrooms

13
Q

Give an example of classification

A

spam/non-spam

14
Q

What is classification used for

A

assigning instances to discrete categories

15
Q

How is the “best fit” defined

A

the line that minimises the sum of squared errors between actual and predicted values

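Card 15's definition of "best fit" can be checked directly: of two candidate lines, the better one has the smaller sum of squared errors. The numbers below are made up for illustration.

```python
# Made-up data lying exactly on the line y = 2x.
x = [1, 2, 3]
y = [2, 4, 6]

def sse(slope, intercept):
    """Sum of squared errors between actual and predicted values."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

print(sse(2, 0))  # the true line y = 2x: SSE = 0 (perfect fit)
print(sse(1, 1))  # a worse line y = x + 1: SSE = 5
```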
16
Q

Which supervised algorithms can solve both regression and classification problems

A

Decision trees
Random Forest
K-nearest neighbours

17
Q

What are strengths of linear regression

A
  • Simple and easy to implement
  • Works well when the relationship between features and target is linear
  • Computationally efficient
18
Q

What are the limitations of linear regression

A
  • Assumes linearity between features and the target variable
  • Sensitive to outliers
  • Struggles with multicollinearity
  • Poor performance with complex, non-linear data
19
Q

What is the process of the decision tree

A
  • Determine which attribute to select as the root, according to some goodness measure (e.g. information gain), to split on
  • Partition the input examples into subsets according to the values of the root attribute
  • Construct a DT recursively for each subset
  • Connect the roots of the subtrees to the root of the whole tree via labelled links
20
Q

What is entropy

A

a measure of uncertainty or impurity of a dataset

21
Q

What does information gain measure?

A

The reduction in entropy after a dataset is split on an attribute.

22
Q

Why Is information gain important

A

It determines the best attribute to split on.
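Cards 20–22 in code: a small sketch of entropy and information gain. The spam/ham labels and the split are an invented example.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits): the uncertainty/impurity of a set of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, subsets):
    """Reduction in entropy after splitting `labels` into `subsets`."""
    n = len(labels)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets)

# An invented split: 4 spam / 4 ham, separated into two pure subsets.
labels = ["spam"] * 4 + ["ham"] * 4
split = [["spam"] * 4, ["ham"] * 4]
print(entropy(labels))                  # 1.0 bit: maximally impure
print(information_gain(labels, split))  # 1.0: the split removes all uncertainty
```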

23
Q

What are the strengths of the decision tree

A
  • Easy to visualise and interpret
  • Handles non-linear relationships well
  • No need for data normalisation or scaling
  • Can handle categorical and continuous data
24
Q

What are the limitations of the decision tree

A
  • Prone to overfitting (without pruning or max-depth constraints)
  • Sensitive to small changes in the data (instability)
25

Q

How do you build a random forest

A

1. Train each tree on a random subset of the data, sampled with replacement (Bootstrapping)
2. Use a random subset of features at each split (Feature Subsampling)
3. Grow individual decision trees independently (Tree Building)
4. Combine results using majority vote (classification) or averaging (regression) (Aggregation)
5. Output the aggregated result from all trees (Final Prediction)
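The aggregation step can be sketched in a few lines; the tree outputs below are invented stand-ins for trees grown on separate bootstrap samples.

```python
from collections import Counter

def aggregate_classification(tree_predictions):
    """Majority vote across the trees' class predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]

def aggregate_regression(tree_predictions):
    """Average across the trees' numeric predictions."""
    return sum(tree_predictions) / len(tree_predictions)

# Invented outputs from three trees:
print(aggregate_classification(["spam", "spam", "ham"]))  # majority vote: spam
print(aggregate_regression([200_000, 210_000, 250_000]))  # average: 220000.0
```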
26

Q

What is bootstrapping

A

Each decision tree is trained on a random sample of the dataset:
  • Sampling is with replacement
  • So some data points appear multiple times
  • Some data points are left out
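Sampling with replacement can be seen directly; the dataset and seed below are arbitrary choices for illustration.

```python
import random
random.seed(42)  # arbitrary seed, for repeatability only

data = list(range(10))                         # an invented dataset of 10 points
sample = [random.choice(data) for _ in data]   # same size, drawn with replacement
out_of_bag = set(data) - set(sample)           # points that were never drawn

print(sample)       # some points typically appear more than once...
print(out_of_bag)   # ...and these were left out of this tree's sample
```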
27

Q

Give an example of how a random forest makes a prediction

A

each tree (trained on its own bootstrap sample) makes a prediction for the input; the votes from all the trees are counted, and the majority class is the output, the aggregated result from all trees
28

Q

What is the definition of a random forest

A

A Random Forest trains many different decision trees, lets each one make a prediction, and then combines their answers to get a more reliable final result.
29

Q

What are the strengths of random forest

A
  • Reduces overfitting compared to individual decision trees, since each tree only sees a subset of the data
  • Handles high-dimensional data effectively
30

Q

What are the limitations of random forest

A
  • Less interpretable than individual trees
  • Computationally intensive for large datasets
  • May require tuning of hyperparameters (e.g. number of trees)
31

Q

What is the nearest neighbour classification procedure

A

1. The user inputs k: an integer representing the number of nearest neighbours (instances) to search for
2. For each unlabelled instance: calculate the distance between it and all the instances in the dataset
3. Find the k nearest neighbours
4. Count the assigned class labels among the k nearest neighbours, for each class
5. The class with the highest count (majority vote) is the output
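The nearest neighbour procedure, sketched in Python with invented 2-D data (Euclidean distance assumed):

```python
import math
from collections import Counter

def knn_predict(train, new_point, k):
    """train: list of (features, label) pairs. Returns the majority label
    among the k training instances closest to new_point."""
    # Step 2: distance from the new point to every instance in the dataset.
    dists = sorted((math.dist(x, new_point), label) for x, label in train)
    # Steps 3-4: take the k nearest and count their class labels.
    k_labels = [label for _, label in dists[:k]]
    # Step 5: the class with the highest count (majority vote) is the output.
    return Counter(k_labels).most_common(1)[0][0]

# Invented data: two well-separated clusters.
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(knn_predict(train, (0.5, 0.5), k=3))  # lands in the "A" cluster
print(knn_predict(train, (5.5, 5.5), k=3))  # lands in the "B" cluster
```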
32

Q

What are the strengths of k-nearest neighbours

A
  • Simple to implement and understand
  • Makes no assumptions about data distribution
  • Effective for non-linear and multi-class problems
  • Adaptive to changes in the dataset
33

Q

What are the limitations of K-nearest Neighbours

A
  • Computationally expensive during prediction (lazy learning)
  • Performance depends heavily on the choice of k and distance metric
  • Sensitive to irrelevant or noisy features
34

Q

What are the key issues of nearest neighbour classification

A
  • No model is built; all the data is retained
  • Prediction could take up to O(np) per observation
  • Will not do well when the number of features is large