Machine Learning Questions Flashcards

(78 cards)

1
Q

Define machine learning.

A

A subset of artificial intelligence that enables systems to learn from data.

2
Q

What is a model in machine learning?

A

A mathematical representation of a process that predicts outcomes from input data.

3
Q

True or false: Supervised learning uses labeled data.

A

TRUE

Supervised learning involves training a model on input-output pairs.

4
Q

Fill in the blank: Unsupervised learning finds patterns in ______ data.

A

unlabeled

5
Q

What is overfitting?

A

When a model fits noise in the training data instead of the underlying pattern, so it performs well on the training data but poorly on unseen data.

6
Q

Define training data.

A

The dataset used to train a machine learning model.

7
Q

What does feature engineering involve?

A

Creating new input features from existing data to improve model performance.

8
Q

True or false: Cross-validation helps prevent overfitting.

A

TRUE

Cross-validation estimates model performance on held-out data, which helps detect overfitting during model selection.

9
Q

Fill in the blank: Regression predicts a ______ variable.

A

continuous

10
Q

What is a confusion matrix?

A

A table used to evaluate the performance of a classification model.

11
Q

Define classification.

A

The task of predicting discrete labels for input data.

12
Q

What is gradient descent?

A

An optimization algorithm used to minimize the loss function in machine learning.
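
As a minimal sketch of the idea (not tied to any particular library), the update rule can be written in a few lines. The function name `gradient_descent`, the toy loss f(x) = (x - 3)^2, and the default step size are assumptions chosen only for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    x = x0
    for _ in range(steps):
        # Move in the direction that decreases the loss.
        x = x - learning_rate * grad(x)
    return x

# Hypothetical loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With these settings `x_min` ends up very close to 3, the minimizer of the toy loss. A learning rate that is too large would make the same loop diverge.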

13
Q

True or false: Deep learning is a type of machine learning.

A

TRUE

Deep learning uses neural networks with many layers.

14
Q

Fill in the blank: Neural networks are inspired by the ______ of the human brain.

A

structure

15
Q

What is bias in machine learning?

A

A systematic error introduced by approximating a real-world problem with an overly simple model.

16
Q

Define hyperparameter.

A

A parameter whose value is set before the learning process begins.

17
Q

What is regularization?

A

A technique to prevent overfitting by adding a penalty to the loss function.
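
One common form of this penalty is L2 (ridge) regularization. The sketch below assumes a mean-squared-error loss and a hypothetical penalty strength `lam`; other losses and penalties (for example L1) follow the same pattern.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def l2_regularized_loss(y_true, y_pred, weights, lam=0.1):
    """Data loss plus a penalty that grows with the size of the weights."""
    penalty = lam * sum(w ** 2 for w in weights)
    return mse(y_true, y_pred) + penalty

# Perfect predictions, so the whole loss comes from the penalty term.
loss = l2_regularized_loss([1.0, 2.0], [1.0, 2.0], weights=[3.0, 4.0], lam=0.1)
```

Here the data loss is zero and the penalty is 0.1 * (9 + 16) = 2.5, which shows how large weights are discouraged even when the fit is perfect.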

18
Q

True or false: Ensemble methods combine multiple models to improve performance.

A

TRUE

Examples include bagging and boosting.

19
Q

Fill in the blank: Support Vector Machines are used for ______ tasks.

A

classification

20
Q

What is natural language processing?

A

A field of AI that focuses on the interaction between computers and human language.

21
Q

Define reinforcement learning.

A

A type of learning where an agent learns by interacting with an environment and receiving rewards or penalties for its actions.

22
Q

What is data preprocessing?

A

The process of cleaning and transforming raw data before analysis.

23
Q

True or false: Clustering is a form of supervised learning.

A

FALSE

Clustering is an unsupervised learning technique.

24
Q

Fill in the blank: K-means is a popular ______ algorithm.

A

clustering
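
For intuition, here is a rough sketch of the usual k-means loop (Lloyd's algorithm) on one-dimensional data. The function name `kmeans_1d`, the fixed iteration count, and the hand-picked starting centroids are assumptions for illustration; real implementations typically use smarter initialization and a convergence check.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

centers = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[0.0, 5.0])
```

On this toy data the centroids settle at 2.0 and 11.0, the means of the two obvious groups.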

25
What does **loss function** measure?
The difference between predicted and actual outcomes.
26
Define **bias-variance tradeoff**.
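The balance between error from overly simple assumptions (bias) and error from sensitivity to the particular training data (variance); increasing model complexity usually lowers bias but raises variance.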
27
What is **transfer learning**?
Using a pre-trained model on a new but related task.
28
True or false: **Data augmentation** increases the size of the training dataset.
TRUE ## Footnote Techniques include rotation, scaling, and flipping images.
29
Fill in the blank: **Decision trees** split data based on ______ criteria.
feature
30
What is **feature selection**?
The process of selecting a subset of relevant features for model training.
31
Define **outlier**.
An observation that deviates significantly from the rest of the data.
32
What is **ensemble learning**?
Combining multiple models to produce a better predictive performance.
33
True or false: **Autoencoders** are used for unsupervised learning.
TRUE ## Footnote Autoencoders learn efficient representations of data.
34
Fill in the blank: **Principal Component Analysis** reduces dimensionality by transforming to ______ variables.
principal
35
What is **A/B testing**?
A method to compare two versions of a variable to determine which performs better.
36
Define **ROC curve**.
A graphical representation of a classifier's performance across different thresholds.
37
What is **precision** in classification?
The ratio of true positive predictions to the total predicted positives.
38
True or false: **Recall** measures the ability to identify all relevant instances.
TRUE ## Footnote Recall is also known as sensitivity.
39
Fill in the blank: **F1 score** is the harmonic mean of ______ and recall.
precision
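
The three metrics from the cards above can be computed directly from true-positive, false-positive, and false-negative counts. This is a small sketch for binary labels (1 = positive, 0 = negative); the function name and the example labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In this example there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75.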
40
What is **data leakage**?
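When information that would not be available at prediction time (for example, from the test set or the target itself) is used to create the model, inflating its measured performance.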
41
Define **hyperparameter tuning**.
The process of optimizing hyperparameters to improve model performance.
42
What is **k-fold cross-validation**?
A method that divides data into k subsets (folds), trains on k-1 of them, validates on the remaining one, and repeats so that each fold serves as the validation set once.
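
A minimal sketch of how the folds could be generated, assuming contiguous, unshuffled folds. The helper name `k_fold_indices` is hypothetical; in practice the data is often shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop

folds = list(k_fold_indices(n_samples=6, k=3))
```

Each sample index appears in exactly one validation fold, and the model would be trained and scored once per fold before averaging the scores.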
43
True or false: **Batch learning** processes data in small increments.
FALSE ## Footnote Batch learning processes the entire dataset at once.
44
Fill in the blank: **Online learning** updates the model with ______ data.
streaming
45
What is **semantic segmentation**?
The process of classifying each pixel in an image into categories.
46
Define **data pipeline**.
A series of data processing steps to transform raw data into a usable format.
47
What is **model evaluation**?
The process of assessing a model's performance using various metrics.
48
True or false: **Feature scaling** is important for algorithms like k-NN.
TRUE ## Footnote Feature scaling keeps features with large numeric ranges from dominating distance calculations.
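
One simple scaling scheme is min-max scaling to the range [0, 1]. The sketch below uses made-up income and age columns to show how two features on very different scales end up comparable; standardization (zero mean, unit variance) is a common alternative.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical features on very different scales.
incomes = [30000.0, 60000.0, 90000.0]
ages = [25.0, 35.0, 45.0]
scaled_incomes = min_max_scale(incomes)
scaled_ages = min_max_scale(ages)
```

After scaling, both columns become `[0.0, 0.5, 1.0]`, so neither dominates a Euclidean distance the way raw income would.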
49
Fill in the blank: **Bagging** reduces variance by training multiple models on ______ samples.
random
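
A rough sketch of the sampling step, assuming the random samples are bootstrap samples (drawn with replacement, same size as the original data), which is the usual choice in bagging. The helper name and toy data are hypothetical.

```python
import random

def bootstrap_samples(data, n_models, rng):
    """Draw one same-size sample, with replacement, per model."""
    return [[rng.choice(data) for _ in data] for _ in range(n_models)]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
samples = bootstrap_samples([10, 20, 30, 40, 50], n_models=3, rng=rng)
```

Each model would then be trained on its own sample, and the predictions averaged (or voted on) to reduce variance.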
50
What is **boosting**?
An ensemble technique that combines weak learners to create a strong learner.
51
Define **learning rate**.
A hyperparameter that controls how much to change the model in response to the estimated error.
52
What is **dropout**?
A regularization technique that randomly drops units during training to prevent overfitting.
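
The mechanism can be sketched as "inverted" dropout, a common variant in which kept units are scaled up during training so that nothing needs to change at inference time. The function name, the `rng` argument, and the requirement that `drop_prob` lies in [0, 1) are assumptions of this sketch.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Zero each unit with probability drop_prob; rescale the survivors."""
    if not training or drop_prob == 0.0:
        # At inference time the activations pass through unchanged.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
out = dropout([1.0, 1.0, 1.0, 1.0], drop_prob=0.5, rng=rng)
```

With `drop_prob=0.5`, every output is either 0.0 (dropped) or 2.0 (kept and rescaled), so the expected activation stays the same as without dropout.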
53
True or false: **LSTM** networks are used for sequence prediction.
TRUE ## Footnote LSTMs are a type of recurrent neural network.
54
Fill in the blank: **Convolutional Neural Networks** are primarily used for ______ tasks.
image
55
Define **batch normalization**.
A technique to normalize inputs of each layer to improve training speed and stability.
56
What is **gradient boosting**?
An ensemble technique that builds models sequentially to correct errors of prior models.
57
True or false: **Reinforcement learning** requires labeled data.
FALSE ## Footnote Reinforcement learning learns from rewards and penalties.
58
Fill in the blank: **Support Vector Machines** find the optimal ______ between classes.
hyperplane
59
What is **data augmentation**?
Techniques to artificially expand the size of a dataset by creating modified versions.
60
Define **self-supervised learning**.
A learning paradigm where the model generates its own labels from the input data.
61
What is **active learning**?
A machine learning approach where the model queries for labels on uncertain data.
62
True or false: **Explainable AI** aims to make AI decisions understandable.
TRUE ## Footnote Explainable AI helps users trust and interpret model predictions.
63
Fill in the blank: **Generative Adversarial Networks** consist of a generator and a ______.
discriminator
64
What is **semantic analysis**?
The process of understanding the meaning and context of words in text.
65
Define **feature importance**.
A measure of how much a feature contributes to the model's predictions.
66
What is **model deployment**?
The process of integrating a machine learning model into a production environment.
67
True or false: **Time series forecasting** predicts future values based on past data.
TRUE ## Footnote Time series analysis is crucial for financial predictions.
68
Fill in the blank: **Natural Language Processing** often uses ______ models for text analysis.
statistical
69
What is **data ethics**?
The study of moral issues related to data collection, usage, and privacy.
70
Define **cloud computing**.
The delivery of computing services over the internet, including storage and processing.
71
What is **big data**?
Extremely large datasets that may be analyzed computationally to reveal patterns.
72
True or false: **Data visualization** helps in understanding complex data.
TRUE ## Footnote Visualization techniques include charts, graphs, and maps.
73
Fill in the blank: **Data mining** involves discovering patterns in large ______.
datasets
74
What is **cloud storage**?
A model of computer data storage where data is stored on remote servers.
75
Define **data governance**.
The management of data availability, usability, integrity, and security.
76
What is **data wrangling**?
The process of cleaning and transforming raw data into a usable format.
77
True or false: **Data privacy** ensures individuals' personal information is protected.
TRUE ## Footnote Data privacy laws regulate how personal data is handled.
78
Fill in the blank: **Data architecture** defines the structure of ______ systems.
data