What does the superscript within parentheses mean?
e.g. X(i)
The ith row of X
What does the symbol ∀ mean?
“for all” or “for any”
In a joint probability distribution table, how many rows are there?
One for each possible combination of variable values
Given a joint probability distribution, what can we calculate?
Conditional or joint probabilities over any subset of the variables
What’s the conditional probability equation?

What does high bias correspond to? (2)
High bias—-> underfitting —–> More train set error
What does high variance correspond to?
High variance —–> overfitting —–> More dev set error and more test set error
What is the development set (aka dev set)?
It’s another term for “validation set”
What’s a useful way to think of bias?
how well does my model fit the training data?
What’s a useful way to think of variance?
how well does my model generalize to unseen data sets?
Can you have high bias and high variance?
Yes
Given training error and validation error, how can you assess the bias and variance?
Why can’t we use the validation set for testing performance?
What models can we use higher-order features for?
Non-exhaustive list:
What defines a decision boundary?
The hypothesis h and the parameters
What do we know about the update rule for logistic regression?
Looks very similar to the update rule for linear regression, except the hypothesis is different
Does feature scaling work for logistic regression?
Yes
How does multiclass logistic regression work?