What is proof by contraposition?

What’s the intuition of the VC Dimension?
Even if you have infinitely many hypotheses in your hypothesis class, given some training sample, many of those hypotheses will look functionally the same wrt that specified sample
What is h? What does it do?
What is S?
the dataset
For the realizable setting, what are the relations between the bound and 1) epsilon, 2) abs(H)?
For the agnostic setting, what are the relations between the bound and
1) epsilon,
2) abs(H)
What is shattering?
The hypotheses in H can perfectly classify every possible labeling of S
What is the VC-dimension?
Def: The VC-dimension (or VaporikChervonenkis dimension) of ℋ is the cardinality of the largest set “ such that ℋ can shatter “.
Or the definition from the recitation. Get rid of one of these
VC dimension of a hypothesis space H is the maximum number of points such that there exists at least one arrangement of these points and a hypothesis h ∈ H that is consistent with any labeling of this arrangement of points
When is the VC dimension infinity?
If ℋ can shatter arbitrarily large finite sets, then the VC-dimension of ℋ is infinity
To prove that that VC(H) = some value M, what do you have to do?

What’s the vc dimension for separators in n dimensions?
n+1
What does ∃ mean?
There exists
(high level) What’s the corollary to Theorem 1 of the PAC theorem?
Give a numerical bound of the true error
What’s the key idea that makes Correlary 4 of theorem 1 of pac learning useful?

What are the practical ways we can tradeoff between low training error and keeping H simple? (1)
Use a regularizer
What are discriminative models?
What’s a pmf?
p(x) : Function giving the probability that discrete r.v. X takes value x.
What’s a pdf?
f(x) : Function the returns a nonnegative real indicating the relative likelihood that a continuous r.v. X falls in a certain interval
What’s a cdf? What’s the symbol?
F(x) : Function that returns the probability that a random variable X is less than or equal to x:
What does a beta distribution look like?

What does a dirichlet distribution look like?

What’s the symbol for expected value of x?
E[X]
What are the equations for expected value?

What’s the equation for variance (one that applies to both discrete and continuous)
