what is orthogonalization
moving along directions that are at 90 degrees to each other; in practice it means having controls where each one changes exactly one thing at a time
what could be done to improve the model if it doesn’t do well in the real world
perhaps we can change the cost function or change the dev set
what is precision
of examples that the classifier recognizes as cats what % are actually cats
what is recall
of all the images that are actually cats, what percentage were correctly recognized as cats
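A minimal sketch of both definitions, using made-up binary labels where 1 = "cat":

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = cat)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted cats, % actually cats
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual cats, % found
    return precision, recall

# illustrative labels: 3 real cats, classifier finds 2 of them plus 1 false alarm
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
p, r = precision_recall(y_true, y_pred)
print(p, r)  # both 2/3 here
```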
what is the recommendation when doing ML for evaluation metrics
just picking one, a single number evaluation metric
how can we combine precision and recall into one metric?
using F1, which is a type of average of precision and recall, specifically the harmonic mean
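A quick sketch of the harmonic mean; the precision/recall values are made up to show why it is preferred over the arithmetic mean:

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# the harmonic mean punishes imbalance: with P = 0.9 and R = 0.1,
# the arithmetic mean would be 0.5, but F1 is only 0.18
print(round(f1(0.9, 0.1), 4))
```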
when do we check the precision and recall of an ML algorithm
on the dev set
What is a satisficing metric
a metric that only needs to meet some threshold; once it satisfies that threshold, we do not care about its exact value
we will have an optimizing metric and others will be satisficing metrics
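A small sketch of this selection rule; the model names, accuracies, and the 100 ms runtime threshold are illustrative assumptions:

```python
# optimizing metric: accuracy (maximize it)
# satisficing metric: runtime_ms (just has to be <= 100 ms)
models = [
    {"name": "A", "accuracy": 0.90, "runtime_ms": 80},
    {"name": "B", "accuracy": 0.92, "runtime_ms": 95},
    {"name": "C", "accuracy": 0.95, "runtime_ms": 1500},  # best accuracy, but too slow
]

# keep only models that satisfy the constraint, then optimize among them
feasible = [m for m in models if m["runtime_ms"] <= 100]
best = max(feasible, key=lambda m: m["accuracy"])
print(best["name"])  # B
```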
what are the first things to determine in an ML project
we need to define dev set to be able to improve and also the single metric to optimize
also, if possible, determine the satisficing metrics
the dev set needs to have the same distribution as the test set
what can we do if the overall error is low, but the remaining errors are extreme or unacceptable
we can give those extreme cases a much higher weight in the error metric
this redefines the error metric, so basically when this happens we just redefine the evaluation metric
equivalently, we can change the cost function to give way more loss when the cases are the extreme ones
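One way to sketch such a re-weighted metric; the weight w = 10 and the flagged examples are illustrative assumptions:

```python
def weighted_error(y_true, y_pred, flagged, w=10):
    """Error rate where a mistake on a flagged (extreme) example counts w times more."""
    num = sum((w if f else 1) * int(t != p)
              for t, p, f in zip(y_true, y_pred, flagged))
    den = sum(w if f else 1 for f in flagged)
    return num / den

# one ordinary mistake plus one mistake on a flagged extreme case:
y_true  = [1, 1, 0, 0]
y_pred  = [1, 0, 0, 1]
flagged = [False, False, False, True]
print(weighted_error(y_true, y_pred, flagged))  # 11/13, vs plain error of 0.5
```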
what is the Bayes optimal error
it's the theoretical best (lowest) possible error, which can never be surpassed
what is the relationship between human performance and bayes optimal error
usually humans are very close to bayes optimal error
why is it important to know the human level performance on a task?
because that performance tells us whether to keep optimizing on the training or the dev set, i.e. whether we should focus on reducing bias or on reducing variance
what is avoidable bias?
the difference between the training error and human-level error; this is the error we can still hope to remove using bias reduction techniques
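A sketch of the bias-vs-variance decision rule, with illustrative (made-up) error numbers:

```python
# human-level error serves as a proxy for Bayes error
human_error = 0.01
train_error = 0.08
dev_error = 0.10

avoidable_bias = train_error - human_error  # gap we can still close on training
variance = dev_error - train_error          # gap between training and dev

# focus on whichever gap is larger
focus = "bias" if avoidable_bias > variance else "variance"
print(focus)  # bias -> try bias reduction techniques first
```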
how do we define human level error?
we need to think of it as a proxy for Bayes error, i.e. the best error we can ever achieve; so take the best human performance that exists for that task, for example a team of experienced expert doctors
what do we do if we surpass human performance
it is not very clear: once our error is lower than human error, human intuition and human labels no longer help us diagnose bias, so how to keep improving is unclear
what are the two fundamental assumptions of supervised learning
(1) you can fit the training set pretty well, i.e. achieve low avoidable bias; (2) training set performance generalizes well to the dev/test set, i.e. low variance
how to improve the performance of the model I have, what are the steps to take
if avoidable bias is high: train a bigger model, train longer or with a better optimization algorithm, or try a new architecture; if variance is high: get more data, add regularization, or try a new architecture
You’ve handily beaten your competitor, and your system is now deployed in Peacetopia and is protecting the citizens from birds! But over the last few months, a new species of bird has been slowly migrating into the area, so the performance of your system slowly degrades because your data is being tested on a new type of data.
You have only 1,000 images of the new species of bird. The city expects a better system from you within the next 3 months.
The best next thing to do is to define a new evaluation metric (using a new dev/test set) that accounts for the new species, and use that metric to guide further improvements.
Basically re-split into dev and test sets and re-tune against the new metric
which sets need to come from the same distribution
dev and test must; the training set can differ a bit from them
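A minimal sketch of a split that keeps dev and test on the same distribution (the sizes and the stand-in data are illustrative):

```python
import random

# stand-in for examples drawn from the target distribution
random.seed(0)
target_data = list(range(10_000))

# shuffle once, then split, so dev and test are drawn from the
# same distribution; training data may come from elsewhere
random.shuffle(target_data)
dev, test = target_data[:5_000], target_data[5_000:]
```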
how to carry out error analysis
there are many things one could do to reduce the error.
Look at the misclassified dev examples one by one and categorize each error; if many errors share a common cause, we can focus on that cause first
Put all errors in a spreadsheet (one row per example, one column per error category) and analyze them thoroughly
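A sketch of that spreadsheet tally in code; the error categories and the tagged examples are made up for illustration:

```python
from collections import Counter

# one entry per misclassified dev example, tagged with every category that applies
error_tags = [
    ["blurry"], ["dog"], ["dog", "blurry"], ["great_cat"],
    ["dog"], ["incorrect_label"], ["blurry"],
]

# count how often each category appears, to find the biggest bucket
counts = Counter(tag for tags in error_tags for tag in tags)
for tag, n in counts.most_common():
    print(f"{tag}: {n / len(error_tags):.0%} of errors")
```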
what happens if the dataset has incorrectly labeled examples
if the labeling errors are random, nothing too bad will happen, as learning is fairly robust to them; systematic errors are the real problem
this can be analyzed also during the error analysis to see if the error comes due to incorrect labels
if we correct the labels of the dataset, where should we do it
in the dev and test set so they come from the same distribution
what to do if there are many things you could start with when creating a new algorithm
start with something quickly and then improve iteratively
then analyze error, bias and variance to prioritize the next step