what is orthogonalization
moving along directions that are at 90 degrees to each other; in practice it means having controls where each one changes exactly one thing at a time
what could be done to improve the model if it doesn’t do well in the real world
perhaps we can change the cost function or change the dev set
what is precision
of examples that the classifier recognizes as cats what % are actually cats
what is recall
of all the images that are actually cats, what percentage were correctly recognized as cats
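A minimal sketch of both definitions, using made-up binary labels where 1 = "cat":

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = cat)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted cats, % actually cats
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual cats, % found
    return precision, recall

# illustrative labels: 3 real cats, classifier finds 2 of them plus 1 false alarm
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
p, r = precision_recall(y_true, y_pred)
print(p, r)  # both 2/3 here
```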
what is the recommendation when doing ML for evaluation metrics
just picking one, a single number evaluation metric
how can we combine precision and recall into one metric?
using F1, which is a type of average of precision and recall, specifically the harmonic mean
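A quick sketch of the harmonic mean; the precision/recall values are made up to show why it is preferred over the arithmetic mean:

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# the harmonic mean punishes imbalance: with P = 0.9 and R = 0.1,
# the arithmetic mean would be 0.5, but F1 is only 0.18
print(round(f1(0.9, 0.1), 4))
```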
when do we check the precision and recall of an ML algorithm
on the dev set
What is a satisficing metric
a metric that only needs to meet some threshold; once it satisfies that threshold, we do not care about its exact value
we will have an optimizing metric and others will be satisficing metrics
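A small sketch of this selection rule; the model names, accuracies, and the 100 ms runtime threshold are illustrative assumptions:

```python
# optimizing metric: accuracy (maximize it)
# satisficing metric: runtime_ms (just has to be <= 100 ms)
models = [
    {"name": "A", "accuracy": 0.90, "runtime_ms": 80},
    {"name": "B", "accuracy": 0.92, "runtime_ms": 95},
    {"name": "C", "accuracy": 0.95, "runtime_ms": 1500},  # best accuracy, but too slow
]

# keep only models that satisfy the constraint, then optimize among them
feasible = [m for m in models if m["runtime_ms"] <= 100]
best = max(feasible, key=lambda m: m["accuracy"])
print(best["name"])  # B
```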
what are the first things to determine in an ML project
we need to define dev set to be able to improve and also the single metric to optimize
also, if possible, determine the satisficing metrics
the dev set needs to have the same distribution as the test set
what can we do if the overall error is low, but the remaining errors are extreme or unacceptable
we can give those extreme cases a much higher weight in the error metric
this redefines the error metric, so basically when this happens we just redefine the evaluation metric
equivalently, we can change the cost function to give way more loss when the cases are the extreme ones
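One way to sketch such a re-weighted metric; the weight w = 10 and the flagged examples are illustrative assumptions:

```python
def weighted_error(y_true, y_pred, flagged, w=10):
    """Error rate where a mistake on a flagged (extreme) example counts w times more."""
    num = sum((w if f else 1) * int(t != p)
              for t, p, f in zip(y_true, y_pred, flagged))
    den = sum(w if f else 1 for f in flagged)
    return num / den

# one ordinary mistake plus one mistake on a flagged extreme case:
y_true  = [1, 1, 0, 0]
y_pred  = [1, 0, 0, 1]
flagged = [False, False, False, True]
print(weighted_error(y_true, y_pred, flagged))  # 11/13, vs plain error of 0.5
```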
what is the Bayes optimal error
it's the theoretical best (lowest) possible error, which can never be surpassed
what is the relationship between human performance and bayes optimal error
usually humans are very close to bayes optimal error
why is it important to know the human level performance on a task?
because that performance tells us whether to keep optimizing on the training or the dev set, i.e. whether we should focus on reducing bias or on reducing variance
what is avoidable bias?
the difference between the training error and human-level error; this is the error we can still hope to remove using bias reduction techniques
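A sketch of the bias-vs-variance decision rule, with illustrative (made-up) error numbers:

```python
# human-level error serves as a proxy for Bayes error
human_error = 0.01
train_error = 0.08
dev_error = 0.10

avoidable_bias = train_error - human_error  # gap we can still close on training
variance = dev_error - train_error          # gap between training and dev

# focus on whichever gap is larger
focus = "bias" if avoidable_bias > variance else "variance"
print(focus)  # bias -> try bias reduction techniques first
```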
how do we define human level error?
we need to think of it as a proxy for Bayes error, i.e. the best error we can ever achieve; so take the best human performance that exists for that task, for example a team of experienced expert doctors
what do we do if we surpass human performance
it is not very clear: once our error is lower than human error, human intuition and human labels no longer help us diagnose bias, so how to keep improving is unclear
what are the two fundamental assumptions of supervised learning
(1) you can fit the training set pretty well, i.e. achieve low avoidable bias; (2) training set performance generalizes well to the dev/test set, i.e. low variance
how to improve the performance of the model I have, what are the steps to take
if avoidable bias is high: train a bigger model, train longer or with a better optimization algorithm, or try a new architecture; if variance is high: get more data, add regularization, or try a new architecture
You’ve handily beaten your competitor, and your system is now deployed in Peacetopia and is protecting the citizens from birds! But over the last few months, a new species of bird has been slowly migrating into the area, so the performance of your system slowly degrades because your data is being tested on a new type of data.
You have only 1,000 images of the new species of bird. The city expects a better system from you within the next 3 months.
The best next thing to do is to define a new evaluation metric (using a new dev/test set) that accounts for the new species, and use that metric to guide further improvements.
Basically re-split into dev and test sets and re-tune against the new metric
which sets need to come from the same distribution
dev and test must; the training set can differ a bit from them
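A minimal sketch of a split that keeps dev and test on the same distribution (the sizes and the stand-in data are illustrative):

```python
import random

# stand-in for examples drawn from the target distribution
random.seed(0)
target_data = list(range(10_000))

# shuffle once, then split, so dev and test are drawn from the
# same distribution; training data may come from elsewhere
random.shuffle(target_data)
dev, test = target_data[:5_000], target_data[5_000:]
```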
how to carry out error analysis
there are many things one could do to reduce the error.
Look at the misclassified dev examples one by one and categorize each error; if many errors share a common cause, we can focus on that cause first
Put all errors in a spreadsheet (one row per example, one column per error category) and analyze them thoroughly
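A sketch of that spreadsheet tally in code; the error categories and the tagged examples are made up for illustration:

```python
from collections import Counter

# one entry per misclassified dev example, tagged with every category that applies
error_tags = [
    ["blurry"], ["dog"], ["dog", "blurry"], ["great_cat"],
    ["dog"], ["incorrect_label"], ["blurry"],
]

# count how often each category appears, to find the biggest bucket
counts = Counter(tag for tags in error_tags for tag in tags)
for tag, n in counts.most_common():
    print(f"{tag}: {n / len(error_tags):.0%} of errors")
```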
what happens if the dataset has incorrectly labeled examples
if the labeling errors are random, nothing too bad will happen, as learning is fairly robust to them; systematic errors are the real problem
this can be analyzed also during the error analysis to see if the error comes due to incorrect labels
if we correct the labels of the dataset, where should we do it
in the dev and test set so they come from the same distribution
what to do if there are many things you could start with when creating a new algorithm
start with something quickly and then improve iteratively
then analyze error, bias and variance to prioritize the next step