The diff between Parameters and Hyperparameters
Parameters are values learned from the training data/features; we use ML to discover the relationships in that data.
Hyperparameters are values not derived from the training features; they are set beforehand to configure training behavior.
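A minimal sketch of the distinction in plain Python, not tied to any ML library. The weight w is a Parameter (learned from the data), while learning_rate and epochs are Hyperparameters (chosen up front). The function name fit_slope and the toy data are hypothetical, for illustration only.

```python
def fit_slope(xs, ys, learning_rate=0.05, epochs=200):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # Parameter: learned from the training data
    n = len(xs)
    for _ in range(epochs):  # epochs: Hyperparameter, fixed before training
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # learning_rate: Hyperparameter
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated with a true slope of 2
w = fit_slope(xs, ys)
```

Changing learning_rate or epochs changes how training behaves; w is whatever training ends up discovering.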
Four general steps in a Sweep Job workflow
sweep(...) function (don’t forget to call set_limits(...) to control how long the sweeps go for…)
SS SM ET
Three things required for Hyperparameter Tuning
one synonymous with Classification, the other Regression…
Search Space:
- What they are
- The two types of values a hyperparameter could be
A Search Space is a set of values tried during the tuning process.
Types of Values:
- Discrete - the value comes from a finite set. Synonymous with Classification (a specific label or range)
- Continuous - the value lies anywhere along an infinite scale. Synonymous with Regression (finding a numeric value)
G R B
Configure a Sampling Method:
- What a Sweep Job needs one for
- The three types of Sampling
The specific values tried in a Sweep Job depend on the Sampling Method, which decides how values are picked from the Search Space.
The three options for Sampling Method:
- Grid Sampling
- Random Sampling
- Bayesian Sampling
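A rough sketch of how Grid and Random Sampling differ, using only the standard library (this is not the Azure ML implementation). The helper names grid_sample and random_sample and the optimizer hyperparameter are hypothetical. Bayesian Sampling is left out because it picks new values based on how earlier trials performed, which needs a model of past results.

```python
import itertools
import random

# Hypothetical Search Space with discrete values only
search_space = {
    "batch_size": [16, 32, 64],
    "optimizer": ["sgd", "adam"],
}


def grid_sample(space):
    """Grid Sampling: yield every combination of the discrete values."""
    names = list(space)
    for combo in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


def random_sample(space, n_trials, seed=0):
    """Random Sampling: draw n_trials independent random combinations."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_trials)
    ]


grid_trials = list(grid_sample(search_space))  # 3 * 2 = 6 combinations
random_trials = random_sample(search_space, n_trials=4)
```

Grid Sampling only works when every hyperparameter is Discrete, since it has to enumerate all combinations; Random Sampling just needs a trial count.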
m_t and eet from autoML
Configure Early Termination means to stop a Sweep Job based on one of these two conditions.
When (and when NOT) to use an Early Termination Policy
Configure a Sweep Job to stop:
- After a maximum number of trials
- When new Models don’t produce significantly better results
When: Depending on your Search Space and Sampling Method, Early Termination may be beneficial when working with Continuous Hyperparameters (meaning infinite possible combinations…you don’t want it to go on forever).
When NOT: Conversely, it may be unnecessary to use Early Termination when using Discrete Hyperparameters (limited dimensions == finite set of combinations).
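The two stopping conditions above can be sketched as a simple loop. This is a simplified illustration and does not reproduce any actual Azure ML termination policy; the function run_sweep and its parameters min_improvement and patience are hypothetical.

```python
def run_sweep(scores, max_trials=10, min_improvement=0.01, patience=2):
    """Consume trial scores in order; return (trials_run, best_score).

    Stops after max_trials, or once `patience` consecutive trials fail to
    beat the best score so far by at least min_improvement.
    """
    best = float("-inf")
    stale = 0  # consecutive trials without a significant improvement
    trials_run = 0
    for score in scores:
        if trials_run >= max_trials:  # condition 1: maximum number of trials
            break
        trials_run += 1
        if score > best + min_improvement:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:  # condition 2: no significantly better results
                break
    return trials_run, best


trials_run, best = run_sweep([0.70, 0.80, 0.805, 0.803, 0.90])
```

Here the sweep stops after the fourth trial, so the 0.90 score is never reached. That is the trade-off of stopping early: compute is saved, but a later, better trial can be missed.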
Discrete Hyperparameters:
- How to use the Choice function
- What values types it can take
- Example code for using it in a Sweep Job
Choice() comes from the ML Python SDK (azure.ai.ml.sweep) and defines a discrete set of values; the Sampling Method picks one of them for each trial.
It can take:
- csv: batch_size=Choice(values="16, 32, 64"),
- a range object: batch_size=Choice(values=range(10, 20)),
- an arbitrary list object: batch_size=Choice(values=[16, 32, 64]),
Remember that a Sweep Job is just a Job configured to “sweep”, so we still need to create the Job instance:
from azure.ai.ml.sweep import Choice, Normal

# job is assumed to be a command job defined earlier, with batch_size and learning_rate as inputs
command_job_for_sweep = job(
    batch_size=Choice(values=[16, 32, 64]),  # Discrete Hyperparameter
    learning_rate=Normal(mu=10, sigma=3),  # Continuous Hyperparameter
)
Discrete Hyperparameters: Hyperparameters can be set to one of four other Discrete Distribution functions
Math: explain what the q parameter is
Four other Discrete Distro functions you can use:
- QUniform(min_value, max_value, q) - Returns a value like round(Uniform(min_value, max_value) / q) * q
- QLogUniform(min_value, max_value, q) - Returns a value like round(exp(Uniform(min_value, max_value)) / q) * q
- QNormal(mu, sigma, q) - Returns a value like round(Normal(mu, sigma) / q) * q
- QLogNormal(mu, sigma, q) - Returns a value like round(exp(Normal(mu, sigma)) / q) * q
q is the “limiting” (quantization) parameter and is what makes each of the above Discrete. It acts like a step size: the sampled value is divided by q, rounded, and multiplied by q again, so every result is a multiple of q.
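A small sketch of the q step using only the standard library. The local function quniform below is a hypothetical stand-in that follows the formula round(Uniform(min_value, max_value) / q) * q; it is not the SDK's QUniform.

```python
import random


def quniform(min_value, max_value, q, rng):
    """Sketch of round(Uniform(min_value, max_value) / q) * q."""
    return round(rng.uniform(min_value, max_value) / q) * q


rng = random.Random(0)
samples = [quniform(0, 100, 5, rng) for _ in range(1000)]
distinct = sorted(set(samples))  # only multiples of 5 between 0 and 100
```

Even though Uniform can return any real number in the range, the quantized samples fall on at most 21 distinct values (0, 5, 10, ..., 100), which is why these count as Discrete.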
Continuous Hyperparameters require one of these four methods for defining a Search Space
Four Continuous Distro functions you can use:
- Uniform(min_value, max_value) - a value uniformly distributed between min_value and max_value
- LogUniform(min_value, max_value) - a value drawn from exp(Uniform) so that the log of the return value is uniformly distributed
- Normal(mu, sigma) - a real value normally distributed with mean mu and standard deviation sigma
- LogNormal(mu, sigma) - a value drawn from exp(Normal) so that the log of the return value is normally distributed
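The four distributions can be sketched with the standard library as below. These local functions are hypothetical stand-ins that follow the descriptions above, not the SDK classes. Note that for log_uniform the bounds apply to the log of the result, so the arguments -6 and -1 are assumed example values giving results between exp(-6) and exp(-1).

```python
import math
import random

rng = random.Random(0)


def uniform(min_value, max_value):
    return rng.uniform(min_value, max_value)


def log_uniform(min_value, max_value):
    # exp(Uniform): the log of the result is uniform on [min_value, max_value]
    return math.exp(rng.uniform(min_value, max_value))


def normal(mu, sigma):
    return rng.gauss(mu, sigma)


def log_normal(mu, sigma):
    # exp(Normal): the log of the result is normal with mean mu and std sigma
    return math.exp(rng.gauss(mu, sigma))


learning_rates = [log_uniform(-6, -1) for _ in range(1000)]
scales = [log_normal(0, 1) for _ in range(1000)]
```

The log variants are handy for values like a learning rate, where candidates spanning several orders of magnitude should be tried about equally often.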