How can you iterate rapidly on models being built in SageMaker Notebooks?
Use SageMaker Local Mode to train models on the notebook instance itself, avoiding the overhead of provisioning infrastructure and moving data
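A minimal sketch of how Local Mode is usually selected with the SageMaker Python SDK (v2 parameter names): the only change is the instance type. The helper estimator_kwargs, the image name and the role ARN are hypothetical placeholders, and the SDK call itself is left as a comment so the snippet runs without the SDK or Docker installed.

```python
def estimator_kwargs(image_uri, role, local=True):
    """Build keyword arguments for sagemaker.estimator.Estimator (SDK v2 names)."""
    return {
        "image_uri": image_uri,
        "role": role,
        "instance_count": 1,
        # "local" runs the training container on the notebook instance itself;
        # an "ml.*" type provisions managed training infrastructure instead.
        "instance_type": "local" if local else "ml.m5.xlarge",
    }


# Placeholder image name and role ARN, for illustration only.
kwargs = estimator_kwargs(
    "my-training-image:latest",
    "arn:aws:iam::123456789012:role/ExampleRole",
)

# With the SDK and Docker available, the kwargs would be used roughly like this:
#   from sagemaker.estimator import Estimator
#   Estimator(**kwargs).fit({"train": "file://./data/train"})
```

Switching local to False is then the only change needed to run the same job on managed instances.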
What is Amazon SageMaker Ground Truth?
A service that uses human labellers (an in-house team, specialist vendors or Mechanical Turk) to label data, and trains a model on those labels
It then uses this model to label the ‘easy’ cases automatically, reducing labelling cost
What are .lst files?
Tab-separated files used to list data, such as images and their labels
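A small sketch of writing and reading a .lst file, assuming the single-label, three-column layout used by the built-in image classification algorithm (image index, class label, relative path, separated by tabs). The helper names and file paths are made up for illustration.

```python
import csv
import io


def write_lst(rows):
    """Serialise (index, label, relative_path) tuples as tab-separated lines."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for index, label, path in rows:
        writer.writerow([index, label, path])
    return buf.getvalue()


def read_lst(text):
    """Parse .lst text back into (index, label, relative_path) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(int(index), int(label), path) for index, label, path in reader]


# Hypothetical two-image dataset: label 1 = cat, label 0 = dog.
lst_text = write_lst([(0, 1, "cats/img_001.jpg"), (1, 0, "dogs/img_002.jpg")])
entries = read_lst(lst_text)
```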
Where can SageMaker algorithms be sourced?
They can be built-in (provided by SageMaker), bought from the AWS Marketplace, or custom
What are the main built-in SageMaker Algorithms?
K-Means - clustering algorithm; the SageMaker implementation is optimised for ‘web scale’ datasets
Latent Dirichlet Allocation (LDA) - performs text analysis and topic discovery
XGBoost - gradient boosted trees algorithm; used on tabular datasets
Where do the assets for custom SageMaker Algorithms live?
The code is packaged as a container image hosted on ECR; the model artefacts are stored on S3
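To make the two locations concrete, the sketch below builds the usual shape of each reference: an ECR image URI for the code, and the S3 key under which a training job writes its model archive. The helper names, account ID, repository, bucket and job name are all hypothetical.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Standard ECR URI for the container image holding the algorithm code."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def model_artifact_uri(output_path, job_name):
    """S3 location of the model archive written under a job's output path."""
    return f"{output_path.rstrip('/')}/{job_name}/output/model.tar.gz"


# Placeholder values, for illustration only.
image = ecr_image_uri("123456789012", "eu-west-1", "my-algorithm")
artifact = model_artifact_uri("s3://example-bucket/models/", "my-job-001")
```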
Can you view the code for SageMaker Algorithms from the Marketplace?
No
What services does SageMaker support as data sources?
S3, EFS and FSx for Lustre
How are parts of the data in the dataset (i.e. train vs validation) managed?
With channels
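A sketch of how channels appear in the InputDataConfig part of a CreateTrainingJob request, with one named channel per split. The helper build_input_data_config, the bucket and the prefixes are assumptions for illustration; the snippet only builds the request structure and makes no AWS calls.

```python
def build_input_data_config(channels, content_type="text/csv"):
    """Turn {channel_name: s3_uri} into an InputDataConfig list."""
    return [
        {
            # Inside the training container the data for this channel is
            # made available under /opt/ml/input/data/<ChannelName>.
            "ChannelName": name,
            "ContentType": content_type,
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": uri,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }
        for name, uri in channels.items()
    ]


# Hypothetical bucket and prefixes.
input_data_config = build_input_data_config(
    {
        "train": "s3://example-bucket/data/train/",
        "validation": "s3://example-bucket/data/validation/",
    }
)
```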
How should failed training jobs be debugged?
Not through the SageMaker Console; training job logs are written to CloudWatch Logs, so inspect them there
What are the general types of hyperparameters?
What technique does SageMaker Automatic Model Tuning use?
Bayesian optimisation
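A sketch of what selecting the Bayesian strategy looks like in a HyperParameterTuningJobConfig, the structure passed to CreateHyperParameterTuningJob. The metric and parameter names assume an XGBoost job, and the ranges and job limits are arbitrary example values; bayesian_tuning_config is a hypothetical helper and no AWS call is made.

```python
def bayesian_tuning_config(metric_name, max_jobs=20, max_parallel=2):
    """Build a tuning job config that uses the Bayesian search strategy."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": metric_name,
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            # Fewer parallel jobs means each new job can learn from more
            # completed results, which suits a sequential Bayesian search.
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # The API expects range bounds as strings.
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.3",
                 "ScalingType": "Logarithmic"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10",
                 "ScalingType": "Linear"},
            ],
        },
    }


tuning_config = bayesian_tuning_config("validation:auc")
```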
What are the general steps for performing hyperparameter tuning?
Which SageMaker tool is used for hyperparameter optimisation?
SageMaker Automatic Model Tuning
What are the steps to hosting a model with SageMaker?
What are some key considerations when managing SageMaker deployments?
What are the key considerations when securing SageMaker Notebooks?
Can you lock down access per SageMaker Notebook using IAM?
No
What are the key considerations when securing SageMaker models?