SageMaker
SageMaker TL;DR?
A managed suite of ML tools for building, training, and deploying ML models
SageMaker
What’s the ML model lifecycle?
Gather & prepare data, train the model, evaluate/test, deploy
SageMaker
SageMaker Studio?
Web-based IDE covering the whole ML model lifecycle
SageMaker
What is SageMaker Domain?
A grouping for one project: holds its users, EFS volume, VPC configuration, security settings, etc.
SageMaker
What are SageMaker Containers?
Docker containers, run on EC2-backed ML instances, that package the algorithms and do the heavy lifting of training and inference on your data
SageMaker
What is SageMaker Hosting?
important
Deploys your models behind endpoints so other applications can invoke them directly
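As a rough illustration of what "invoke directly" can look like from a client application, here is a minimal sketch using boto3's `sagemaker-runtime` client. The endpoint name and the `{"instances": ...}` payload shape are assumptions for illustration; the real payload format depends on the model container behind the endpoint.

```python
import json


def build_invoke_request(endpoint_name: str, features: list) -> dict:
    """Build keyword arguments for sagemaker-runtime invoke_endpoint.

    The JSON body shape is an assumption; each model container
    defines the input format it accepts.
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps({"instances": [features]}),
    }


def invoke(endpoint_name: str, features: list) -> bytes:
    """Send one synchronous inference request (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_request(endpoint_name, features))
    return response["Body"].read()
```

Keeping the request builder separate from the network call makes the payload easy to inspect before anything is sent.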
SageMaker
Cost structure for SageMaker?
No flat fee for SageMaker itself; you pay for everything it launches (compute instances, storage, data processing), so the bill is complex
Data Prep
What part of SageMaker handles preparing data for training?
SageMaker Data Wrangler
Data Prep
Can Data Wrangler handle image data?
Yes
Data Prep
What is a Feature?
A model input derived from raw data through feature engineering, like turning date of birth into age-in-years
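The date-of-birth example can be sketched in plain Python. The function name `age_in_years` is hypothetical, and this is only one way to define the transform.

```python
from datetime import date


def age_in_years(date_of_birth: date, today: date) -> int:
    """Derive the 'age' feature from a raw date-of-birth value."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Passing `today` explicitly keeps the transform deterministic, which matters when the same feature is recomputed later.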
Data Prep
How does Data Wrangler handle Features?
Via Feature Store: store a feature definition together with the transform that produces it, then reuse it across data sources
Deployment and Inference
Can SageMaker do async inference?
Yes: upload the large input to S3, then call the SageMaker endpoint with the S3 object's location; the result is written back to S3
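The upload-then-invoke flow might look like the sketch below, assuming boto3 and an endpoint that was created with an async inference configuration. The bucket, key, endpoint name, and content type are placeholders.

```python
def build_async_request(endpoint_name: str, bucket: str, key: str) -> dict:
    """Build keyword arguments for sagemaker-runtime invoke_endpoint_async."""
    return {
        "EndpointName": endpoint_name,
        # The request carries only a pointer to the payload, not the payload itself.
        "InputLocation": f"s3://{bucket}/{key}",
        "ContentType": "application/json",  # assumed; depends on the model
    }


def submit_async(endpoint_name: str, bucket: str, key: str, local_path: str) -> str:
    """Upload the input, queue the request, return where the result will appear.

    Needs boto3 and AWS credentials.
    """
    import boto3  # imported lazily so the request builder works without it

    boto3.client("s3").upload_file(local_path, bucket, key)
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_async(
        **build_async_request(endpoint_name, bucket, key)
    )
    return response["OutputLocation"]  # S3 URI of the eventual result
```

The call returns as soon as the request is queued; the caller then polls or is notified for the object at the returned output location.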
Deployment and Inference
Can async handle multiple requests?
Not in a single call: each request points at one S3 object (the endpoint queues requests as they arrive)
Deployment and Inference
How can SageMaker handle multiple requests for inference?
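SageMaker Batch Transform: like async in that input and output live in S3, but instead of calling an endpoint you start a transform job pointed at a whole set of S3 objects; SageMaker spins up instances, runs inference over all of them, and shuts down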
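A batch request could be sketched as follows, assuming boto3's `create_transform_job`. Note that the job reads everything under an S3 prefix rather than calling a long-lived endpoint. The job name, model name, CSV content type, and instance type are placeholder assumptions.

```python
def build_transform_job(job_name: str, model_name: str,
                        input_prefix: str, output_prefix: str) -> dict:
    """Build keyword arguments for sagemaker create_transform_job."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",  # every object under the prefix
                    "S3Uri": input_prefix,
                }
            },
            "ContentType": "text/csv",  # assumed input format
            "SplitType": "Line",        # treat each line as one record
        },
        "TransformOutput": {"S3OutputPath": output_prefix},
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
        },
    }


def start_transform_job(job_name: str, model_name: str,
                        input_prefix: str, output_prefix: str) -> None:
    """Start the batch job (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    boto3.client("sagemaker").create_transform_job(
        **build_transform_job(job_name, model_name, input_prefix, output_prefix)
    )
```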
Deployment and Inference
Why have both async and batch?
Async is for near-real-time, interactive requests that go through S3 only because the payload is large. Batch is true offline batch processing over a whole dataset.
Tuning
What is AMT in SageMaker?
Automatic Model Tuning
Tuning
How does AMT work?
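Define the objective metric and (usually) the hyperparameter ranges to search; AMT runs many training jobs and finds the hyperparameter values that best optimize the metric. It can also choose the ranges and other tuning settings for you.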
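A tuning configuration might be sketched like this, assuming the ranges are supplied explicitly and the dict is passed as `HyperParameterTuningJobConfig` to boto3's `create_hyper_parameter_tuning_job`. The hyperparameter names `eta` and `max_depth` and their bounds are illustrative assumptions, and the training job definition that the API also requires is omitted.

```python
def build_tuning_config(metric_name: str, max_jobs: int, max_parallel: int) -> dict:
    """Build a HyperParameterTuningJobConfig-style dict for AMT."""
    return {
        "Strategy": "Bayesian",
        # The objective metric AMT tries to optimize.
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": metric_name,
        },
        # Caps on how many training jobs AMT may launch.
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        # Search space; the API expects the bounds as strings.
        "ParameterRanges": {
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.3"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
        },
    }
```

AMT then launches training jobs within these limits and reports the job whose metric value is best.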