ML model deployment sections
Where are the models in SageMaker hosted?
in Docker containers registered in ECR (Elastic Container Registry)
How to distribute TensorFlow across multiple machines?
using a framework called Horovod
or Parameter Servers
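A minimal sketch of how the two options above are usually selected, assuming the `distribution` argument of the SageMaker Python SDK TensorFlow estimator (Horovod is launched through MPI there). The helper name `distribution_config` is hypothetical; it only builds the dictionary and does not start a job.

```python
def distribution_config(strategy, processes_per_host=1):
    """Build a `distribution` dict for a SageMaker TensorFlow estimator.

    `strategy` is either "horovod" or "parameter_server" (names chosen
    for this sketch, not SDK constants).
    """
    if strategy == "parameter_server":
        return {"parameter_server": {"enabled": True}}
    if strategy == "horovod":
        # Horovod runs on top of MPI, so it is enabled via the "mpi" key.
        return {"mpi": {"enabled": True, "processes_per_host": processes_per_host}}
    raise ValueError(f"unknown strategy: {strategy!r}")
```

The result would be passed as `distribution=...` when constructing the estimator, together with an instance count greater than one.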
How to make containers compatible with SageMaker?
there is a library for that
run pip install sagemaker-containers
What is the structure of a training container?
/opt/ml
input/config:
hyperparameters.json
resourceConfig.json
input/data:
one subdirectory per input channel, holding that channel's data
code:
python or any other code that does the training should be here
output:
output goes here
output/failure:
failure details go here
model:
trained model artifacts go here; the inference container loads them from this directory
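The layout above can be sketched as a tiny training entry point. This is an illustration under assumptions, not SageMaker's own code: the "model" is a placeholder dictionary, `model.json` is a hypothetical file name, and `base_dir` is a parameter only so the sketch can be run outside a container.

```python
import json
from pathlib import Path


def load_hyperparameters(base_dir="/opt/ml"):
    """Read input/config/hyperparameters.json (values arrive as strings)."""
    path = Path(base_dir) / "input" / "config" / "hyperparameters.json"
    with open(path) as f:
        return json.load(f)


def run_training(base_dir="/opt/ml"):
    """Write a placeholder model to model/, or the error to output/failure."""
    base = Path(base_dir)
    try:
        params = load_hyperparameters(base)
        # Placeholder for real training: just record one hyperparameter.
        model = {"learning_rate": float(params.get("learning_rate", "0.1"))}
        model_dir = base / "model"
        model_dir.mkdir(parents=True, exist_ok=True)
        (model_dir / "model.json").write_text(json.dumps(model))
        return True
    except Exception as exc:
        # On failure, describe the error in the output/failure file.
        out_dir = base / "output"
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / "failure").write_text(str(exc))
        return False
```

Inside a real container the default `base_dir` of `/opt/ml` would be used unchanged.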
SageMaker on the edge
SageMaker Neo
compiles models for ARM, Intel, and Nvidia processors
running inference on the device avoids a few hundred milliseconds of round-trip latency
Frameworks that Neo can compile
TensorFlow, MXNet, PyTorch, ONNX, XGBoost
Does Neo come with a compiler or a runtime?
it comes with both
How to pair Neo with IoT Greengrass?
a Neo-compiled model can be deployed to an HTTPS endpoint
or paired with IoT Greengrass to run on the edge device itself:
Greengrass runs the inference through Lambda functions
Security in SageMaker
uses:
be careful with PII
How to keep data protected at rest in SageMaker?
KMS
anything under /opt/ml and /tmp can be encrypted using KMS
Securing Training Data
Can you encrypt inter-node communications?
yes you can
it can increase training time and cost, especially with deep learning
also known as inter-container traffic encryption
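The at-rest and in-transit options above map to a few fields of a training job request. A hedged sketch, assuming the field names of the `CreateTrainingJob` API (`KmsKeyId`, `VolumeKmsKeyId`, `EnableInterContainerTrafficEncryption`); the helper name, instance type, and volume size are placeholders, and the dictionary is only a fragment of a full request.

```python
def encrypted_training_fragment(kms_key_id, output_s3_uri, instance_count=2):
    """Return the encryption-related part of a CreateTrainingJob request."""
    return {
        # Encrypts the model artifacts written to S3.
        "OutputDataConfig": {
            "S3OutputPath": output_s3_uri,
            "KmsKeyId": kms_key_id,
        },
        # Encrypts the storage volume attached to each training instance.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",   # placeholder instance type
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,             # placeholder size
            "VolumeKmsKeyId": kms_key_id,
        },
        # Only matters when several containers talk to each other.
        "EnableInterContainerTrafficEncryption": instance_count > 1,
    }
```

The fragment would be merged with the remaining required fields (job name, role, algorithm, inputs) before calling the API.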
VPC with SageMaker
yes, it is possible
it is also possible to cut internet access from a notebook
without internet access, a VPC endpoint for S3 is needed to reach the data
notebooks and training/inference containers are internet-enabled by default
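A sketch of the notebook settings involved, assuming the `CreateNotebookInstance` API fields `SubnetId`, `SecurityGroupIds`, and `DirectInternetAccess`. The helper name and the instance type are placeholders; nothing here creates the S3 VPC endpoint, which has to exist separately.

```python
def private_notebook_request(name, role_arn, subnet_id, security_group_ids):
    """Build a CreateNotebookInstance request with internet access disabled."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": "ml.t3.medium",    # placeholder instance type
        "RoleArn": role_arn,
        "SubnetId": subnet_id,
        "SecurityGroupIds": list(security_group_ids),
        # The default is "Enabled"; disabling requires a subnet in a VPC.
        "DirectInternetAccess": "Disabled",
    }
```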
SageMaker logging and monitoring
CloudWatch can log, monitor and alarm on:
CloudTrail records actions from users, roles, and services within SageMaker
- log files delivered to S3 for auditing
Does SageMaker work with spot instances?
yes; a spot instance can be interrupted, but you can use S3 checkpointing to pick up where you left off
waiting for spot capacity can also increase training time
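The checkpoint-and-resume idea can be sketched with a toy training loop. Assumptions: checkpoints are written to a local directory that SageMaker syncs to S3 (`/opt/ml/checkpoints` is, to my knowledge, the default local path), `state.json` is a hypothetical file name, and `stop_after` exists only to simulate a spot interruption.

```python
import json
from pathlib import Path


def train(total_epochs, checkpoint_dir="/opt/ml/checkpoints", stop_after=None):
    """Run a toy loop that resumes from the last saved epoch, if any."""
    ckpt_dir = Path(checkpoint_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    ckpt_file = ckpt_dir / "state.json"

    state = {"epoch": 0, "loss": None}
    if ckpt_file.exists():
        # A previous (interrupted) run left a checkpoint: resume from it.
        state = json.loads(ckpt_file.read_text())

    for epoch in range(state["epoch"], total_epochs):
        if stop_after is not None and epoch >= stop_after:
            break  # simulated spot interruption
        state = {"epoch": epoch + 1, "loss": 1.0 / (epoch + 1)}  # fake loss
        ckpt_file.write_text(json.dumps(state))  # checkpoint every epoch
    return state
```

Calling `train` a second time with the same directory continues from the saved epoch instead of starting over.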
Elastic Inference
accelerates deep learning inference
cheaper than a GPU instance
How to use EI?
deploy the model to a CPU instance with an EI accelerator attached
e.g. ml.eia1.large
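A sketch of where the accelerator is specified, assuming the `AcceleratorType` field of a production variant in the `CreateEndpointConfig` API. The helper name, config name suffix, variant name, and CPU instance type are placeholders.

```python
def endpoint_config_with_ei(model_name, accelerator_type="ml.eia1.large"):
    """Build a CreateEndpointConfig request for a CPU instance plus EI."""
    return {
        "EndpointConfigName": f"{model_name}-ei-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": "ml.m5.large",   # CPU host (placeholder)
                "InitialInstanceCount": 1,
                "AcceleratorType": accelerator_type,  # the attached EI
            }
        ],
    }
```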
can you do EI with notebooks?
yes you can
Where does EI work?
with deep learning frameworks and algorithms
pre-built TF or MXNet containers
image classification (built-in algorithm)
object detection (built-in algorithm)
Can you use Automatic Scaling (AS) for inference?
yes you can
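Endpoint scaling goes through Application Auto Scaling. A minimal sketch of the two requests involved, assuming the `RegisterScalableTarget` and `PutScalingPolicy` field names and the predefined metric `SageMakerVariantInvocationsPerInstance`; the helper name and policy name are hypothetical.

```python
def autoscaling_requests(endpoint_name, variant_name, min_instances,
                         max_instances, target_invocations):
    """Build the scalable-target and target-tracking policy requests."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Scale so each instance handles roughly this many invocations.
            "TargetValue": float(target_invocations),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return target, policy
```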
SageMaker and AZs?
SageMaker distributes instances across multiple AZs for better resiliency
when using a VPC, configure at least 2 subnets, each in a different AZ
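The subnet rule above can be checked before building a `VpcConfig`. A small sketch under assumptions: the subnet-to-AZ mapping is supplied by the caller (in practice it would come from an EC2 lookup), and the helper names are hypothetical.

```python
def spans_multiple_azs(subnet_to_az, minimum=2):
    """Return True if the subnets cover at least `minimum` distinct AZs."""
    return len(set(subnet_to_az.values())) >= minimum


def vpc_config(subnet_to_az, security_group_ids):
    """Build a VpcConfig dict, refusing layouts confined to a single AZ."""
    if not spans_multiple_azs(subnet_to_az):
        raise ValueError("use at least two subnets in different AZs")
    return {
        "Subnets": sorted(subnet_to_az),
        "SecurityGroupIds": list(security_group_ids),
    }
```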