Word2Vec models
Models that learn word-embedding vectors by training on word-prediction tasks
Continuous Bag of Words (CBOW) - predicts the target word given the surrounding context words, maximising the log probability of the correct word; the weights updated in the process are used as the embedding vectors
Skip-gram - the inverse of CBOW: predicts the context words given the target word
More details: https://arxiv.org/abs/1411.2738
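To make the two modes concrete, a minimal sketch using the gensim library (an assumption, not mentioned in these notes; gensim's Word2Vec implementation exposes the same CBOW/skip-gram switch):

from gensim.models import Word2Vec

# Toy corpus: one tokenized sentence per entry.
sentences = [
    ["linux", "ready", "for", "prime", "time"],
    ["the", "cat", "jumped", "over", "the", "lazy", "fox"],
]

# sg=0 trains CBOW (predict the target word from its context);
# sg=1 trains skip-gram (predict context words from the target word).
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skip_gram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# The trained weights are the embedding vectors.
print(cbow.wv["cat"].shape)  # (50,)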
BlazingText
AWS implementation of the Word2Vec and text classification algorithms, optimized for multi-core CPUs
Not parallelizable across instances, except batch skip-gram, which can be distributed over multiple CPU instances
Modes (Word2Vec | text classification):
Single CPU instance: CBOW, skip-gram, batch skip-gram | supervised
Single GPU instance: CBOW, skip-gram | supervised
Multiple CPU instances: batch skip-gram | not supported
Input
Unsupervised:
text file with one training sentence per line, space-separated tokens
Supervised:
One sentence per line
The first “word” in the sentence is the string __label__<label>
Example:
__label__4 linux ready for prime time
__label__2 the cat jump over the lazy fox
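A quick sketch of writing that supervised file from labeled examples (the list of pairs and the file name train.txt are made up for illustration):

labeled = [
    (4, "linux ready for prime time"),
    (2, "the cat jump over the lazy fox"),
]

# One example per line: __label__<label> first, then the space-separated tokens.
with open("train.txt", "w") as f:
    for label, text in labeled:
        f.write(f"__label__{label} {text}\n")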
Hyperparams:
Word2Vec:
- mode (cbow, skipgram, batch_skipgram)
- learning_rate
- window_size
- vector_dim
- negative_samples
Text classification:
- epochs
- learning_rate
- word_ngrams
- vector_dim
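A minimal sketch of setting these hyperparameters on a BlazingText training job with the SageMaker Python SDK (the IAM role, S3 paths, instance type, and hyperparameter values are placeholders, not recommendations):

import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
# Resolve the BlazingText container image for the current region.
container = image_uris.retrieve("blazingtext", session.boto_region_name)

estimator = Estimator(
    image_uri=container,
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=1,  # >1 CPU instances only works with mode="batch_skipgram"
    instance_type="ml.c5.4xlarge",
    output_path="s3://my-bucket/blazingtext/output",  # placeholder bucket
    sagemaker_session=session,
)

# Supervised (text classification) hyperparameters from the list above.
estimator.set_hyperparameters(
    mode="supervised",
    epochs=10,
    learning_rate=0.05,
    word_ngrams=2,
    vector_dim=100,
)

estimator.fit({"train": "s3://my-bucket/blazingtext/train"})  # placeholder channel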