Concepts
What’s an n-gram?
A contiguous sequence of n words (or tokens): a 1-gram is 1 word, a 5-gram is 5 consecutive words, etc.
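The definition above can be sketched with a minimal word-level n-gram extractor (assumes whitespace-separated words; real tokenizers are more involved):

```python
# Slide a window of size n over the word list to collect n-grams.
def ngrams(text, n):
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(ngrams("the cat sat on the mat", 2))
# bigrams: ('the', 'cat'), ('cat', 'sat'), ('sat', 'on'), ...
```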
Concepts
What is Tokenization?
Convert raw text into sequence of tokens
Concepts
What happens to punctuation in Tokenization?
Usually kept – each punctuation mark typically becomes its own token
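A toy regex tokenizer illustrates punctuation becoming separate tokens (a sketch only – production subword tokenizers like BPE work differently):

```python
import re

# \w+ matches runs of word characters; [^\w\s] matches any single
# character that is neither a word character nor whitespace,
# so each punctuation mark comes out as its own token.
def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Hello, world!"))  # ['Hello', ',', 'world', '!']
```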
Concepts
Is a word just a token?
Not necessarily – some words are split into multiple tokens
Concepts
Example of a single word split into multiple tokens?
“Richard” might become “Rich” + “ard”, since subword tokenizers (e.g. BPE) split rarer words into more frequent pieces
Concepts
What is a Context Window?
Max tokens the model can consider at once.
Concepts
How does Context Window affect chatbots and prompting?
Limits how much conversation history and prompt text can fit into a single model input
Concepts
How does Context Window affect image or video input?
Images and video are also encoded as tokens, so their size counts against the window along with any text context or prompts
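One common way to stay inside a context window is to drop the oldest tokens; a hedged sketch (the window size here is an invented number, and real systems often summarize rather than truncate):

```python
# Keep only the most recent tokens so the input fits the window.
def fit_context(tokens, max_tokens=8):
    return tokens[-max_tokens:]

history = ["sys", "hi", "how", "are", "you", "I", "am", "fine", "thanks", "!"]
print(fit_context(history))  # drops the 2 oldest tokens
```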
Concepts
What’s the relation between tokens and vectors in a vector DB?
A single token (like “cat”) corresponds to a single vector of many float values
Concepts
Why does a single token have a vector with many values?
Captures many types of semantic meaning, sentiment, syntactic role, …
Concepts
Example of how vectors are useful for retrieval in RAG?
Find semantically similar text by searching for the vectors nearest to the query’s vector
Concepts
What’s the technical name for this similarity search between vectors?
k-nearest neighbor
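A brute-force k-nearest-neighbor sketch using cosine similarity (the 2-D vectors and vocabulary are invented for illustration; real embeddings have hundreds of dimensions and vector DBs use approximate indexes):

```python
import math

# Cosine similarity: dot product divided by the product of norms.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Score every stored vector against the query, keep the top k.
def knn(query, vectors, k):
    ranked = sorted(vectors.items(),
                    key=lambda kv: cosine(query, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

vecs = {"cat": [1.0, 0.9], "kitten": [0.9, 1.0], "car": [1.0, -0.8]}
print(knn([1.0, 1.0], vecs, 2))  # "cat" and "kitten" beat "car"
```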
Hyperparameters
What are hyperparameters?
Settings that define the model structure, algorithm and process
Hyperparameters
When do you set hyperparameters?
Set before training begins – hyperparameters define how the model learns, and are not themselves learned from the data
Hyperparameters
Example hyperparameters?
Learning rate, batch size, number of epochs
Hyperparameters
What do you get by tuning hyperparameters?
Reduce overfitting, improve accuracy
Hyperparameters
What is Learning Rate?
How large or small the steps are when updating weights during training
Hyperparameters
What happens if you set a small learning rate?
Converges slowly, but the final weights tend to be accurate
Hyperparameters
What happens if you set a high learning rate?
Quicker to converge, but could over-shoot the right weights and not be accurate enough
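The trade-off on the two cards above can be seen with toy gradient descent on f(w) = (w − 3)², where the true optimum is w = 3 (the function and the learning rates are illustrative assumptions):

```python
# Repeatedly step opposite the gradient; step size scales with lr.
def descend(lr, steps=50, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2
        w -= lr * grad
    return w

print(descend(0.01))  # small lr: after 50 steps, still far from w = 3
print(descend(0.5))   # moderate lr: converges to w = 3
print(descend(1.1))   # too large: overshoots each step and diverges
```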
Hyperparameters
What is Batch Size?
Number of training examples used to compute one weight update
Hyperparameters
What happens with small batch sizes?
Noisier, less stable updates (each gradient is estimated from a small sample), but more weight updates per pass and often better generalization
Hyperparameters
What happens with large batch sizes?
More stable gradient estimates and faster per epoch, but needs more memory and can generalize worse
Hyperparameters
What is Number of Epochs?
How many times the model will iterate over the entire training set
Hyperparameters
What happens with too few Epochs?
Underfitting – the model hasn’t seen the data enough times to learn its patterns
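How epochs and batch size together shape a training loop can be sketched as follows (the data, sizes, and counter are stand-ins, not a real framework):

```python
data = list(range(10))      # 10 training examples
batch_size = 4              # hyperparameter: examples per weight update
epochs = 3                  # hyperparameter: full passes over the data

updates = 0
for epoch in range(epochs):             # one epoch = one pass over all data
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]  # one weight update per batch
        updates += 1

print(updates)  # 3 epochs x 3 batches (4 + 4 + 2 examples) = 9 updates
```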