What are artificial neural networks?
Artificial neural networks comprise many switching units (artificial neurons) that are connected according to a specific network architecture. The objective of an artificial neural network is to learn how to transform inputs into meaningful outputs.
Activation functions
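A few common activation functions can be sketched as follows (the choice of sigmoid, tanh and ReLU here is illustrative; the notes do not name specific functions):

```python
import numpy as np

# Three commonly used activation functions (illustrative choices).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # squashes to (0, 1)
print(tanh(x))     # squashes to (-1, 1)
print(relu(x))     # zeroes out negative inputs
```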
What is a transformer?
Simple 2-layer neural network
Initially, the weights are set to random values; training then adjusts them to reduce the loss.
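A minimal sketch of such a 2-layer network in NumPy (the layer sizes and the ReLU/softmax choices are illustrative assumptions, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Initially, the weights are random.
W1 = rng.normal(size=(4, 8))   # layer 1: 4 inputs -> 8 hidden units
W2 = rng.normal(size=(8, 3))   # layer 2: 8 hidden units -> 3 outputs

def forward(x):
    h = np.maximum(0.0, x @ W1)         # hidden activations (ReLU)
    z = h @ W2                          # output scores
    e = np.exp(z - z.max())
    return e / e.sum()                  # softmax -> output probabilities

x = rng.normal(size=4)                  # one input vector
p = forward(x)
print(p)        # three probabilities
print(p.sum())  # they sum to 1
```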
How do we embed tokens?
How do we measure the similarity between two vectors?
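One standard answer is cosine similarity, the dot product of the two vectors divided by the product of their lengths (a minimal sketch; the notes do not fix a particular measure):

```python
import numpy as np

def cosine_similarity(u, v):
    # Dot product normalised by the vectors' lengths: 1 for the same
    # direction, 0 for orthogonal, -1 for opposite directions.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
print(cosine_similarity(a, a))  # same direction -> 1.0
print(cosine_similarity(a, b))  # 45 degrees apart -> ~0.707
```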
How are word embeddings done?
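In the simplest form, embedding a token is a row lookup in a trainable matrix with one vector per vocabulary entry (the toy vocabulary and dimension below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2}   # toy vocabulary (illustrative)
E = rng.normal(size=(len(vocab), 5))     # one trainable 5-dim row per token

def embed(tokens):
    # Embedding = selecting the rows of E indexed by the token ids.
    return E[[vocab[t] for t in tokens]]

print(embed(["the", "cat"]).shape)  # (2, 5): one vector per token
```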
Transformer encoder
Input vectors → transformed vectors → final outputs
Scaled dot-product attention
Scaled dot-product self-attention
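Scaled dot-product self-attention computes softmax(QKᵀ/√d)·V, where Q, K and V are projections of the same inputs. A minimal NumPy sketch (all shapes and the random projection matrices are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Self-attention: queries, keys and values all come from the same X.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    # Scaled dot-product: softmax(Q K^T / sqrt(d)), one row per token.
    A = softmax(Q @ K.T / np.sqrt(d))
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))                       # 4 tokens, 6-dim vectors
Wq, Wk, Wv = (rng.normal(size=(6, 6)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
print(out.shape, A.shape)  # (4, 6) (4, 4)
```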
Multi-head attention
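Multi-head attention runs several attention operations in parallel, each with its own projections, then concatenates and mixes their outputs. A compact sketch (head count, widths and random weights are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def head(X, d_head):
    # Each head has its own query/key/value projections.
    Wq, Wk, Wv = (rng.normal(size=(X.shape[1], d_head)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(Q @ K.T / np.sqrt(d_head)) @ V

X = rng.normal(size=(4, 8))                    # 4 tokens, 8-dim vectors
heads = [head(X, d_head=2) for _ in range(4)]  # 4 heads of width 2
Wo = rng.normal(size=(8, 8))                   # output projection
out = np.concatenate(heads, axis=-1) @ Wo      # concatenate, then mix
print(out.shape)  # (4, 8)
```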
Bidirectional Encoder Representations from Transformers (BERT)
BERT loss
Cross-entropy loss function
Assesses the probability assigned to the correct word in the probability distribution that the network outputs: if the correct word has a high probability, the loss is low.
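Concretely, cross-entropy is the negative log-probability of the correct word (the toy distribution below is illustrative):

```python
import numpy as np

def cross_entropy(probs, correct_idx):
    # Loss = negative log of the probability assigned to the correct word.
    return float(-np.log(probs[correct_idx]))

probs = np.array([0.7, 0.2, 0.1])  # toy output distribution over 3 words
print(cross_entropy(probs, 0))  # correct word is likely  -> low loss (~0.357)
print(cross_entropy(probs, 2))  # correct word is unlikely -> high loss (~2.303)
```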
Different ways of using language models
Single-task supervised training
Train a transformer LM on labelled sequences to predict correct labels.
Unsupervised pre-training + supervised fine-tuning
Train a transformer LM using BERT on unlabelled sequences. Then add a new output layer and continue training it on labelled sequences to predict correct labels.
Unsupervised pre-training + supervised training of small downstream classifiers
Train a transformer LM using BERT on unlabelled sequences, then freeze the weights. Run the frozen LM on labelled sequences and use its outputs as inputs to train a new, small model to predict labels.
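The downstream step can be sketched as follows, with random vectors standing in for the frozen LM's outputs and a logistic-regression classifier as the small new model (everything here is an illustrative assumption; the notes do not fix the classifier):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for frozen LM outputs: one fixed feature vector per labelled
# sequence. A real pipeline would get these from the pre-trained LM.
features = rng.normal(size=(100, 16))
labels = (features[:, 0] > 0).astype(float)     # toy binary labels

# Small downstream classifier: logistic regression by gradient descent.
# Only w is trained; the "LM" features never change.
w = np.zeros(16)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-features @ w))     # predicted probabilities
    w -= 0.1 * features.T @ (p - labels) / len(labels)

preds = (1.0 / (1.0 + np.exp(-features @ w))) > 0.5
acc = (preds == labels).mean()
print(acc)  # training accuracy of the small classifier
```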
Future: unsupervised pre-training at scale + prompting (few-shot learning)
Train a very large transformer LM autoregressively on very diverse data. Then try to find suitable prompts to induce it to predict correct labels.
Using a pre-trained protein language model
Correlated mutations in proteins
Predicting the 3D structure of proteins by amino acid co-evolution
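Co-evolving residue pairs can be detected as alignment columns that vary together, e.g. by mutual information between columns of a multiple sequence alignment (the toy MSA below is illustrative, not real protein data):

```python
import numpy as np
from collections import Counter

# Toy MSA: columns 0 and 4 co-vary (A<->A, G<->G); column 1 is conserved.
msa = ["ARNDA", "GRNDG", "ARNDA", "GRNDG", "ARNDA"]

def column(i):
    return [seq[i] for seq in msa]

def mutual_information(i, j):
    # MI = sum over letter pairs of p(a,b) * log(p(a,b) / (p(a) * p(b))).
    n = len(msa)
    ci, cj = Counter(column(i)), Counter(column(j))
    cij = Counter(zip(column(i), column(j)))
    return sum(c / n * np.log((c / n) / (ci[a] / n * cj[b] / n))
               for (a, b), c in cij.items())

print(mutual_information(0, 4))  # co-evolving columns -> high MI
print(mutual_information(0, 1))  # conserved column    -> MI = 0
```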
What does AlphaFold2 do?
BERT-style masking for training AlphaFold2 (AF): positions in the input MSA are masked and the network learns to reconstruct them as an auxiliary objective.