How are protein constructs designed and why
AI:
- Can make a predicted model using AlphaFold 3, RoseTTAFold, Boltz-1, Chai-1
You do this to design a modified version of the protein that is predicted to crystallize better than the regular unmodified protein
If you used affinity tags to purify the protein you crystallized, do they need to be removed
Also if you want to design the nucleic acids in a protein-nucleic acid complex (the length of the nucleic acids is important for crystallization)
Describe AlphaFold
How is it trained
Predicts the structure of proteins from their sequences
Works better with 30-50 related sequences because residues that covary are close in the 3D structure:
- covary meaning, for example, an Asp and an Arg sit next to each other and interact; if the Asp mutates, the Arg changes too so the interaction is kept in 3D
- having many related sequences can show patterns of covariation
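A minimal sketch of what "patterns of covariation" means in an alignment, using mutual information between two columns. The toy alignment is made up, and AlphaFold does not compute this statistic explicitly; it only illustrates the kind of signal an MSA carries.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Hypothetical toy alignment: column 1 and column 3 change together
# (D with R, E with K, K with E), while column 0 is fully conserved.
toy_msa = [
    "ADGRL",
    "ADGRL",
    "AEGKL",
    "AEGKL",
    "AKGEL",
    "AKGEL",
]

covarying = mutual_information(toy_msa, 1, 3)  # high: the columns swap together
conserved = mutual_information(toy_msa, 0, 3)  # zero: column 0 never changes
```

A conserved column carries no covariation signal, which is one reason a single sequence or a set of identical sequences is not enough.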
Trained:
- it uses the many protein structures in the Protein Data Bank (PDB) and their protein sequences
- does multiple sequence alignment
Explain AlphaFold 3 process
Give it sequences, ligands or covalent bonds (not just sequence)
It does:
- template search: looks for existing structures of related proteins to use as templates
- genetic search: does MSA
- conformer generation: generates reference 3D conformers for the input components (such as ligands)
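A sketch of what "give it sequences and ligands" can look like as an input file. The field names follow my understanding of the JSON input of the open-source AlphaFold 3 release and should be checked against its documentation; the job name, sequences and ligand choice are all made-up placeholders.

```python
import json

# Hypothetical job: one protein chain, one DNA strand, one ADP ligand.
# Field names are an assumption based on the open-source AlphaFold 3
# input format; verify them before running a real job.
job = {
    "name": "toy_protein_dna_adp",
    "modelSeeds": [1],
    "sequences": [
        {"protein": {"id": "A", "sequence": "MKTAYIAKQRQISFVKSHFSRQ"}},
        {"dna": {"id": "B", "sequence": "GAATTCGC"}},
        {"ligand": {"id": "C", "ccdCodes": ["ADP"]}},
    ],
    "dialect": "alphafold3",
    "version": 1,
}

job_text = json.dumps(job, indent=2)

# Each entity needs its own chain id.
chain_ids = [next(iter(entry.values()))["id"] for entry in job["sequences"]]
```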
What is AlphaFold pLDDT
Chatgpt doesn’t give this
The predicted local distance difference test (pLDDT):
- Gives a per-residue confidence score of 0-100 for the local structure
Categories:
Dark blue:
- expected to be accurate (>90; backbone and side chain atoms are typically predicted well)
Light blue:
- expected to have a good backbone prediction (>70; backbone atoms are well predicted but some side chains are misplaced)
Yellow:
- low confidence (50-70)
Orange:
- very low confidence (<50); ribbon-like appearance, 3D coordinates should not be interpreted, correlated with disorder
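The colour categories above can be written as a small lookup. This is a sketch using the commonly quoted cutoffs of 90, 70 and 50; how a score that falls exactly on a cutoff is assigned is my assumption.

```python
def plddt_band(score):
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "very high (dark blue)"
    if score > 70:
        return "high (light blue)"
    if score > 50:
        return "low (yellow)"
    return "very low (orange)"


# Hypothetical scores for five residues.
toy_scores = [96.2, 83.0, 61.5, 38.9, 90.0]
bands = [plddt_band(s) for s in toy_scores]
```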
What is AlphaFold predicted aligned error (PAE)
Measures the confidence of the relative position of two residues in the predicted model
Explain the PAE plot
ASSESSING RELATIVE POSITION TO EACH OTHER
Y axis, aligned residue: # of the AA residue on which the predicted model is aligned to the true structure
X axis, scored residue: the residue in the model whose position error is being estimated
Colour at position x,y shows the expected position error of residue x when the predicted and true structures are aligned on residue y
Position error: darker means less error
dark diagonal line:
- if the position of x is correct in relation to y, the expected position error is low and dark green; along the diagonal x and y are the same residue, so the line is always dark
Blocks:
- blocks of dark green represent domains of the structure (residues whose relative positions are confidently predicted)
- blocks of light green are less well predicted, like linker regions or loops
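A toy illustration of the block pattern, assuming a made-up 8-residue PAE matrix with two domains of 4 residues each. The values (2 Å within a domain, 20 Å between domains) are invented for the example.

```python
def mean_pae(pae, rows, cols):
    """Average expected position error over a rectangular block of the PAE matrix."""
    values = [pae[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


# Hypothetical PAE matrix: residues 0-3 form domain A, residues 4-7 domain B.
# Low error inside each domain, high error between the two domains.
toy_pae = [
    [2.0 if (i < 4) == (j < 4) else 20.0 for j in range(8)]
    for i in range(8)
]

domain_a = range(0, 4)
domain_b = range(4, 8)

within_a = mean_pae(toy_pae, domain_a, domain_a)  # dark block on the plot
within_b = mean_pae(toy_pae, domain_b, domain_b)  # dark block on the plot
between = mean_pae(toy_pae, domain_a, domain_b)   # light off-diagonal block
```

A low within-domain mean together with a high between-domain mean says each domain is predicted confidently but their relative orientation is not.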
What are the limitations of AlphaFold models
Only the protein is modelled, there is no information on the environment (like a membrane):
- unless you specifically include these things (like covalent modifications of proteins, DNA, RNA, ligands like ADP or NAD, ions)
- it does DNA well, but not RNA
The information it gives is based only on the data in the PDB, and the PDB has:
- only static structures
- no disordered proteins
- includes both good and poor geometry
- AF3 is biased toward conformations that it has already seen (won’t tell us something new)
How can an AF model be used as a hypothesis for crystallization
Want a homogeneous protein; don't want disordered regions that change conformation in the protein, since they also might not pack into the crystal
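One way to act on this is to trim low-confidence termini before designing a construct. A minimal sketch, assuming per-residue pLDDT values are already available as a list and using 50 as an illustrative cutoff; the scores below are invented.

```python
def trim_disordered_termini(plddt, cutoff=50.0):
    """Return a 0-based half-open (start, end) range that drops
    low-confidence residues from both termini only."""
    start = 0
    while start < len(plddt) and plddt[start] < cutoff:
        start += 1
    end = len(plddt)
    while end > start and plddt[end - 1] < cutoff:
        end -= 1
    return start, end


# Hypothetical scores: floppy N and C termini, one low-confidence internal loop.
toy_plddt = [32, 41, 45, 78, 91, 95, 93, 88, 47, 90, 85, 44, 30]
start, end = trim_disordered_termini(toy_plddt)
```

The internal loop (score 47) is kept on purpose: only the termini are trimmed, because removing an internal segment changes the fold rather than just the ends.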
You can also crystallize individual domains:
- if you don't know the orientation of one domain in the protein in relation to the other
- use the PAE to select domain boundaries and make more than one truncation at each end of the domain
- but don't cut where interactions are occurring (like an H bond with the backbone at a certain residue)
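"More than one truncation at each end" can be enumerated as a small grid of constructs around a chosen boundary. A sketch, assuming the domain boundary (residues 11-50) was already picked from the PAE plot; the sequence, boundary and offsets are all hypothetical.

```python
from itertools import product


def truncation_series(sequence, start, end, offsets=(-5, 0, 5)):
    """Build constructs around a domain boundary.

    start and end are 1-based, inclusive residue numbers.
    Returns {(start, end): subsequence} for every offset combination.
    """
    constructs = {}
    for d_start, d_end in product(offsets, repeat=2):
        s = max(1, start + d_start)
        e = min(len(sequence), end + d_end)
        if s < e:
            constructs[(s, e)] = sequence[s - 1:e]
    return constructs


# Hypothetical 60-residue sequence and a domain at residues 11-50.
toy_sequence = "ACDEFGHIKLMNPQRSTVWY" * 3
series = truncation_series(toy_sequence, 11, 50)
```

This only generates candidates. It does not check whether a cut lands on an interacting residue, so each boundary still needs to be inspected in the model.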
Predict the structure with PTMs and ligands:
- could help decide if you should add ligands or PTMs for the crystallization
What are affinity tags in crystallization
Tag added to the protein for purification, usually at the N or C terminus, connected by a linker
Ex.
His6
Shorter tags are left on the protein, longer tags are cleaved off via a protease cleavage site in the construct (ex. TEV)
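A sketch of a tagged construct at the sequence level: His6, a short linker, a TEV site, then the target. The TEV recognition sequence ENLYFQG is cut between Q and G; the linker, the target sequence and the helper names are hypothetical.

```python
HIS6 = "HHHHHH"
TEV_SITE = "ENLYFQG"  # TEV protease cuts between Q and G


def add_n_terminal_tag(protein, tag=HIS6, linker="GS", site=TEV_SITE):
    """Assemble Met + tag + linker + cleavage site + target protein."""
    return "M" + tag + linker + site + protein


def tev_cleave(construct, site=TEV_SITE):
    """Split the construct at the first TEV site.

    Returns (tag fragment, released protein); the released protein
    keeps the G that follows the cut.
    """
    index = construct.find(site)
    if index == -1:
        raise ValueError("no TEV site found in construct")
    cut = index + len(site) - 1
    return construct[:cut], construct[cut:]


target = "SKVLTPEQ"  # hypothetical target sequence
construct = add_n_terminal_tag(target)
tag_fragment, released = tev_cleave(construct)
```

The released protein starts with one extra residue (G) left behind by the cleavage site, which is worth remembering when numbering the crystallized construct.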
Example of nucleic acids helping crystallization
They bound ssDNA to the protein; the ssDNA was self-complementary at the 5' end, which allowed the crystal to form more easily by making a packing region (allows the proteins to co-crystallize in one unit cell)
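A quick check for whether the 5' end of a candidate oligo is self-complementary, meaning it equals its own reverse complement and can base pair with a second copy of itself. The oligo below is made up, not the one from the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def self_complementary_5prime(dna, length):
    """True if the first `length` bases at the 5' end can pair with
    the same stretch on a second copy of the strand."""
    head = dna.upper()[:length]
    return len(head) == length and head == reverse_complement(head)


oligo = "GCGCTTTTTTTT"  # hypothetical ssDNA: GCGC at the 5' end, then poly-T
pairs_with_itself = self_complementary_5prime(oligo, 4)
```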