Which steps in X-ray crystallography are in real space and which are in reciprocal space?
Reciprocal space:
- data collection
- intensities and phases
- improved phases
- reciprocal-space error removal
- calculated phases
- original or previous phases
Real space:
- electron density map
- improved map
- molecular model
- improved model
- real-space error removal
What is the ideal flow for X-ray crystallography?
Data collection, then obtaining intensities and phases
Then calculating the electron density map, then building the molecular model
In this ideal case the phases would be perfect and the model would fit the electron density map perfectly
What happens to the phases before a model is built?
The electron density map is turned into an improved electron density map
The phases are improved using information that is known about protein crystals, so that the electron density map is better
These phases don’t have to be perfect
What two things do we make use of to help phase improvement?
Solvent flattening:
- solvent occupies certain parts of the crystal’s unit cell, surrounding the protein
- electron density in the solvent region is lower than in the protein region (the solvent is disordered, so its density averages out to a low, featureless level; the protein is more ordered)
- basically, the noise in the solvent region is flattened out to enhance the protein electron density
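The flattening step can be sketched on a toy grid. This is a minimal illustration, assuming the protein/solvent mask is already known; the function name flatten_solvent and the 1D list standing in for a 3D map are made up for the example, not taken from any real program.

```python
def flatten_solvent(density, protein_mask):
    """Return a copy of the map with every solvent point set to the mean solvent density."""
    solvent_values = [rho for rho, is_protein in zip(density, protein_mask) if not is_protein]
    if not solvent_values:
        raise ValueError("mask contains no solvent points")
    solvent_mean = sum(solvent_values) / len(solvent_values)
    # Protein points are kept as they are; solvent points lose their noise.
    return [rho if is_protein else solvent_mean
            for rho, is_protein in zip(density, protein_mask)]


# Toy 1D "map": noisy low density in the solvent, higher density in the protein.
density = [0.1, -0.2, 0.3, 1.5, 2.0, 1.8, 0.0, 0.2]
protein_mask = [False, False, False, True, True, True, False, False]
flattened = flatten_solvent(density, protein_mask)
```

In a real phase-improvement cycle the flattened map would then be Fourier transformed to give new calculated phases.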
NCS (non-crystallographic symmetry):
- more than one copy of the polypeptide chain in the asymmetric unit of the crystal, and the copies look similar to each other
- their electron densities are similar because they are related by non-crystallographic symmetry
- e.g. a 3-fold axis that is a local symmetry axis, where the symmetry doesn’t apply to the whole crystal. Average the equivalent electron densities related by that local symmetry axis. The noise is randomly distributed, so averaging gives a better estimate of the true density
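Why averaging helps can be shown with a small simulation. This is a sketch under simplifying assumptions: the three copies are already superimposed point for point, and the noise in each copy is independent Gaussian noise. The names average_ncs_copies and rms_error are made up for the example.

```python
import random


def average_ncs_copies(copies):
    """Average equivalent grid points across NCS-related copies of the density."""
    n_copies = len(copies)
    return [sum(values) / n_copies for values in zip(*copies)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimated map and the true map."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


random.seed(0)
true_density = [random.uniform(0.0, 2.0) for _ in range(2000)]
# Three copies related by a hypothetical local 3-fold axis, each with its own noise.
copies = [[rho + random.gauss(0.0, 0.5) for rho in true_density] for _ in range(3)]

averaged = average_ncs_copies(copies)
single_error = rms_error(copies[0], true_density)
averaged_error = rms_error(averaged, true_density)
print(f"one copy: {single_error:.3f}   averaged: {averaged_error:.3f}")
```

For independent noise the error of the average falls by roughly the square root of the number of copies, so three copies give an improvement of about 1.7 times.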
What type of process is phase improvement?
An iterative process (done gradually), so the electron density improves with each cycle
After you get the improved map, what happens?
A Fourier transform (FT) of the improved map gives the calculated phases, which are then combined with the original phase estimates
The resulting improved phases are used to calculate a new electron density map
What should you consider when building the model?
Polypeptide backbone:
- this shows up as continuous electron density
Side chains:
- every 3.8 Å there should be electron density that branches out, and the C alpha atoms are at these branching points (the best amino acids for fitting are Trp and Phe)
Skeletonize:
- computer programs look for a path through the electron density that follows the backbone of the protein
- the backbone is fitted along this skeleton
Carbonyl groups:
- if the resolution is around 2.8 Å, there are bumps in the electron density that represent carbonyl groups
- these show the direction of the chain (N to C or C to N)
Chain direction:
- can come from the carbonyl bumps
- also from the sequence of the protein (already known; Phe Phe Ala gives big, big, small electron density) and the way the side chains in alpha helices hang (
What is the refinement step?
After fitting the model, we want to get the improved model
The refinement is done in reciprocal space, but the output (the improved model) is in real space
What is crystallographic refinement?
What are the parameters that are being refined?
Refinement minimizes the differences between the observed amplitudes (from the data) and the calculated amplitudes (from our model)
For each atom j:
- the atomic position (x, y, z)
- the temperature factor/B-factor (it varies along a residue, e.g. the end of a lysine side chain can flop around more the further it is from the backbone)
- the occupancy (G), always 1 except for atoms in a ligand (the ligand didn’t soak into every protein in the crystal) or atoms in alternate conformations (in two different places at once)
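The per-atom parameters and the amplitude comparison can be written out as formulas. This is a sketch, assuming the standard structure-factor expression and a simple least-squares target (many refinement programs use a maximum-likelihood target instead). The symbols f_j (atomic scattering factor), w_hkl (weight for each reflection) and Q (the quantity being minimized) are notation introduced here, not from the notes.

```latex
\begin{align}
F_{\mathrm{calc}}(hkl) &= \sum_{j} G_j \, f_j \,
    \exp\!\left(-B_j \frac{\sin^2\theta}{\lambda^2}\right)
    \exp\!\left[\, 2\pi i \,(h x_j + k y_j + l z_j) \right] \\
Q &= \sum_{hkl} w_{hkl}
    \left( \left|F_{\mathrm{obs}}(hkl)\right| - \left|F_{\mathrm{calc}}(hkl)\right| \right)^2
\end{align}
```

Each atom j contributes its position (x_j, y_j, z_j), its B-factor B_j and its occupancy G_j to the calculated amplitude, and refinement adjusts these to make Q as small as possible.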
What determines how much unique data can be measured?
What’s so good about more data, and what is the rule of thumb?
The resolution of the diffraction data (how far the crystal diffracts) and the volume of the unit cell
The more data, the better the refinement behaves. For example, fitting a line to only a few data points is harder
The rule of thumb is to have 10 observations for each refined parameter (atomic position, occupancy, etc.)
Explain the assumption that we make about the content of the crystal
These assumptions help us calculate the number of observations that can be measured at different resolutions for different sets of parameters
E.g. assume the crystal has 50% solvent content:
At 3.5 Å the ratio of the number of observations to the number of parameters (xj, yj, zj, Bj) is 0.5, but we wanted 10, so we never do as well as we want to
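The observation-to-parameter ratio can be estimated with a back-of-the-envelope calculation. This is a rough sketch, not a reproduction of how the figure above was derived: it counts the reciprocal lattice points inside the resolution sphere, and it assumes about 17 cubic ångströms of protein per non-hydrogen atom, which is an illustrative value chosen here. The function names are made up for the example.

```python
import math


def unique_reflections_per_volume(resolution):
    """Approximate unique reflections per cubic angstrom of asymmetric unit.

    Lattice points inside a sphere of radius 1/d number (4/3) * pi * V_cell / d**3.
    Friedel's law halves that, and crystal symmetry divides it by the number of
    asymmetric units in the cell, leaving (2/3) * pi / d**3 per unit volume.
    """
    return (2.0 / 3.0) * math.pi / resolution ** 3


def observations_per_parameter(resolution, solvent_fraction=0.5,
                               volume_per_atom=17.0, params_per_atom=4):
    """Ratio of unique reflections to refined parameters (x, y, z, B per atom).

    volume_per_atom is an assumed protein volume per non-hydrogen atom.
    """
    atoms_per_volume = (1.0 - solvent_fraction) / volume_per_atom
    return unique_reflections_per_volume(resolution) / (atoms_per_volume * params_per_atom)


for d in (3.5, 2.5, 1.5, 1.0):
    print(f"{d:.1f} A: {observations_per_parameter(d):.2f} observations per parameter")
```

With these assumptions the ratio at 3.5 Å comes out well below 1, of the same order as the 0.5 quoted above, and it grows as 1/d cubed as the resolution improves.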
What is the effect of resolution on electron density maps?
What is special about the carbon-carbon distance?
At lower resolutions the density is more blobby, and there is increased uncertainty in the positions of the atoms
The carbon-carbon distance is 1.5 Å: at resolutions lower than this we don’t resolve carbon atoms that are 1.5 Å apart (we don’t get separate electron density for each), so an atomic model doesn’t actually mean we see individual carbon-carbon bonds
What are geometric restraints?
Because small-molecule crystals diffract really well and to high resolution, they give us the expected values for bond lengths and bond angles, which are available in databases
These values are included as restraints, not constraints, when you refine other structures:
- we want each value to be close to the ideal value, but not fixed at it
For example:
- one geometric restraint for isoleucine is that the carbon-carbon distance should be close to 1.5 Å
- you get a standard deviation for that length (a range), which sets how much that value is weighted in the restraint
- a higher standard deviation means it’s weighted lower and is not as tightly restrained
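How the standard deviation sets the weight can be sketched as follows. This is a minimal example, assuming the common form of a restraint term in which the squared deviation is divided by the variance; the function name and the numbers are illustrative, not taken from a real restraint library.

```python
def bond_restraint_penalty(observed_length, ideal_length, sigma):
    """Weighted squared deviation of a bond length from its ideal value.

    The weight is 1 / sigma**2, so a larger standard deviation gives a looser restraint.
    """
    weight = 1.0 / sigma ** 2
    return weight * (observed_length - ideal_length) ** 2


# The same 0.06 A deviation from a 1.50 A ideal, under a tight and a loose restraint.
tight = bond_restraint_penalty(1.56, 1.50, sigma=0.02)
loose = bond_restraint_penalty(1.56, 1.50, sigma=0.04)
print(f"tight: {tight:.2f}   loose: {loose:.2f}")
```

Doubling the standard deviation cuts the penalty for the same deviation by a factor of four, which is what makes this a restraint rather than a constraint: the bond is pulled toward the ideal value without being fixed at it.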
What is the dihedral angle for a peptide bond?
Trans: 180°
Cis: 0°
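A dihedral angle can be computed from four atom positions. This is a sketch using the standard atan2 formula; for the peptide bond the four atoms would be C alpha, C, N and C alpha of the next residue. The coordinates below are flat toy values made up to give exactly cis and trans geometries, not real bond lengths.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p1, p2, p3, p4):
    """Dihedral angle in degrees about the p2-p3 bond, in the range -180 to 180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane of p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


ca_1, c_1, n_2 = (-0.5, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
trans_omega = dihedral_degrees(ca_1, c_1, n_2, (1.5, -1.0, 0.0))  # next C alpha on the opposite side
cis_omega = dihedral_degrees(ca_1, c_1, n_2, (1.5, 1.0, 0.0))     # next C alpha on the same side
```

The same function works for phi, psi and the side-chain chi angles, given the appropriate four atoms.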
In geometric restraints, what is the van der Waals repulsion term?
It makes sure atoms do not sit in the same electron density, by pushing non-bonded atoms away from each other
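One simple way to picture such a term is a penalty that switches on only when two non-bonded atoms get too close. This is an illustrative form chosen for the example, not the exact function used by any particular refinement program; the 3.4 Å minimum distance is an assumed value for a carbon-carbon contact.

```python
def repulsion_penalty(distance, min_distance, weight=1.0):
    """Zero for well-separated atoms, growing as the atoms overlap more."""
    if distance >= min_distance:
        return 0.0
    return weight * (min_distance - distance) ** 2


for d in (4.0, 3.4, 3.0, 2.5):
    print(f"{d:.1f} A apart: penalty {repulsion_penalty(d, min_distance=3.4):.2f}")
```

Because the penalty is zero beyond the minimum distance, the term only pushes clashing atoms apart and does nothing to atoms that are already far enough from each other.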
Where does refinement happen?
Between the molecular model and the improved model
Explain how data is removed from the refinement
There is the R factor and R free
R free is calculated from 10% of the data, randomly removed from the refinement/equations (at least 1000 reflections)
It’s used to monitor the refinement:
- expect the R factor to go down if the refinement is minimizing the difference in the equation and improving the model
- R free tells us whether we’re overfitting just to minimize the differences
- e.g. if too many water molecules are added (overfitting), the R factor would go down but R free wouldn’t
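The R factor and the free-set split can be sketched as below. This is a minimal example, assuming the usual definition R = sum of | |Fobs| - |Fcalc| | divided by the sum of |Fobs|, with both sets of amplitudes already on the same scale. The function names, the seed and the toy amplitudes are made up for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly set aside a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = int(round(free_fraction * n_reflections))
    free = set(indices[:n_free])
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Toy amplitudes: the differences are 10, 5, 0 and 5 out of a total of 280.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 60.0, 45.0]
r_value = r_factor(f_obs, f_calc)

work_indices, free_indices = split_free_set(20000)
```

The R factor would be computed from the work reflections and R free from the free reflections only. Because the free set is never used in refinement, a drop in the R factor without a matching drop in R free points to overfitting.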
How do you judge the model quality once it is refined and you have the improved model?
PROCHECK
WHAT_CHECK
MolProbity
What factors into judging the model’s quality?
The IUPAC naming conventions for the atoms:
- e.g. in Asp, O delta 1 and O delta 2
Geometry:
- bond angles, bond lengths, planarity (Phe), chirality
Dihedral angles:
- phi and psi from the Ramachandran plot
- torsion angles of the side chains (chi 1, 2, 3, 4)
- omega (peptide bond)
Non-bonded interactions:
- have to have suitable distances
What are dihedral angles especially useful for?
What does a good-quality Ramachandran plot look like?
What does it look like for pre-proline?
They are good indicators of model quality because they are generally not restrained
Most residues are in the most favourable (red) and allowed (yellow) regions because they are correctly modelled
More restricted for proline because of the ring in the backbone, which restrains the angle about the N-C alpha bond
More restricted angles for pre-proline (the residue before proline) because it is attached to the N of the proline in the backbone, so there is more restriction