What are the 4 main structure-based methods?
What is docking?
Given a target site, the software takes each virtual compound → explores its possible conformations (docking poses) within the target site → identifies the poses with the best predicted binding → scores and ranks them using a mathematical evaluation (scoring function) of the predicted free energy change upon binding (docking score)
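The "score and rank" step can be illustrated with a minimal sketch. It assumes each pose already has a docking score, that a more negative score means better predicted binding (a common convention, but not universal), and that the compound names and score values are invented purely for illustration.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    compound: str
    pose_id: int
    score: float  # hypothetical docking score; more negative = better (assumption)


def rank_compounds(poses):
    """Keep the best-scoring pose of each compound, then rank the compounds."""
    best = {}
    for pose in poses:
        current = best.get(pose.compound)
        if current is None or pose.score < current.score:
            best[pose.compound] = pose
    # Sort ascending: the most negative (best) score comes first.
    return sorted(best.values(), key=lambda p: p.score)


# Invented example data: three compounds, several poses each.
poses = [
    Pose("cmpd_A", 1, -7.2),
    Pose("cmpd_A", 2, -8.1),
    Pose("cmpd_B", 1, -6.5),
    Pose("cmpd_C", 1, -9.0),
    Pose("cmpd_C", 2, -8.4),
]
ranked = rank_compounds(poses)
for p in ranked:
    print(p.compound, p.pose_id, p.score)
```

In a real virtual screening the scores would come from the docking software's own scoring function; only the ranking logic is shown here.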
What is Molecular Dynamics (MD)?
Mainly for hit optimisation; allows the system to “relax” over time → takes the flexibility of the system into account and includes solvent (water) molecules
What is De Novo?
Either the user or the computer specifies a starting atom or chemical group (seed) in the active/target site: new atoms or fragments are then added randomly onto the seed, to “fill” the site
What are the different methods to determine the 3D structures of proteins (targets)? (4)
o X-ray crystallography
o NMR spectroscopy
o Cryo-EM
o Homology modelling
–> Enable techniques such as: site analysis, molecular docking (NB virtual screening!), molecular dynamics, de novo drug design
What are the steps in X-Ray crystallography?
Different steps are required for target 3D-structure determination:
1-obtain the protein (extract it from cells or let bacteria produce it for you)
2-crystallise the protein (obtain the crystal of your protein in the lab)
3-diffraction by X-rays (use an X-ray machine)
4-mathematical solution of the X-ray diffraction pattern using a computer, giving the final 3D structure
What are the limitations of X-Ray crystallography?
Resolution of X-ray data (for info):
> 3.5 Å → cannot determine the position of the main chain (peptide backbone of the target)
2.5 Å to 3.5 Å → α-helices and β-sheets are resolved, but the positions of loops are uncertain and solvent molecules are not seen
2.0 Å to 2.5 Å → orientations of amino acid side chains are resolved, and solvent molecules can be seen
< 2.0 Å → conformations of side chains and solvent molecules are resolved
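The resolution bands above can be written as a small lookup, sketched below. The function name is hypothetical, and the handling of the exact boundary values (2.0, 2.5 and 3.5 Å) is an assumption, since the notes list the boundaries in both neighbouring bands.

```python
def describe_resolution(resolution_angstrom: float) -> str:
    """Return what can be resolved at a given X-ray resolution (in Å).

    Boundary values are assigned to the coarser (less detailed) band;
    this choice is an assumption, not something stated in the notes.
    """
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive number of Å")
    if resolution_angstrom > 3.5:
        return "main chain position cannot be determined"
    if resolution_angstrom >= 2.5:
        return "helices and sheets resolved; loops uncertain; solvent not seen"
    if resolution_angstrom >= 2.0:
        return "side chain orientations resolved; solvent can be seen"
    return "side chain conformations and solvent molecules resolved"


for value in (4.0, 3.0, 2.2, 1.5):
    print(value, "Å:", describe_resolution(value))
```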
What is NMR Spectroscopy?
What are the limitations of NMR Spectroscopy?
What is Cryo-EM: cryogenic electron microscopy?
Freezes biomolecule samples into a glassy state and probes them with beams of electrons→ generate a 3D-representation.
Stringing thousands of these snapshots together into stop-action movies and virtual reality flythroughs, we can watch biology in action.
The Protein Data Bank (Brookhaven Protein Database)
What does it contain?
What is Homology modelling?
When is it used?
Used when: the primary sequence of a protein (its amino acids) of unknown structure shows good similarity to the sequence of a homologous protein (a protein that has evolved from a common ancestor) of known structure
= A 3D model of the target protein is created based on existing structural data: reliable models are possible because structure is more conserved than amino acid sequence
Generating a homology model is a multi-step process:
What are the 5 steps?
1-template selection
2-sequence alignment
3-building the model
4-optimisation
5-validation
Homology modelling: template selection
Homology modelling works best for similar classes of proteins. Using a non-homologous template is possible, but the resulting model might not be very accurate:
the closest homologue available should be chosen
Homology: evaluated in terms of percentage primary sequence identity between target and template:
= At least 25% sequence identity is needed; 50% identity is recommended for a good model
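As a rough sketch of how the identity thresholds could be applied, the code below computes percentage identity from two already-aligned sequences and compares it with the 25% and 50% cut-offs from the notes. The function names and example sequences are hypothetical, and counting identity over gap-free columns only is an assumption (different tools use different denominators).

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def template_suitability(identity: float) -> str:
    """Apply the 25% / 50% thresholds given in the notes."""
    if identity >= 50.0:
        return "recommended for a good model"
    if identity >= 25.0:
        return "usable, but the model may be less reliable"
    return "below the minimum identity needed"


# Invented aligned sequences ("-" marks a gap).
target = "MKT-AYIAKQR"
template = "MKTLAYLA-QR"
identity = percent_identity(target, template)
print(round(identity, 1), "% identity:", template_suitability(identity))
```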
Homology modelling: sequence alignment
Sequences of reference structures (templates) are aligned to that of the target unknown structure. Free software/servers available for this step: arguably the most important one!
Structurally conserved regions (SCRs) are identified and used to start building the model…
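To give a feel for what the alignment step does, here is a minimal sketch of a global pairwise alignment (Needleman-Wunsch). The scoring values (match +1, mismatch -1, gap -2) and the sequences are illustrative assumptions; alignment servers normally use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment of two sequences; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Invented sequences: the template lacks one residue of the target.
aligned_target, aligned_template, total = needleman_wunsch("MKTAYIAK", "MKTYIAK")
print(aligned_target)
print(aligned_template)
print("score:", total)
```

Stretches of the alignment without gaps and with many identical residues are the natural candidates for structurally conserved regions.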
Homology modelling: build the model
Sidechains need to be optimised, i.e. exploring the rotamers…
Homology modelling: optimising and validating the model
The model can be energy minimised, if required, and is then subjected to a series of geometry checks to spot macroscopic errors (bond lengths and angles, steric clashes, etc.)
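One of the geometry checks mentioned above, the search for steric clashes, could be sketched as follows. The 2.0 Å cut-off, the atom names and the coordinates are all illustrative assumptions; real validation tools use more detailed criteria.

```python
import math
from itertools import combinations


def find_clashes(atoms, bonded, min_distance=2.0):
    """Return non-bonded atom pairs closer than min_distance (in Å).

    atoms  : list of (name, (x, y, z)) tuples
    bonded : set of (i, j) index pairs with i < j that are covalently bonded
    """
    clashes = []
    for (i, (name_i, xyz_i)), (j, (name_j, xyz_j)) in combinations(enumerate(atoms), 2):
        if (i, j) in bonded:
            continue  # bonded atoms are expected to be close together
        distance = math.dist(xyz_i, xyz_j)
        if distance < min_distance:
            clashes.append((name_i, name_j, round(distance, 2)))
    return clashes


# Invented coordinates: a water oxygen placed too close to a carbon atom.
atoms = [
    ("N", (0.0, 0.0, 0.0)),
    ("CA", (1.46, 0.0, 0.0)),
    ("C", (2.0, 1.4, 0.0)),
    ("OW", (3.0, 2.4, 0.0)),
]
bonded = {(0, 1), (1, 2)}
clashes = find_clashes(atoms, bonded)
print(clashes)
```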
De Novo drug design
What is the limitation?
Does not consider the protein flexibility.
How are AI methods improving CADD? (6)
1-Prediction of protein folding: 3D structure prediction from primary sequence
(AlphaFold; D-I-TASSER)
2-Prediction of protein-protein interactions
3-More reliable VS and QSAR methods
4- Prediction of ADME/T properties
5-Drug repurposing
6-De novo drug design: generation of new chemical entities (NCEs) with desirable properties, including synthetic feasibility
You need to find a new inhibitor for the viral protein X.
Unfortunately, no crystal structure of protein X is available, but
the crystal structure of Xa, a homologous protein from a different
virus, is available. The only issue is that Xa has been
crystallised WITHOUT a natural substrate bound (but you know its
binding site and its structure)
How would you use CADD in this project?
1) Build a homology model for CHIKV nsP2 (your X) using VEEV NS2 (Xa)
as template
2) The model needs to be refined: as the natural ligand was not present in the
template, manually add it to your model
3) Refine the model using Molecular Dynamics
4) Use the binding site to run a docking-based virtual screening of a library of virtual
compounds
5) Select the molecule with the best predicted binding
6) Test it in an antiviral assay