W5 Structure-based methods (CADD 2) Flashcards

Question 1

Q

What are the 4 main structure-based methods?

Answer

A

Docking
Grid-based methods
Molecular Dynamics (MD)
De novo

Question 2

Q

What is docking?

Answer

A

Given a target site, the software takes each virtual compound → explores its possible conformations and orientations (docking poses) within the target site → identifies the poses with the best predicted binding → scores and ranks them using a mathematical evaluation (scoring function) of the predicted free energy change upon binding (docking score)
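The score-and-rank step can be illustrated with a minimal sketch. It assumes the poses have already been generated and scored by a docking program, and that a more negative score means better predicted binding (a common convention, but not universal). The `Pose` class, the compound names and the score values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    compound: str   # hypothetical compound identifier
    pose_id: int    # index of the docking pose for that compound
    score: float    # assumed: predicted binding free energy, more negative = better


def rank_compounds(poses):
    """Keep the best-scoring pose per compound, then rank compounds by that score."""
    best = {}
    for pose in poses:
        current = best.get(pose.compound)
        if current is None or pose.score < current.score:
            best[pose.compound] = pose
    # Ascending sort: most negative (best predicted binding) first
    return sorted(best.values(), key=lambda p: p.score)


# Made-up scores, for illustration only
poses = [
    Pose("cmpd_A", 1, -6.2),
    Pose("cmpd_A", 2, -7.9),
    Pose("cmpd_B", 1, -9.1),
    Pose("cmpd_B", 2, -5.0),
    Pose("cmpd_C", 1, -4.3),
]
ranked = rank_compounds(poses)
for p in ranked:
    print(p.compound, p.pose_id, p.score)
```

In a real virtual screening the ranked list would only be a starting point: top-ranked poses are usually inspected visually before compounds are selected.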

Question 3

Q

What is Molecular Dynamics (MD)?

Answer

A

Mainly used for hit optimisation; allows the system to “relax” over time → takes the flexibility of the system into account and includes solvent (water) molecules

Question 4

Q

What is De Novo?

Answer

A

Either the user or the computer specifies a starting atom or chemical group (seed) in the active/target site: new atoms or fragments are then added randomly onto the seed to “fill” the site

Question 5

Q

What are the different methods to determine the 3D structures of proteins (targets)? (4)

Answer

A

o X-ray crystallography
o NMR spectroscopy
o Cryo-EM
o Homology modelling

→ These enable techniques such as: site analysis, molecular docking (NB virtual screening!), molecular dynamics, de novo drug design

Question 6

Q

What are the steps in X-Ray crystallography?

Answer

A

Different steps are required for target 3D-structure determination:
1-obtain the protein (extract it from cells or have bacteria produce it for you)
2-crystallise the protein (obtain a crystal of your protein in the lab)
3-diffraction by X-rays (use an X-ray machine)
4-mathematical solution of the X-ray diffraction pattern using a computer, giving the final 3D structure

Question 7

Q

What are the limitations of X-Ray crystallography?

Answer

A

requires milligram quantities of protein (a large amount!)
gives the structure in the “solid state” (whilst most proteins are in solution!)
requires a few months to solve the structure
generally, less successful for receptors

Question 8

Q

Resolutions of X-Ray data: (for info)

Answer

A

> 3.5Å → cannot determine the position of the main chain (peptide backbone of the target)
2.5Å to 3.5Å → α-helices and β-sheets are resolved, but the positions of loops are uncertain and solvent molecules are not seen
2.0Å to 2.5Å → orientations of amino acid side chains are resolved, and solvent molecules can be seen
< 2.0Å → conformations of side chains and solvent molecules are resolved
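As a memory aid, the resolution bands can be written as a small lookup function. This is only a sketch of the card's rule of thumb: the function name is hypothetical, and how the boundary values (exactly 2.0, 2.5 or 3.5 Å) are assigned is an assumption, since the card does not say.

```python
def interpret_resolution(resolution_angstrom):
    """Return what can typically be seen at a given X-ray resolution (in Å)."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive number of angstroms")
    if resolution_angstrom > 3.5:
        return "main chain position cannot be determined"
    if resolution_angstrom >= 2.5:  # assumption: 2.5 Å falls in the coarser band
        return "helices and sheets resolved; loops uncertain; no solvent seen"
    if resolution_angstrom >= 2.0:  # assumption: 2.0 Å falls in the coarser band
        return "side chain orientations resolved; solvent can be seen"
    return "side chain conformations and solvent molecules resolved"


for value in (4.0, 3.0, 2.2, 1.5):
    print(value, "->", interpret_resolution(value))
```

A lower number means a better resolution, so for docking a structure below about 2.5 Å is generally preferable when one is available.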

Question 9

Q

What is NMR Spectroscopy?

Answer

A

Uses chemical shifts, coupling constants and interatomic interactions to determine the structure of the protein
Uses nuclear Overhauser effects (NOE) to calculate how close protons are in space
Solves structures in solution: closer to a physiological situation!

Question 10

Q

What are the limitations of NMR Spectroscopy?

Answer

A

requires milligram quantities of protein (a large amount!)
only applicable to relatively small proteins
takes a long time to solve the structure, as many NMR experiments have to be interpreted

Question 11

Q

What is Cryo-EM: cryogenic electron microscopy?

Answer

A

Freezes biomolecule samples into a glassy state and probes them with beams of electrons→ generate a 3D-representation.

Stringing thousands of these snapshots together into stop-action movies and virtual reality flythroughs, we can watch biology in action.

Question 12

Q

The Protein Data Bank (Brookhaven Protein Database)
What does it contain?

Answer

A

Contains X-ray, NMR and Cryo-EM structures of proteins and other biomacromolecules (>150000 structures).
In this archive, each catalogued structure is assigned an identifying label (“PDB ID”).

Question 13

Q

What is Homology modelling?
When is it used?

Answer

A

Determining the structure of a protein by X-ray crystallography or NMR spectroscopy is time-consuming → relatively few structures have been solved
Gene sequencing is very fast: many protein sequences have been established
Homology modelling: in silico methodology that allows the structure of a protein to be predicted from its gene sequence (i.e. the encoded amino acid sequence).

Used when: the primary sequence (amino acid sequence) of a protein of unknown structure shows good similarity to the sequence of a homologous protein of known structure (homologous proteins have evolved from a common ancestor)

→ A 3D model of the target protein is created based on existing structural data: it is possible to get reliable models because structure is more conserved than amino acid sequence

Question 14

Q

Generating a homology model is a multi-step process:
What are the 5 steps?

Answer

A

1-template selection
2-sequence alignment
3-building the model
4-optimisation
5-validation

Question 15

Q

Homology modelling: template selection

Answer

A

Homology modelling works best for similar classes of proteins. Using a non-homologous template is possible, but the resulting model might not be very accurate:
the closest homologue available should be chosen

Homology: evaluated in terms of percentage primary sequence identity between target and template:

→ At least 25% sequence identity is needed; 50% identity is recommended for a good model
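A minimal sketch of how percentage identity could be computed from a pairwise alignment and compared with the 25% and 50% thresholds. It assumes the two sequences are already aligned (same length, gaps written as "-") and counts identical residues over the columns where neither sequence has a gap; other conventions for the denominator exist. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percentage of identical residues over aligned (non-gap) columns."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def template_suitability(identity):
    """Apply the thresholds from the card: 25% needed, 50% recommended."""
    if identity >= 50.0:
        return "recommended"
    if identity >= 25.0:
        return "usable"
    return "too low"


# Made-up aligned fragments, for illustration only
target = "MKT-AYIAKQR"
template = "MKTLAYLAK-R"
identity = percent_identity(target, template)
print(round(identity, 1), template_suitability(identity))
```

Real targets and templates are full-length sequences aligned with dedicated software; the short fragments here only show the arithmetic.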

Question 16

Q

Homology modelling: sequence alignment

Answer

A

Sequences of reference structures (templates) are aligned to that of the target unknown structure. Free software/servers are available for this step, which is arguably the most important one!

Structurally conserved regions (SCRs) are identified and used to start building the model…

Question 17

Q

Homology modelling: build the model

Answer

A

For structurally conserved regions (SCRs), the software transfers the corresponding coordinates of the template backbone and sidechains directly into the model.
For structurally variable regions (SVRs), 3D-backbone coordinates are kept from the template.

Sidechains need to be optimised, i.e. exploring the rotamers…

Available software also allows missing loops to be built

Question 18

Q

Homology modelling: optimising and validating the model

Answer

A

The model can be energy minimised, if required, and is then subjected to a series of geometry checks to spot macroscopic errors (bond lengths and angles, steric clashes, etc.)
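One of these geometry checks, the steric clash check, can be sketched as follows. This is a toy example, not how any particular validation tool works: it flags every pair of non-bonded atoms closer than a cut-off distance. The 2.0 Å cut-off, the coordinates and the bonded pair are all hypothetical values chosen for illustration.

```python
import math
from itertools import combinations


def find_clashes(coords, bonded=frozenset(), min_distance=2.0):
    """Return (i, j, distance) for non-bonded atom pairs closer than min_distance (Å)."""
    clashes = []
    for i, j in combinations(range(len(coords)), 2):
        if (i, j) in bonded:
            continue  # covalently bonded atoms are expected to be close
        distance = math.dist(coords[i], coords[j])
        if distance < min_distance:
            clashes.append((i, j, round(distance, 2)))
    return clashes


# Made-up atom coordinates in Å; atoms 0 and 1 are treated as bonded
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.0, 0.0), (6.0, 0.0, 0.0)]
bonded = {(0, 1)}
clashes = find_clashes(coords, bonded)
print(clashes)
```

Bonded pairs are given with the lower atom index first, matching the order produced by `combinations`.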

Question 19

Q

De Novo drug design
What is the limitation?

Answer

A

Does not consider the protein flexibility.

Question 20

Q

How are AI methods improving CADD? (6)

Answer

A

1-Prediction of protein folding: 3D structure prediction from primary sequence (AlphaFold; D-I-TASSER)
2-Prediction of protein-protein interactions
3-More reliable VS and QSAR methods
4-Prediction of ADME/T properties
5-Drug repurposing
6-De novo drug design: generation of NCEs (new chemical entities) with desirable properties, including synthetic feasibility

Question 21

Q

You need to find a new inhibitor for the viral protein X.
Unfortunately, no crystal structure is available for protein X, but a crystal structure is available for Xa, a homologous protein from a different virus. The only issue is that Xa was crystallised WITHOUT a natural substrate bound (but you know its binding site and its structure).
How would you use CADD in this project?

Answer

A

1) Build a homology model for CHIKV nsP2 (your X) using VEEV nsP2 (Xa) as the template
2) The model needs to be refined: as the natural ligand was not present in the template, manually add it to your model
3) Refine the model with Molecular Dynamics
4) Use the binding site to run a docking-based virtual screening of a library of virtual compounds
5) Select the molecules with the best predicted binding
6) Test them in an antiviral assay


(21 cards)