Protein Structural Bioinformatics Flashcards

Question 1

Q

SCOP

Answer

A

SCOP
Structural Classification of Proteins
The first classification software

Question 2

Q

How can proteins be classified?

Answer

A

By secondary structure:
α-helical - Secondary structure exclusively or almost exclusively α-helical

β-sheet - Secondary structure exclusively or almost exclusively β-sheet

α+β - α-helices and β-sheets separated in different parts of the molecule, absence of β-α-β supersecondary structure

α/β - Helices and sheets assembled from β-α-β units

α/β-linear - Line through centres of strands of sheet roughly linear

α/β-barrels - Line through centres of strands of sheet roughly circular

Question 3

Q

The SCOP database

Answer

A

Structural classes:
- Folds
- Superfamilies
- Families

Example (class: Small proteins):
- Fold: Cystine-knot cytokines
- Superfamily: Cystine-knot cytokines
- Family: Transforming growth factor beta

Question 4

Q

What is used to define domains in proteins?

Answer

A

The Gō plot

Calculate the radius of the spherical volume of the protein
Calculate the distance from each alpha carbon of each amino acid to all the others
If the distance is greater than the spherical radius, score “+”
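
The three steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the radius is estimated here as the largest distance from the centroid to any alpha carbon, which is an assumption (the card does not say how the spherical radius is obtained).

```python
import math

def estimate_radius(ca_coords):
    """ASSUMPTION: radius = largest centroid-to-alpha-carbon distance."""
    n = len(ca_coords)
    centroid = tuple(sum(c[k] for c in ca_coords) / n for k in range(3))
    return max(math.dist(centroid, c) for c in ca_coords)

def go_plot(ca_coords, radius):
    """Return one string per residue: '+' where the alpha-carbon to
    alpha-carbon distance exceeds the radius, '.' otherwise."""
    rows = []
    for a in ca_coords:
        rows.append("".join("+" if math.dist(a, b) > radius else "."
                            for b in ca_coords))
    return rows

# Toy coordinates (made up): two tight clusters of two residues each.
coords = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0)]
plot = go_plot(coords, estimate_radius(coords))
for row in plot:
    print(row)
```

Residues within the same cluster score "." against each other and "+" against the other cluster, so blocks of "+" off the diagonal hint at where one domain ends and another begins.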

Question 5

Q

Disadvantages of the Gō method

Answer

A

Requires solved structure
Domain boundaries not always clear
Gō method now superseded by sequence-based algorithms

Question 6

Q

How Pfam builds domains

Answer

A

Start with a high quality protein structure (X-ray crystallography, good resolution, low Å)
BLAST PDB to find related protein structures
Align these – maximise structural homology (meaning adjust alignment so that boundaries of secondary structural elements match)
Build a statistical profile (Hidden Markov Model – HMM) of the “seed” alignment

Question 7

Q

How Pfam builds domains (continued)

Answer

A

Use the HMM to query GenPept – hmmsearch
Align the new hits to the HMM – hmmalign
Rebuild the HMM to include the new hits – hmmbuild
Repeat as desired, or until there are no new hits
“Structure, structure, structure” (Alex Bateman, founder of Pfam)
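
The search, align, rebuild loop can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a simple per-column log-odds profile stands in for a real HMM, the sequences are invented, ungapped and of equal length, and the function names are hypothetical. It does not call HMMER.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Stand-in for hmmbuild: per-column log-odds against a uniform background."""
    background = 1.0 / len(ALPHABET)
    total = len(alignment) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({aa: math.log((counts[aa] + pseudocount) / total / background)
                        for aa in ALPHABET})
    return profile

def score(profile, seq):
    """Stand-in for hmmsearch scoring: sum of column log-odds."""
    return sum(column[aa] for column, aa in zip(profile, seq))

def iterate(seed, database, threshold, max_rounds=10):
    """Rebuild the profile each round until no new hits pass the threshold."""
    members = list(seed)
    for _ in range(max_rounds):
        profile = build_profile(members)
        new_hits = [s for s in database
                    if s not in members
                    and len(s) == len(profile)
                    and score(profile, s) >= threshold]
        if not new_hits:
            break  # converged: no new hits
        members.extend(new_hits)
    return members

seed = ["ACDE", "ACDF"]                 # hypothetical seed alignment
database = ["AGGE", "ACGE", "WWWW"]     # hypothetical sequence database
family = iterate(seed, database, threshold=2.0)
print(family)
```

In this toy run, ACGE passes the threshold in the first round, and only after the profile is rebuilt with it does AGGE pass in the second round. WWWW never passes. Because each round's profile depends on the hits accepted in the previous round, the quality of the seed alignment carries through every iteration.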

Question 8

Q

Disadvantages of Pfam

Answer

A

Each domain is defined by a HMM, and that HMM is only as good as the “seed” alignment used to construct it

Because the HMM building process is iterative, errors can be magnified

There are now so many domains in Pfam that curation is uneven

Pfam was designed to support the Human Genome Project, and viruses were under-represented

Question 9

Q

HOMSTRAD meaning

Answer

A

HOMologous STRucture Alignment Database

Question 10

Q

Not enough structures

Answer

A

Structure determination is far harder than sequencing

Illumina and MinION sequencing have made sequencing ultra-high-throughput

There is no equivalent technological leap forward for structural biology

The Structural Genomics Consortium
