SCOP
Structural Classification of Proteins
One of the first protein structure classification databases
How can proteins be classified?
By secondary structure:
α-helical - Secondary structure exclusively or almost exclusively α-helical
β-sheet - Secondary structure exclusively or almost exclusively β-sheet
α+β - α-helices and β-sheets separated in different parts of the molecule; no β-α-β supersecondary structure
α/β - Helices and sheets assembled from β-α-β units
α/β-linear - Line through centres of strands of sheet roughly linear
α/β-barrels - Line through centres of strands of sheet roughly circular
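The class definitions above can be turned into a rough rule of thumb. The sketch below is only an illustration: it assumes a per-residue secondary-structure string in which 'H' marks helix and 'E' marks strand (a DSSP-like convention), and both the 0.15 cutoff and the function names are hypothetical choices, not published values. A string cannot tell parallel from antiparallel strands, so the strand-helix-strand test is only a crude stand-in for detecting true β-α-β units.

```python
from itertools import groupby


def segment_order(ss: str) -> str:
    """Collapse runs of residues into one letter per helix/strand segment.

    "EEEE-HHHH-EEEE" becomes "EHE"; coil characters are dropped after grouping,
    so two helices separated by coil remain two segments.
    """
    return "".join(key for key, _ in groupby(ss) if key in "HE")


def classify_secondary_structure(ss: str, threshold: float = 0.15) -> str:
    """Assign a coarse structural class from a secondary-structure string.

    threshold is a hypothetical minimum fraction of residues needed before
    helix or strand content is counted as present.
    """
    if not ss:
        raise ValueError("empty secondary-structure string")

    helix_fraction = ss.count("H") / len(ss)
    strand_fraction = ss.count("E") / len(ss)
    has_helix = helix_fraction >= threshold
    has_strand = strand_fraction >= threshold

    if has_helix and not has_strand:
        return "alpha"
    if has_strand and not has_helix:
        return "beta"
    if has_helix and has_strand:
        # Heuristic: strand-helix-strand alternation suggests beta-alpha-beta
        # units (alpha/beta); segregated helices and strands suggest alpha+beta.
        if "EHE" in segment_order(ss):
            return "alpha/beta"
        return "alpha+beta"
    return "unclassified"


print(classify_secondary_structure("EEEE-HHHH-EEEE-HHHH-EEEE"))
print(classify_secondary_structure("HHHHHH-HHHHHH--EEEE-EEEE-EEEE"))
```

Telling α/β-linear from α/β-barrels would need 3D coordinates of the strands, which this string-based sketch does not attempt.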
The SCOP database
Structural classes:
- Folds
- Superfamilies
- Families
Small proteins (class):
- Cystine-knot cytokines (fold)
- Cystine-knot cytokines (superfamily)
- Transforming growth factor beta (family)
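The hierarchy can be pictured as a simple record with one field per level. The sketch below assumes the example above is read as class, then fold, then superfamily, then family; the `ScopEntry` type and its field names are hypothetical and are not part of any SCOP file format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopEntry:
    """One position in the class > fold > superfamily > family hierarchy."""

    scop_class: str
    fold: str
    superfamily: str
    family: str

    def lineage(self) -> str:
        """Return the levels from most general to most specific."""
        return " > ".join(
            [self.scop_class, self.fold, self.superfamily, self.family]
        )


# The example from the notes: fold and superfamily happen to share a name.
tgf_beta = ScopEntry(
    scop_class="Small proteins",
    fold="Cystine-knot cytokines",
    superfamily="Cystine-knot cytokines",
    family="Transforming growth factor beta",
)

print(tgf_beta.lineage())
```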
What is used to define domains in proteins?
The Gō plot
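A Gō plot is usually described as a residue-by-residue map of Cα-Cα distances, in which compact domains show up as blocks of short distances along the diagonal. The sketch below works under that assumption. The function names, the toy coordinates, and the 8 Å contact cutoff are illustrative choices, not values taken from these notes.

```python
import math


def distance_map(ca_coords):
    """Return the symmetric matrix of pairwise C-alpha distances (in angstroms)."""
    n = len(ca_coords)
    return [
        [math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
        for i in range(n)
    ]


def contact_map(ca_coords, cutoff=8.0):
    """Threshold the distance map; cutoff is a hypothetical contact distance."""
    return [[d <= cutoff for d in row] for row in distance_map(ca_coords)]


# Toy coordinates: two residues close together, one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
distances = distance_map(coords)
contacts = contact_map(coords)

print(distances[0][1])
print(contacts[0][2])
```

In a real structure, the coordinates would come from the Cα atoms of a PDB entry, and domain boundaries would be read off where the near-diagonal blocks of contacts end.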
Disadvantages of the Gō method
How Pfam builds domains
Disadvantages of Pfam
Each domain is defined by an HMM, and that HMM is only as good as the “seed” alignment used to construct it
Because the HMM building process is iterative, errors can be magnified
There are now so many domains in Pfam that curation is uneven
Pfam was designed to support the Human Genome Project, and viruses were under-represented
HOMSTRAD meaning
HOMologous STRucture Alignment Database
Not enough structures
Structure determination is far harder than sequencing
Illumina and MinION sequencing have made sequencing ultra-high-throughput
There is no equivalent technological leap forward for structural biology
The Structural Genomics Consortium