7: Structure-Sequence Relationship Flashcards

(41 cards)

1
Q

How does protein sequence affect protein structure and function?

A

Protein sequence determines protein structure, which in turn is linked to protein function.

2
Q

What are the key concepts in the sequence-structure relationship? (1 + 3)

A

Protein sequence determines structure
Unique fold: single lowest energy state
Stable fold: resistant to small environmental changes
Accessible fold: the stable fold can be reached under normal conditions

3
Q

Unique fold tests (3)

A

A denatured protein refolds into the same fold.
Different conditions result in the same fold.
> The fold is not dependent on the environment.
Structure determination with various methods under various conditions gives the same fold.

4
Q

Unique fold limitations (3)

A

Multi-domain proteins are hard to refold in vitro.
Beta-amyloids: proteins can unfold and aggregate into these instead.
1% of proteins switch fold in response to stimuli.

5
Q

What is the energy landscape of protein folding like?

A

It resembles a rugged-surface funnel with shallow local minima.

6
Q

Is energy-consuming assistance required to reach accessible folds?

A

No, it should not be required to reach the lowest energy state.

However, chaperones (with or without energy consumption) can increase folding efficiency by helping the protein avoid local minima.

7
Q

What is the correlation between sequence identity and protein folds?

A

Proteins with >20% sequence identity generally share the same fold.

This follows from the stability of folds against mutations.
Exceptions exist: in some cases a single amino-acid change can result in a fold change.

8
Q

True or false: protein oligomerisation (4), multi-domain structures, and enzyme specificity are less correlated to sequence than principal domain folds.

A

True.
Oligomer and multi-domain interfaces typically involve only a small fraction of all residues
= less dependent on the overall sequence

Enzyme specificity can be determined by a single amino acid
= a single mutation can change it

Only at the domain level is the fold tightly linked to the sequence.

9
Q

What factors are less correlated to sequence than domain folds?

A

Protein oligomerization and enzyme specificity.

10
Q

What physicochemical parameters can be predicted from single sequences?

A

Isoelectric point (pI) and extinction coefficient.

Both are calculated from the amino-acid composition, essentially as a weighted sum over all residues.
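
As a rough illustration of the "weighted sum" idea, here is a minimal sketch for the extinction coefficient at 280 nm. The per-residue values (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1) are commonly used literature values and are not taken from these cards; the function name and the `disulfides` parameter are hypothetical.

```python
# Assumed contributions at 280 nm in M^-1 cm^-1 (commonly used values,
# not from the flashcards): Trp 5500, Tyr 1490, cystine (disulfide) 125.
EPS_280 = {"W": 5500, "Y": 1490}
EPS_CYSTINE = 125


def extinction_coefficient(seq: str, disulfides: int = 0) -> int:
    """Weighted sum over residues; residues without an entry contribute 0."""
    seq = seq.upper()
    residue_part = sum(EPS_280.get(aa, 0) for aa in seq)
    return residue_part + EPS_CYSTINE * disulfides


print(extinction_coefficient("ACWYC", disulfides=1))
```

The number of disulfide bonds cannot be read from the sequence alone, which is why it is passed in explicitly here.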

11
Q

What other predictions can be made from single sequences?

A

Linear motifs: short sequence patterns within proteins that act as functional signals, e.g. a kinase recognition site.

Secondary structure: based on the tendency of amino acids to occur in specific secondary structures.
About 60% accurate.
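
Linear motifs are often written as regular-expression-like patterns, so a motif scan can be sketched in a few lines. The pattern below (two basic residues, any residue, then Ser or Thr) is a simplified, illustrative consensus for a basophilic kinase site and is an assumption, not a pattern given in these cards; `find_motif_sites` is a hypothetical helper.

```python
import re

# Illustrative pattern only: [RK][RK] x [ST].
# The lookahead lets overlapping matches be reported as well.
MOTIF = re.compile(r"(?=([RK][RK].[ST]))")


def find_motif_sites(seq: str):
    """Return (1-based start position, matched segment) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


print(find_motif_sites("MARRASLKKGT"))
```

A pattern this short will match by chance in many proteins, so hits are candidates rather than confirmed functional sites.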

12
Q

Sequences known vs structures known

A

About 1000x more sequences are known than structures.

Using sequence data and evolutionary relationships is therefore important for analysing sequence-structure relationships.

13
Q

Sequence similarity vs sequence identity w/ example

A

Sequence similarity: often measured by physicochemical properties.
e.g. 30% similar with respect to amino-acid hydrophobicity.

Sequence identity: measured by exact matches in the amino-acid sequence.
e.g. 60% identical.
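
The difference between the two measures can be sketched on a pair of already aligned sequences. The hydrophobic residue set and the choice to skip columns containing a gap are assumptions made for this illustration; other definitions are in use.

```python
# Assumed, simplified hydrophobic class (not from the flashcards).
HYDROPHOBIC = set("AVILMFWC")


def _compared_columns(a: str, b: str):
    """Aligned column pairs in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]


def percent_identity(a: str, b: str) -> float:
    cols = _compared_columns(a, b)
    if not cols:
        return 0.0
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def percent_hydrophobicity_similarity(a: str, b: str) -> float:
    """Columns count as similar when both residues fall in the same class."""
    cols = _compared_columns(a, b)
    if not cols:
        return 0.0
    same = sum((x in HYDROPHOBIC) == (y in HYDROPHOBIC) for x, y in cols)
    return 100.0 * same / len(cols)


print(percent_identity("AVKD", "ALKE"))
print(percent_hydrophobicity_similarity("AVKD", "ALKE"))
```

In this toy pair only half of the positions are identical, yet every position keeps its hydrophobicity class, which is why similarity can exceed identity.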

14
Q

What are the types of gene relationships in evolution?

A

Homolog genes: from common ancestor
Ortholog genes: separated by speciation
Paralog genes: from duplication events within species
Analog genes: similar functions, different origins

15
Q

Evolutionary events:

A

Indels (insertions/deletions).
Point mutations: missense (= amino-acid change), nonsense (= stop codon), silent (= no change).
Rearrangements: including indels, inversions, duplications, etc.

16
Q

What is the purpose of sequence alignments?

A

To match sequences by adjusting for evolutionary events.
> by introducing gaps that represent indels.
The optimal arrangement is defined by residue-pair scoring and an overall alignment score.

17
Q

Identity scoring matrix pro/con

A

+ easy to compute.
- doesn't account for amino-acid similarity

18
Q

Physicochemical scoring matrix pro/con

A

+ good for structure/function analysis.
- inferior to other methods

19
Q

What is the Needleman-Wunsch algorithm used for?

A

Calculating pairwise GLOBAL sequence alignment.
Includes gap penalties.
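
A minimal sketch of Needleman-Wunsch global alignment, assuming an identity scoring scheme (match +1, mismatch -1) and a linear gap penalty of -1. These parameter values are illustrative defaults, not values given in the cards.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # leading gaps are penalised in global mode
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback from the bottom-right corner to the top-left corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the optimal score, this traceback returns just one of them (it prefers the diagonal move).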

20
Q

What is the Smith-Waterman algorithm used for?

A

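Calculating pairwise LOCAL sequence alignment.
Traceback starts from the highest-scoring cell and works back until a cell with score 0 is reached.
Negative cell scores are reset to 0, so unaligned flanking regions carry no penalty (gaps inside the local alignment are still penalised).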
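
A minimal sketch of Smith-Waterman local alignment. The scoring values (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration. Compared with a global alignment, cell scores are floored at 0 and the traceback starts at the maximum cell.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment; returns (score, aligned_a_segment, aligned_b_segment)."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option lets a new local alignment start anywhere.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Traceback from the highest-scoring cell until a 0 cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```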

21
Q

Pairwise sequence alignments, pros cons

A

+ mathematically defined optimal solution
- not feasible for multiple sequences (computing time increases exponentially with the number of sequences)

22
Q

What is the purpose of heuristic methods in sequence alignment?

A

To perform faster multiple sequence alignments

23
Q

What heuristic method is established for faster multiple sequence alignments?

A

CLUSTAL

Calculate pairwise alignments
Build a phylogenetic tree
Perform progressive alignment based on the tree

24
Q

How are large sets of sequences represented for progressive alignments/CLUSTAL?

A

As Profiles
As linear Hidden Markov Models (HMMs)
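
To make "profile" concrete, here is a minimal sketch that turns a set of aligned sequences into per-column residue frequencies. Treating the gap character as an ordinary symbol and using raw frequencies without pseudocounts are simplifying assumptions; `build_profile` is a hypothetical helper.

```python
from collections import Counter


def build_profile(aligned):
    """Return one {symbol: frequency} dict per alignment column."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*aligned):  # iterate column by column
        counts = Counter(column)
        total = len(column)
        profile.append({sym: c / total for sym, c in counts.items()})
    return profile


for col in build_profile(["ACD", "ACE", "AGD", "A-D"]):
    print(col)
```

Conserved columns show a single dominant symbol, while variable columns spread the frequency over several symbols.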

25
Q

What does a Markov chain describe? What is a characteristic of a linear Hidden Markov Model (HMM)?

A

Markov chain: a sequence of possible events in which each event depends on the preceding one.
Linear HMM: the model is walked through in one direction only.

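The dependency between consecutive events can be sketched as a product of transition probabilities along a path. The state names and probabilities below are invented for illustration; the transitions only lead forward (or loop in place), which mimics the one-directional walk of a linear model.

```python
def path_probability(path, start, trans):
    """Probability of a state path in a first-order Markov chain."""
    if not path:
        return 0.0
    p = start.get(path[0], 0.0)
    for prev, nxt in zip(path, path[1:]):
        # Each step depends only on the previous state.
        p *= trans.get(prev, {}).get(nxt, 0.0)
    return p


# Hypothetical left-to-right model: match state M1, insert state I1, match state M2.
START = {"M1": 1.0}
TRANS = {
    "M1": {"M2": 0.9, "I1": 0.1},
    "I1": {"I1": 0.5, "M2": 0.5},
    "M2": {},
}

print(path_probability(["M1", "M2"], START, TRANS))
print(path_probability(["M1", "I1", "I1", "M2"], START, TRANS))
```
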
26
Q

What advantages do profiles and HMMs provide?

A

They represent the information in sets of aligned sequences.
They improve predictions based on multiple alignments.

27
Q

What are the benefits of using HMM alignments combined with known structural data?

A

Features extracted from large sets of proteins are far more informative than a single sequence. Combined with known structural data, this gives:
Improved secondary structure prediction
Prediction of disordered regions
Prediction of domain boundaries
Prediction of membrane protein topology

28
Q

What are the steps in homology modeling?

A

Identify homologous template structures
Align the target sequence based on the template structure
Re-model insertions and deletions
Refine and validate the model
> used to predict the entire structure

29
Q

At what percentage of sequence identity is homology modeling typically trustworthy?

A

Greater than 20% = high-quality models.
> Non-homologous regions, e.g. loops and side chains, are harder to model.

30
Q

What is threading, and what is the risk associated with threading for fold identification?

A

Threading: fold identification when no template with >20% sequence identity is available.
Risk: a globally wrong fold may be identified.

31
Q

What is co-evolutionary analysis, and what does it reveal about amino acids?

A

Co-evolutionary analysis: correlation of mutations in pairs of amino acids = if X->B, how likely is Y->C?
It gives direct information on protein fold and secondary structure.

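One simple way to quantify how strongly two alignment columns vary together is mutual information, sketched below. This is a stand-in chosen for illustration, not necessarily the statistic used in the course; dedicated co-evolution methods go further and try to separate direct from indirect correlations.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (x, y), c in count_ij.items():
        p_xy = c / n
        p_x = count_i[x] / n
        p_y = count_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Toy columns: each string holds one residue per aligned sequence.
print(mutual_information("AAVV", "LLFF"))  # columns change together
print(mutual_information("AAVV", "LFLF"))  # columns vary independently
```
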
32
Q

Why not analyse more than pairs in co-evolutionary analysis?

A

Higher-order correlations add little, because second-order correlations between pairs are sufficient.

33
Q

Why is there correlation between pairs of amino acids?

A

Often because amino acids that are far apart in sequence are in direct contact in the 3D structure.

34
Q

What is required for statistical significance in co-evolutionary analysis?

A

A large number of aligned sequences

35
Q

What does the output of co-evolutionary analysis provide for modeling 3D structure?

A

Potential residue contacts
Visualization of secondary structural elements

36
Q

What is the role of DeepMind's AlphaFold in protein structure prediction? What is its output?

A

Predicts protein structure using deep learning
Trained on structures from the PDB
Output: inter-residue distances and torsional angles

37
Q

How does AlphaFold 2 differ from its predecessor? What is its output?

A

Uses a different neural network architecture
Final output: 3D structure with iterative refinement

38
Q

AlphaFold-Multimer

A

Predicting oligomeric complexes (2+ proteins) has a high error rate.

39
Q

What are the properties of predictions made by AlphaFold? (not super important)

A

Accuracy decreases with fewer than 30 sequences in the MSA
Further improvements in MSA depth lead to only small gains
The accuracy of pairwise residue distances is predicted
Handles missing physical context
Trained to produce the protein structure most likely to appear as part of a PDB structure
Using whole sets is computationally challenging

40
Q

What are the principles, accuracy, and limitations of homology modeling and threading?

A

Homology modeling: trustworthy at >20% identity
Threading: higher risk of incorrect fold identification

41
Q

What are the requirements and advantages of co-evolutionary analysis?

A

Requires a large number of aligned sequences
Provides insights into protein structure and interactions