7: Structure-Sequence Relationship Flashcards by sana z

How does protein sequence affect protein structure and function?

The amino acid sequence of a protein determines its structure, which in turn is linked to its function.

What are the key concepts in the sequence-structure relationship?
1 principle + 3 fold properties

Protein sequence determines structure
Unique fold: single lowest energy state
Stable fold: resistant to small environmental changes
Accessible fold: can reach stable fold normally

Unique fold tests (3)

A denatured protein refolds into the same fold.
Different conditions result in the same fold.
> not dependent on the environment
Structure determination with various methods under various conditions results in the same fold.

Unique fold limitations (3)

Multi-domain proteins are hard to refold in vitro.
Beta amyloids (proteins unfold and aggregate into these instead of refolding).
1% of proteins switch folds in response to stimuli.

What is the energy landscape of protein folding like?

It resembles a rugged-surface funnel with shallow local minima.

For accessible folds, is energy-consuming assistance required?

No, it should not be required to reach the lowest energy state.

However, chaperones can increase folding efficiency (with or without energy consumption) by helping the protein avoid local minima.

What is the correlation between sequence identity and protein folds?

Proteins with >20% sequence identity generally share the same fold.

This follows from the stability of folds against mutations.
Exceptions exist, though: in some cases a single amino acid change can result in a fold change.

True or false: protein oligomerisation (4), multi-domain structures, and enzyme specificity are less correlated with sequence than principal domain folds?

True.
Oligomer and multi-domain interfaces typically involve only a small fraction of all residues
= less dependent on the overall sequence

Enzyme specificity can be determined by a single amino acid
= a single mutation can change or destroy it

Only at the domain level is the fold linked to its sequence.

What factors are less correlated to sequence than domain folds?

Protein oligomerization and enzyme specificity.

What physicochemical parameters can be predicted from single sequences?

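Isoelectric point (pI) and extinction coefficient.

Both are calculated from the amino acid composition; the extinction coefficient, for example, is a weighted sum over all amino acids.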
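
As an illustration of the weighted-sum idea, here is a minimal sketch that estimates the molar extinction coefficient at 280 nm from a sequence. The per-residue values (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1) are commonly used literature values and are not part of these flashcards; the function name and the example sequence are hypothetical.

```python
from collections import Counter

# Commonly used contributions at 280 nm in M^-1 cm^-1 (assumption, not from the cards).
EPSILON_280 = {"W": 5500, "Y": 1490}
CYSTINE_280 = 125  # per disulfide bond (cystine), not per free cysteine


def extinction_coefficient(sequence, disulfides=0):
    """Weighted sum of absorbing residues; disulfide count must be supplied by the user."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] * eps for aa, eps in EPSILON_280.items())
    return total + disulfides * CYSTINE_280


example = "MKWVTFYSLLWY"  # hypothetical sequence with 2 Trp and 2 Tyr
print(extinction_coefficient(example))
```

The number of disulfide bonds cannot be read from the sequence alone, so it is passed in as a parameter here.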

What other predictions can be made from single sequences?

Linear motifs.
Short sequence patterns within proteins that act as functional signals, e.g. a kinase recognition site.

Secondary structure.
Based on the tendency of amino acids to occur in specific secondary structures.
60% accurate.
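
A linear motif search can be sketched with a regular expression. The pattern below (R-R-x-S/T, a basophilic kinase consensus) is only an illustrative assumption, as are the function name and the example sequence; real motif databases use curated patterns.

```python
import re

# Illustrative pattern: Arg-Arg-any-Ser/Thr. The lookahead lets overlapping hits be found.
MOTIF = re.compile(r"(?=(RR.[ST]))")


def find_motif(sequence):
    """Return (1-based start position, matched residues) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


seq = "MGRRASLLKRRTTPE"  # hypothetical sequence
print(find_motif(seq))
```

A hit only says that the pattern occurs; whether the site is functional depends on context that a single sequence does not show.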

Sequences known vs structures known

1000x more sequences are known than structures.

Using sequence data and evolutionary relationships is therefore important for analysing sequence-structure relationships.

Sequence similarity vs sequence identity, with example

Sequence similarity is often measured by a physicochemical property of the amino acids.

Sequence identity is measured on the amino acid sequence itself (exact matches).

e.g. "30% similar with respect to amino acid hydrophobicity"

vs "60% identical".
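
The difference between the two measures can be sketched for a pair of already aligned sequences. The hydrophobic residue set below is a simplified assumption, and the function name and example sequences are hypothetical.

```python
# Simplified hydrophobic class (assumption; other classifications exist).
HYDROPHOBIC = set("AVILMFWC")


def identity_and_similarity(a, b):
    """Fractions of identical and of hydrophobicity-matched positions, ignoring gap columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    identical = sum(x == y for x, y in pairs)
    # Similar here means: both residues hydrophobic, or both not hydrophobic.
    similar = sum((x in HYDROPHOBIC) == (y in HYDROPHOBIC) for x, y in pairs)
    return identical / len(pairs), similar / len(pairs)


print(identity_and_similarity("ALKVE", "ILRVD"))
```

With this definition every identical pair also counts as similar, so similarity is never lower than identity for the same alignment.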

What are the types of gene relationships in evolution?

Homolog genes: from common ancestor
Ortholog genes: separated by speciation
Paralog genes: from duplication events within species
Analog genes: similar functions, different origins

Evolutionary events:

Indels (insertions and deletions).
Point mutations.
= missense (amino acid change), nonsense (stop codon), silent (no amino acid change).
Rearrangements: including indels, inversions, duplications, etc.
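
The three point-mutation classes can be sketched by comparing the translation of a codon before and after a substitution. The table below is the standard genetic code in the usual T, C, A, G ordering; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Classify a codon change as silent, nonsense, or missense."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    # Any other change of the encoded residue; a lost stop codon also ends up here.
    return "missense"


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
```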

What is the purpose of sequence alignments?

To match sequences by accounting for evolutionary events.
> e.g. by introducing gaps for indels.
The optimal alignment is defined by residue-pair scores and the resulting overall score.

Identity scoring matrix pro/con

+ easy to compute.
- doesn't account for amino acid similarity.

Physiochemical scoring matrix pro/con

+ good for structure/function analysis.
- inferior to other methods.

What is the Needleman-Wunsch algorithm used for?

Calculating pairwise GLOBAL sequence alignments.
Includes gap penalties.
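
A minimal sketch of global alignment by dynamic programming, in the spirit of Needleman-Wunsch. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; real tools use substitution matrices and often affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised: the alignment must span both sequences end to end.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Traceback from the bottom-right corner to the top-left corner.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the optimal score, this traceback returns only one of them.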

What is the Smith-Waterman algorithm used for?

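Calculating pairwise LOCAL sequence alignments.
Traceback starts from the highest value in the matrix and works back until a score of 0 is reached.
Negative scores are reset to 0, so unaligned ends carry no penalty (gaps inside the aligned region are still penalised).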
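
A minimal sketch of local alignment in the spirit of Smith-Waterman. The scoring values (match +2, mismatch -1, gap -2) and the function name are assumptions for illustration. The two differences from a global alignment are the floor at 0 and the traceback starting at the highest-scoring cell.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                           # restart: drop a poorly scoring prefix
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Traceback from the highest value until a cell with score 0 is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTGATTACATTT", "CCGATTACACC"))
```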

Pairwise sequence alignments: pros and cons

+ mathematically defined optimal solution.
- not feasible for multiple sequences (computing time increases exponentially with the number of sequences).

What is the purpose of heuristic methods in sequence alignment?

To perform faster multiple sequence alignments

What heuristic method is established for faster multiple sequence alignments?

CLUSTAL

Calculate all pairwise alignments
Build a phylogenetic (guide) tree from the pairwise distances
Perform progressive alignment in the order given by the tree
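
The first two steps can be sketched as follows. This is not the CLUSTAL implementation: the pairwise distance is simply the fraction of differing positions between equal-length sequences (a stand-in for a real pairwise alignment), the tree is built by average-linkage clustering, and the progressive alignment step itself is left out. All names and sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions; stand-in for a pairwise alignment distance."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def guide_tree(seqs):
    """seqs: dict of name -> equal-length sequence. Returns the tree as nested tuples."""
    clusters = {name: [name] for name in seqs}  # tree node -> member sequence names

    def cluster_distance(c1, c2):
        # Average distance over all member pairs (average linkage).
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(p_distance(seqs[x], seqs[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters; the join order is the alignment order.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters[(c1, c2)] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters))


seqs = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGTACCA", "D": "TTGTACCT"}
print(guide_tree(seqs))
```

A progressive aligner would then align the most similar pair first and add sequences or sub-alignments in the order of the joins.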

How are large sets of sequences represented for progressive alignments/CLUSTAL?

As Profiles
As linear Hidden Markov Models (HMMs)

What does a Markov chain describe? What is a characteristic of a linear Hidden Markov Model (HMM)?

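Markov chain: a sequence of possible events in which the probability of each event depends on the state reached in the previous event.
Linear HMM: the model is walked through in one direction only.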

What advantages do profiles and HMMs provide?

They represent the information in sets of aligned sequences.
They improve predictions based on multiple alignments.
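
How a profile condenses a set of aligned sequences can be sketched with per-column residue frequencies. This is a bare frequency profile under the assumption of an ungapped, equal-length alignment; real profiles and HMMs add pseudocounts, sequence weighting, and gap handling. The names and sequences are hypothetical.

```python
from collections import Counter


def build_profile(alignment):
    """Return one {residue: frequency} dict per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        profile.append({res: n / len(column) for res, n in counts.items()})
    return profile


def consensus(profile):
    """Most frequent residue per column."""
    return "".join(max(col, key=col.get) for col in profile)


alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]  # hypothetical aligned sequences
profile = build_profile(alignment)
print(profile[1])
print(consensus(profile))
```

Conserved columns show a single dominant residue, while variable columns spread their frequency over several residues, which is information a single sequence cannot provide.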

What are the benefits of using HMM alignments combined with known structural data?

Features extracted from large sets of proteins are far more informative than a single sequence. Combined with known structural data, this gives:
Improved secondary structure prediction
Prediction of disordered regions
Prediction of domain boundaries
Prediction of membrane protein topology

What are the steps in homology modeling?

Identify homologous template structures
Align the target sequence based on the template structure
Re-model insertions and deletions
Refine and validate the model
> used to predict the entire structure

At what percentage of sequence identity is homology modeling typically trustworthy?

Greater than 20% sequence identity = high-quality models
> non-homologous regions, e.g. loops and side chains, are harder to model

What is threading, and what is the risk associated with threading for fold identification?

Threading = fold identification when no templates with >20% sequence identity are available.
Risk: it can result in a globally wrong fold identification.

What is co-evolutionary analysis, and what does it reveal about amino acids?

It measures the correlation of mutations in pairs of amino acids = if X mutates to B, how likely is it that Y mutates to C?
It gives direct information on protein fold and secondary structure.
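
Pairwise correlation between alignment columns can be sketched with mutual information. This is a simple stand-in chosen for illustration, not the method of any particular tool: practical co-evolution methods also try to separate direct from indirect correlations. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    joint = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Compare the observed pair frequency with the one expected under independence.
        mi += p_xy * log2(p_xy / ((col_i[x] / n) * (col_j[y] / n)))
    return mi


alignment = ["AKLE", "AKLD", "GELE", "GELD"]  # toy alignment
print(mutual_information(alignment, 0, 1))  # columns 0 and 1 change together
print(mutual_information(alignment, 0, 3))  # columns 0 and 3 vary independently
```

With only four sequences the estimate is statistically meaningless; it only shows the calculation.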

Why not analyse more than pairs in co-evolutionary analysis?

Higher-order correlations contribute little additional information; second-order correlations between pairs are sufficient.

Why is there correlation between pairs of amino acids?

Often because the two amino acids are in direct contact in the 3D structure, even though they are far apart in sequence.

What is required for statistical significance in co-evolutionary analysis?

A large number of aligned sequences

What does the output of Co-evolutionary analysis provide for modeling 3D structure?

Potential residue contacts + Visualization of secondary structural elements

What is the role of DeepMind's AlphaFold in protein structure prediction? What is its output?

Predicts protein structure using deep learning
Trained on structures from the PDB
Output: inter-residue distances and torsion angles

How does AlphaFold 2 differ from its predecessor? What is its output?

Uses a different neural network architecture
Final output: 3D structure with iterative refinement

AlphaFold-Multimer

Predicts oligomeric complexes (2+ proteins), but with a high error rate.

What are the properties of predictions made by AlphaFold? (not super important)

Accuracy decreases with fewer than 30 aligned sequences
Beyond that, improvements in MSA depth lead to small gains
The accuracy of pairwise residue distances is predicted
Handles missing physical context
Trained to produce the protein structure most likely to appear as part of a PDB structure
Using whole sets is computationally challenging

What are the principles, accuracy, and limitations of homology modeling and threading?

Homology modeling: trustworthy at >20% sequence identity
Threading: higher risk of incorrect fold identification

What are the requirements and advantages of co-evolutionary analysis?

Requires a large number of aligned sequences
Provides insights into protein structure and interactions

(41 cards)