Developments in Algorithms for Sequence Alignment: A Review Flashcards

(21 cards)

1
Q

a bioinformatics method for finding and quantifying similarities between two biological sequences, like DNA or protein, by inserting gaps to optimize a scoring function that penalizes gaps and mismatches while rewarding matches.

A

Pairwise sequence alignment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Aligns the entire length of two sequences (from start to end), even if parts don’t match well.

end-to-end alignment

A

Global alignment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Global alignment example

A

Needleman–Wunsch

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Finds and aligns only the best matching region(s) within the sequences — useful when only a segment is similar.

find and align the similar local region

A

local alignment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

can be used to reduce the time and space cost incurred by dynamic programming. For this, the most widely applied method is to limit the state transition and conduct the alignment in a smaller search space.

do not guarantee that there will be no poor results, ideal alignments can be achieved in many types of software because the sequences to be aligned are usually quite similar.

A

Heuristic alignment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

local alignment example

A

Smith–Waterman

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

alignment is the basis of multiple sequence alignment and mainly divided into local alignment and global alignment

A

Pairwise sequence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

has become the basic algorithm that is used in many types of multiple sequence alignment software.

A

Needleman–Wunsch

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Heuristic alignment example

A

BLAST

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

heuristic methods

homologous segments (or seeds) are found and used as the “anchors” for the alignment. Each anchor point can divide the dynamic programming matrix into four sub-matrices located at the four corners. When more anchor points distributed throughout the sequences are found, the scale of the dynamic programming matrix can be greatly reduced, thereby reducing the time and space complexity

A

divide and conquer.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Many alignment algorithms are designed to find such seeds to accelerate alignment, give 2 main ones

A

FASTA
BLAST

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

is based on a heuristic idea: if two sequences have close similarity, then the number of gaps inserted in the sequences during alignment will be relatively small.

Therefore, the possible backtracking paths will be close to the diagonal of the dynamic programming matrix for similar sequences. The possible interval can be seen as a strip with certain width parallel to the diagonal

A

Bounded dynamic programming

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

core of a pairwise sequence alignment is the sum of the

A

scores of all aligned pairs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Multiple Sequence Alignment

is a heuristic method that aligns multiple sequences by the consistencies among pairwise alignments of each of the sequences to be aligned and a center sequence, which could be either generated or selected from the input

A

Star Alignment Strategy

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Multiple Sequence Alignment

uses the star topology to simulate the relationship between sequences, and the selection of the center sequence determines the accuracy of the alignment to some extent.

A

Star Alignment Strategy

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Multiple Sequence Alignment

another heuristic method, which performs multiple sequence alignment based on a pre-built guide tree (Figure 2B). Each leaf node in the guide tree represents a sequence, and each internal node represents an alignment of the sequences corresponding to its descendant leaf nodes (also known as profile).

A

Progressive Alignment Strategy

17
Q

The star alignment and the progressive alignment strategies have two main defects.

A

once a wrong gap is inserted in the alignment, it will continue to exist and affect all the subsequent alignment.

Although this rule causes the gap error to spread, it does not create a baseless error, and the root of the error lies in another defect

Local optima usually arise from the greedy principle. Specifically, the optimum selected for a particular sub-problem is not necessarily optimal for the overall problem, and this will adversely affect the final alignment.

17
Q

Defects of Heuristic Algorithms and Countermeasures

Gap penalties are already considered in the pairwise sequence alignment. The matching score of the two residues obtained by re-estimation with the consistency objective function is the score after the gaps have been penalized. Therefore, gap penalties need not be considered in the subsequent multiple sequence alignment process.

A

Consistency Objective Function

17
Q

Defects of Heuristic Algorithms and Countermeasures

processes the results of multiple sequence alignment to remove errors caused by the local minimum trap and the “once a gap, always a gap” rule. There are several ways to perform the iterative refinement, two of which will be introduced in this section.

A

Iterative Refinement

17
Q

software uses different algorithms, and therefore, the alignment quality can vary.

A

Multiple sequence alignment

17
Q

Quality Estimation of Multiple Sequence Alignment Software

A

Classical metrics