An interdisciplinary field that combines biology, computer science, statistics, mathematics, and engineering to analyze and interpret biological data, particularly large datasets such as genomes or protein sequences.
Bioinformatics
It is a widely used text format for representing nucleotide or protein sequences.
FASTA
It consists of a header line starting with ‘>’, followed by the sequence data on subsequent lines.
FASTA
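The header-plus-sequence layout described in the FASTA cards can be read with a few lines of code. The following is a minimal sketch, not a full parser: the function name `parse_fasta` and the two example records are made up for illustration, and it assumes each header is unique.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header (without the leading '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A header line starts a new record.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence data may span several lines; join them.
            records[header] += line
    return records


# Hypothetical example data, not taken from any real database.
example = """>seq1 hypothetical example record
ACGTACGT
ACGT
>seq2 another made-up record
MKTAYIAK
"""
records = parse_fasta(example.splitlines())
```

Here `records` holds two entries, and the two sequence lines of the first record are joined into a single string.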
In sequence alignment, a ________ represents a position where one sequence has an insertion or deletion relative to another sequence.
Gap
________ are introduced into an alignment to optimize it and to account for evolutionary changes such as insertions and deletions.
Gaps
It is the sequence you submit to a search in order to find similar or matching sequences in a database.
Query sequence
It is the sequence you supply as the starting point of a search, against which the database entries are compared.
Query sequence
It is a sequence in a database against which the query sequence is compared during sequence alignment or a similarity search.
Subject sequence
It is a branching diagram that depicts the evolutionary relationships among a set of organisms, genes, or species.
Phylogenetic tree
It shows the inferred evolutionary history and relatedness based on genetic or sequence data.
Phylogenetic tree
It is a unique numerical identifier assigned to each sequence entry in the NCBI (National Center for Biotechnology Information) databases.
GI number
It provides a
stable and unique way to refer to a specific sequence
entry.
GI number
It is a unique identifier assigned to a sequence record in a public sequence database (such as GenBank, EMBL, or DDBJ).
Accession number
It typically consists of letters and numbers and is used to reference a specific sequence entry.
Accession number
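In GenBank-style records an accession is commonly written together with a version suffix, in the form `accession.version`. The sketch below shows one way to separate the two parts. The helper name `split_accession` and the identifier `AB123456.1` are hypothetical and used only for illustration; real databases use several accession layouts that this sketch does not validate.

```python
from typing import Optional, Tuple


def split_accession(identifier: str) -> Tuple[str, Optional[int]]:
    """Split 'ACCESSION.VERSION' into (accession, version).

    Returns (identifier, None) when no numeric version suffix is present.
    """
    text = identifier.strip()
    base, sep, version = text.partition(".")
    if sep and version.isdigit():
        return base, int(version)
    return text, None


# Hypothetical identifier, chosen only to show the letter-and-number layout.
accession, version = split_accession("AB123456.1")
```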
It involves identifying and labeling the features of a genome, such as genes, regulatory sequences, and other functional elements.
Genome annotation
This process helps in
understanding the biological significance of the DNA
sequence.
Genome annotation
In sequence alignment or similarity searches, it is a numerical value that quantifies the level
of similarity or quality of alignment between two
sequences.
Score
Higher scores generally indicate more significant similarity. (T or F)
TRUE
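To make the idea of an alignment score concrete, here is a minimal sketch that scores two sequences that have already been aligned, with `-` marking a gap. The function name `alignment_score` and the values match = +1, mismatch = -1, gap = -2 are assumptions chosen for illustration; real tools use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Sum column scores of two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # insertion or deletion in one sequence
        elif x == y:
            score += match      # identical residues
        else:
            score += mismatch   # substitution
    return score


# Four matches, one mismatch, one gap: 4*1 - 1 - 2 = 1
print(alignment_score("ACGT-A", "ACCTGA"))
```

Under this scheme an identical pair scores higher than a pair that needs a gap or a substitution, which is the sense in which higher scores indicate greater similarity.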
It is a statistical
measure that estimates the number of different
alignments with scores equivalent to or better than a
given score that would occur by chance in a database
search.
Expect value (E-value)
A ___________ indicates a more significant
match or similarity.
lower E-value
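The relationship between score and E-value can be sketched with the Karlin-Altschul form E = K * m * n * exp(-lambda * S), where m is the query length, n is the database length, and S is the raw alignment score. In the sketch below the function name `expect_value` and the default values of `k` and `lam` are placeholders for illustration; in practice these parameters depend on the scoring system and are reported by the search tool.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 k: float = 0.1, lam: float = 0.3) -> float:
    """Expected number of chance alignments scoring at least raw_score.

    k and lam are placeholder statistical parameters (assumptions).
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# A higher score gives a lower E-value for the same search space.
e_low_score = expect_value(40, query_len=100, db_len=1_000_000)
e_high_score = expect_value(60, query_len=100, db_len=1_000_000)
```

Because E grows in proportion to the database length, the same score is less significant in a larger database.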
A field that uses computers to store and analyze molecular biological information.
Bioinformatics
It is about finding and interpreting biological data
online
Bioinformatics
It is a field in which biology, mathematics, statistics, computer science, information technology, and the health sciences are merged into a single discipline to process biological data.
Bioinformatics