Bioinformatics Flashcards

(116 cards)

1
Q

An interdisciplinary field that combines biology, computer science, statistics, mathematics, and engineering to analyze and interpret biological data, particularly data from large datasets like genomes or protein sequences

A

Bioinformatics

2
Q

It is a widely used format for representing nucleotide or protein sequences.

A

FASTA

3
Q

It consists of a header line starting with ‘>’, followed by the sequence data on subsequent lines.

A

FASTA
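
The layout described in this card (a header line starting with '>' followed by sequence lines) can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_fasta` and the example records are made up for illustration, and real files may need more careful handling.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]      # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequence may span several lines
    return records


# Hypothetical records, not taken from any database.
example = """>seq1 hypothetical example record
ATGGCC
TTAA
>seq2 another made-up record
GGGCCC
"""
records = parse_fasta(example)
print(records)
```

Sequence lines that belong to the same header are joined, so a record split over many lines comes back as one string.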

4
Q

In sequence alignment, a ________ represents a position where one sequence has an insertion or deletion relative to another sequence.

A

Gap

5
Q

____________ are introduced to optimize alignment and account for evolutionary changes.

A

Gaps

7
Q

It is the sequence for which you are searching for similarities or matches within a database.

A

Query sequence

8
Q

It’s the sequence you supply as the input to the search.

A

Query sequence

9
Q

It is the sequence (or sequences) in a database against which the query sequence is compared during sequence alignment or similarity searches.

A

Subject sequence

10
Q

It is a branching diagram that depicts the evolutionary relationships among a set of organisms, genes, or species.

A

Phylogenetic tree

11
Q

It shows the inferred evolutionary history and relatedness based on genetic or sequence data.

A

Phylogenetic tree

12
Q

It is a unique numerical identifier assigned to each sequence entry in the NCBI (National Center for Biotechnology Information) databases.

A

GI number

13
Q

It provides a stable and unique way to refer to a specific sequence entry.

A

GI number

14
Q

It is a unique identifier assigned to a sequence record in a public sequence database (like GenBank, EMBL, or DDBJ).

A

Accession number

15
Q

They typically consist of letters and numbers and are used to reference specific sequence entries.

A

Accession number

16
Q

Involves identifying and labeling the features of a genome, such as genes, regulatory sequences, and other functional elements.

A

Genome annotation

17
Q

This process helps in understanding the biological significance of the DNA sequence.

A

Genome annotation

18
Q

In sequence alignment or similarity searches, it is a numerical value that quantifies the level of similarity or quality of alignment between two sequences.

A

Score

19
Q

Higher scores generally indicate more significant similarity. (T or F)

A

TRUE

20
Q

It is a statistical measure that estimates the number of different alignments with scores equivalent to or better than a given score that would occur by chance in a database search.

A

Expect value (E-value)

21
Q

A ___________ indicates a more significant match or similarity.

A

lower E-value
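
The link between score and E-value in the cards above can be sketched with the Karlin-Altschul relation E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length, and n the database length. The function name and the default `lam` and `k` values below are illustrative assumptions only; real search tools estimate these parameters from the scoring system and also use corrected (effective) lengths.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float = 0.267, k: float = 0.041) -> float:
    """Expected number of chance alignments scoring >= raw_score.

    lam and k are assumed, illustrative statistical parameters.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical search: 250-residue query against 1,000,000 database residues.
weak_hit = expect_value(30, 250, 1_000_000)
strong_hit = expect_value(60, 250, 1_000_000)
print(weak_hit, strong_hit)
```

Because the score sits in a negative exponent, raising the score lowers the E-value, which is why a higher score and a lower E-value both point to a more significant match.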

22
Q

A field that uses computers to store and analyze molecular biological information

A

BIOINFORMATICS

23
Q

It is about finding and interpreting biological data online

A

BIOINFORMATICS

24
Q

It is a field in which biology, mathematics, statistics, computer science, information technology, and other health sciences are merged into a single discipline to process biological data

A

BIOINFORMATICS

25
It uses complex machines to read biological data at a much faster rate than before.
BIOINFORMATICS
26
There is a marriage between biology and informatics. (T or F)
TRUE
27
The science of collecting and analyzing complex biological data
BIOINFORMATICS
28
Allows the storage and management of large biological data sets
THE CREATION OF DATABASES
29
Data is being generated at a much greater pace than its analysis (e.g. Human Genome Project)
THE CREATION OF DATABASES
30
These are repositories, like a bank of biological information, designed to collect, archive, visualize, and organize biological data.
Databases
31
They enable scientists to describe, interpret, and retrieve data intelligently.
Databases
32
A great deal of data has been generated, especially since the completion of the ________
Human Genome Project
33
When was the Human Genome Project launched?
1990
34
Objective of the Human Genome Project
To sequence the entire human genome which consists of about 3.2 billion base pairs.
35
It was completed in 2003; as a result, there is a large amount of data that has to be interpreted and analyzed.
Human Genome Project
36
Aside from the human genome, the genomes of many other organisms have been completely sequenced, so there is an enormous amount of data to be understood. That is why databases have been created. (T or F)
TRUE
37
PRINCIPAL COMPONENTS OF BIOINFORMATICS
● THE CREATION OF DATABASES ● THE DEVELOPMENT OF ALGORITHMS AND STATISTICS ● THE USE OF THESE TOOLS FOR THE ANALYSIS AND INTERPRETATION OF VARIOUS TYPES OF BIOLOGICAL DATA
38
Determine relationships among members of large data sets
THE DEVELOPMENT OF ALGORITHMS AND STATISTICS
39
A procedure by which a large data set is organized so that relationships can be determined is called an ________
Algorithm
40
Algorithms are applied in ________
Statistics
41
These data include DNA, RNA, and protein sequences, protein structures, gene expression profiles, and biochemical pathways
THE USE OF THESE TOOLS FOR THE ANALYSIS AND INTERPRETATION OF VARIOUS TYPES OF BIOLOGICAL DATA
42
Sciences that attempt to describe a living organism in terms of 'omics'
BRANCHES OF BIOINFORMATICS
43
BRANCHES OF BIOINFORMATICS
● Genomics ● Transcriptomics ● Proteomics ● Microbiomics ● Metabolomics
44
IDENTIFY THE BRANCH OF BIOINFORMATICS: Involves the description of the sequence of the entire genome of an organism.
Genomics
45
IDENTIFY THE BRANCH OF BIOINFORMATICS: The study of all RNA molecules in a living organism.
Transcriptomics
46
IDENTIFY THE BRANCH OF BIOINFORMATICS: The description of the entire complement of proteins in a living organism.
Proteomics
47
IDENTIFY THE BRANCH OF BIOINFORMATICS: Studies the sequences, 3D structures, and other properties of proteins.
Proteomics
48
IDENTIFY THE BRANCH OF BIOINFORMATICS: Covers the entire set of proteins found in a living organism.
Proteomics
49
IDENTIFY THE BRANCH OF BIOINFORMATICS: Pertains to microbes such as viruses, fungi, parasites, and bacteria.
Microbiomics
50
IDENTIFY THE BRANCH OF BIOINFORMATICS: The genomes of these microorganisms are described within a specific environmental niche.
Microbiomics
51
IDENTIFY THE BRANCH OF BIOINFORMATICS: Involves the description of the chemical processes involving metabolites.
Metabolomics
52
DNA/RNA BIOINFORMATICS APPLICATIONS
● Retrieving DNA sequences from databases ● Computing nucleotide compositions ● Identifying restriction sites ● Designing polymerase chain-reaction (PCR) primers ● Identifying open reading frames (ORFs). ● Predicting elements of DNA/RNA secondary structure ● Finding repeats ● Computing the optimal alignment between two or more DNA sequences ● Finding polymorphic sites in genes (single nucleotide polymorphisms, SNPs) ● Assembling sequence fragments
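
Two of the applications listed above, computing nucleotide composition and identifying restriction sites, can be sketched in plain Python. The function names and the DNA string are made up for illustration; the default site GAATTC is the EcoRI recognition sequence, and only the given strand is scanned.

```python
from collections import Counter
from typing import Dict, List


def nucleotide_composition(seq: str) -> Dict[str, int]:
    """Count A, C, G, and T in a DNA sequence."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    comp = nucleotide_composition(seq)
    total = sum(comp.values())
    return (comp["G"] + comp["C"]) / total if total else 0.0


def find_sites(seq: str, site: str = "GAATTC") -> List[int]:
    """0-based start positions of a recognition site on the given strand."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # allow overlapping hits
    return positions


dna = "ATGGAATTCGCGAATTCTAA"  # hypothetical sequence
print(nucleotide_composition(dna), gc_content(dna), find_sites(dna))
```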
53
Identifying open reading frames (ORFs): an open reading frame is a sequence that runs from the ________
Start codon to a stop codon
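
The idea of reading from a start codon to a stop codon can be sketched as a small ORF scanner. This is a simplified illustration with assumed names: it looks only at the three forward reading frames, uses ATG as the start codon and TAA, TAG, TGA as stop codons, and skips the reverse strand that real tools also check.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) for ATG...stop runs in the forward frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        orf = seq[i:j + 3]
                        if len(orf) // 3 >= min_codons:
                            orfs.append((i, j + 3, orf))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


dna = "CCATGGCTTGATAAATGAAATAG"  # hypothetical sequence
print(find_orfs(dna))
```

Positions are 0-based, and the end index is exclusive, so `dna[start:end]` gives back the ORF including its stop codon.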
54
WHY DO BIOINFORMATICS?
● It saves time when doing real experiments (e.g., designing primers) ● You might want to do a simulated experiment on a computer ('in silico') instead of in a real environment.
55
Bioinformatics is very convenient for a scientist because it serves to
Save time before a real experiment, since the experiment or research study may start with a simulation on a computer.
56
Simulated experiments done on a computer rather than in a real environment are described as “in silico.” For example, when you want to amplify a particular DNA fragment by PCR, you design primers using bioinformatics tools or software. (T or F)
TRUE
57
Once you have designed a primer, you can do the actual laboratory experiment, which is called the ____________
Wet lab
58
Where the primer would be optimized and eventually used in the amplification reaction.
Wet lab
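
Before a primer goes to the wet lab, a quick in silico check might look like the sketch below. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a rough estimate for short oligonucleotides. The primer sequence and function names are made up for illustration, and dedicated primer-design software uses more detailed models.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, e.g. to derive a reverse primer."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


primer = "ATGCGTACGTTAGC"  # hypothetical 14-mer
print(wallace_tm(primer), reverse_complement(primer))
```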
59
APPLICATIONS OF BIOINFORMATICS
● Sequence alignment and analysis ● Mapping and analyzing DNA, RNA, Protein, Amino Acid, and Lipid sequences ● Creation and visualization of 3-D structure models for biological molecules of significance, e.g., proteins ● Genome annotation ● Genetic diseases ● Designer Medicine
60
APPLICATIONS IN VARIOUS FIELDS
● Microbial genome applications ● Molecular medicine ● Personalized medicine ● Gene therapy ● Drug development ● Antibiotic resistance ● Evolutionary studies ● Waste cleanup ● Biotechnology ● Climate change studies ● Alternative energy sources ● Crop improvement ● Forensic analysis ● Bio-weapon creation ● Insect resistance ● Improve nutritional quality ● Veterinary science
61
The earliest databases for DNA sequences and proteins were developed by three groups of scientists from different parts of the world:
● Nucleic Acids (International Nucleotide Sequence Database) ● Protein (Worldwide Protein Data Bank)
62
IDENTIFY THE DATABASE: DDBJ (DNA Data Bank of Japan)
Nucleic Acids (International Nucleotide Sequence Database)
63
IDENTIFY THE DATABASE: EMBL (European Molecular Biology Laboratory)
Nucleic Acids (International Nucleotide Sequence Database)
65
IDENTIFY THE DATABASE: GenBank (USA)
Nucleic Acids (International Nucleotide Sequence Database)
66
IDENTIFY THE DATABASE: PDBj (Japan)
Protein (Worldwide Protein Data Bank)
67
IDENTIFY THE DATABASE: RCSB PDB (USA)
Protein (Worldwide Protein Data Bank)
68
DNA Data Bank of Japan
DDBJ
69
Other databases
● Ensembl ● Human metabolome Database (HMDB) ● Gene Expression Databases - Mostly Microarray data ● Phenotypic Databases ● RNA Databases ● Amino Acid/Protein Databases ● Protein-Protein and other Molecular interactions ● Signal Transduction Pathway Databases ● Metabolic Pathway and Protein Function Databases ● Bacterial DNA Databases
70
Database that provides data on the genome of characteristic organisms
Ensembl
71
Very useful particularly if you want to determine the boundary of exons and introns in a eukaryotic gene.
Ensembl
72
GENETIC ANALYSIS APPLICATION
● A disease may arise due to changes in the sequence of the gene being expressed ● Single Nucleotide Mutation: Sickle Cell Anemia
73
A consequence of a change in the hemoglobin gene, particularly the one for the beta chain of hemoglobin.
Sickle cell anemia
74
Mutations occurred in some individuals such that A is substituted by U so that the codon became GUG which codes for Vaseline. (T or F)
FALSE (Valine NOT VASELINE)
75
In sickle cell anemia, a point mutation occurred in the codon GAG, which codes for ________
Glutamic acid
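
The single-base change in these cards (GAG to GUG) can be checked with a tiny lookup. The dictionary below holds only the handful of standard-genetic-code codons needed here, not the full table, and the function name is made up for illustration.

```python
from typing import List, Tuple

# Partial codon table (mRNA codons) covering only glutamic acid and valine.
CODON_TABLE = {
    "GAG": "Glu", "GAA": "Glu",
    "GUG": "Val", "GUA": "Val", "GUC": "Val", "GUU": "Val",
}


def point_differences(codon_a: str, codon_b: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, base in a, base in b) where two codons differ."""
    return [(i + 1, x, y)
            for i, (x, y) in enumerate(zip(codon_a, codon_b)) if x != y]


normal, mutant = "GAG", "GUG"
print(CODON_TABLE[normal], "->", CODON_TABLE[mutant],
      point_differences(normal, mutant))
```

One substituted base in the second codon position is enough to swap glutamic acid for valine.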
76
Genetic characteristic
Genotype
77
Physical characteristic
Phenotype
78
Recessive trait
Sickle-Cell Anemia
79
Review the notes on finding the DNA sequence of a gene. Okay?
Okay
80
A way of rearranging sequences of DNA, RNA or protein to identify regions of similarity
SEQUENCE ALIGNMENT
81
Sequence alignment is made between
A known sequence (reference sequence) and an unknown sequence (query sequence)
82
Reference sequence
Known sequence
83
Query sequence
Unknown sequence
84
TYPES OF SEQUENCE ALIGNMENT
● Pairwise ● Multiple
85
Compare two sequences
Pairwise
86
Compare more than two sequences
Multiple
87
Pairwise alignment tools
○ EMBOSS Water ○ BLAST
88
Multiple alignment tools
○ MUSCLE ○ MAFFT ○ Clustal Omega
89
TYPES OF PAIRWISE SEQUENCE ALIGNMENT
● Global alignment ● Local alignment
90
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Matches the residues (bases or amino acids) of two sequences across their entire length.
Global alignment
91
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Matches sequences that are identical or closely similar.
Global alignment
92
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: The two sequences are treated as potentially equivalent.
Global alignment
93
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Comparing two genes with the same function (in human vs. mouse).
Global alignment
94
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Comparing two proteins with similar functions.
Global alignment
95
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Matches the regions of two sequences that are most similar to each other.
Local alignment
96
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: The two sequences may or may not be related.
Local alignment
97
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Used to see whether a substring (a part) of one sequence aligns well with a substring (a part) of the other sequence.
Local alignment
98
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Searching for local similarities in large sequences (e.g., newly sequenced genomes).
Local alignment
99
IDENTIFY THE TYPE OF PAIRWISE SEQUENCE ALIGNMENT: Looking for conserved domains or motifs in two proteins.
Local alignment
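
The contrast between global and local alignment in the cards above can be sketched with two classic dynamic-programming scores: Needleman-Wunsch for global and Smith-Waterman for local. This is a minimal illustration, not what any particular tool runs; the scoring values (match +1, mismatch -1, gap -2, with a simple linear gap penalty) and the example sequences are assumptions.

```python
def global_score(a: str, b: str, match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match: int = 1,
                mismatch: int = -1, gap: int = -2) -> int:
    """Smith-Waterman score: best-matching pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)  # never below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical sequences sharing only the short region ACGT.
x, y = "CCCCACGTCCCC", "GGGGACGTGGGG"
print(global_score(x, y), local_score(x, y))
```

For these two sequences the global score is dragged down by the unrelated flanks, while the local score reflects only the shared ACGT region, which is why local alignment suits sequences that may be related only in part.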
100
The residues are colored so that you can easily see any variation among the sequences.
Clustal Omega
101
In a multiple sequence alignment, you can tell that all of the sequences are identical at a position by the presence of an __________
Asterisk
102
If there is variation at a position, there is no asterisk. (T or F)
TRUE
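
The asterisk convention in these cards can be reproduced for sequences that are already aligned. The sketch below is an assumption-laden illustration: the function name and sequences are made up, only fully identical columns are marked, and the other symbols an alignment viewer may print are left out.

```python
from typing import List


def conservation_line(aligned: List[str]) -> str:
    """Mark columns where every aligned sequence has the same residue."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# Hypothetical aligned sequences; '-' marks a gap.
alignment = ["ACGT-A", "ACCT-A", "ACGTTA"]
for row in alignment:
    print(row)
print(conservation_line(alignment))
```

Columns three and five carry no asterisk because one sequence differs there or has a gap.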
103
MULTIPLE ALIGNMENT TOOLS: Analysis of more than 2 sequences
● MUSCLE ● MAFFT ● Clustal Omega
104
MUSCLE
Multiple Sequence Comparison by Log Expectation
105
MAFFT
Multiple Alignment using Fast Fourier Transform
106
It is a multiple sequence alignment tool that arranges the sequences of DNA, RNA or protein to identify regions of similarity
MUSCLE (Multiple Sequence Comparison by Log Expectation)
107
Finds regions of local similarity between sequences just like MUSCLE and MAFFT
NCBI: Basic Local Alignment Search Tool (BLAST)
108
Compares the amino acid sequences of proteins or the nucleotide sequences of DNA.
NCBI: Basic Local Alignment Search Tool (BLAST)
109
Compare a query sequence with a library or database of sequences, and identify library sequences that resemble the query sequence above a certain threshold
NCBI: Basic Local Alignment Search Tool (BLAST)
110
Can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families
NCBI: Basic Local Alignment Search Tool (BLAST)
111
Read the additional notes about NCBI: Basic Local Alignment Search Tool (BLAST). Okay?
Okay
112
Used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families
BLAST
113
You supply multiple sequences to be aligned to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences
MULTIPLE ALIGNMENT
114
Here you supply all the sequences to a tool such as MUSCLE.
MULTIPLE ALIGNMENT
115
It aligns the sequences that you upload and does not necessarily search for sequences in a database.
MULTIPLE ALIGNMENT
116
Read and analyze the difference between multiple sequence alignment and BLAST, and the summary. Okay?
Okay