DNA Sequencing Methods: Sanger DNA Sequencing
(1) DNA sequencing
Components:
* Template DNA
* Oligonucleotide primer
* DNA polymerase
* Deoxynucleoside triphosphates (dNTPs)
* (2)
In such a reaction mixture, DNA synthesis continues until a (2), rather than a dNTP, is added to the growing chain.
* Without a (3) to attack the (4) of the next dNTP to be incorporated, synthesis stops.
Sanger Method
DNA synthesis continues until a ddNTP, rather than a dNTP, is added to the growing chain.
* Without a 3′-OH group, synthesis stops.
(1) reactions prepared, one for (2).
* Results in a collection of DNA fragments of varying lengths, each ending in the same ddNTP.
Samples prepared for automated sequencing.
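The chain-termination idea above can be sketched in Python. This is a toy model under stated assumptions, not a description of any instrument: the function names are hypothetical, the input is the newly synthesized strand written 5′ to 3′, and termination is assumed to occur at every possible position.

```python
def chain_termination_fragments(new_strand, dd_base):
    """Return every fragment that ends at a position where dd_base was added.

    new_strand is the strand being synthesized, written 5' to 3'.
    """
    return [new_strand[:i + 1]
            for i, base in enumerate(new_strand)
            if base == dd_base]


def read_sequence(new_strand):
    """Rebuild the sequence from fragment lengths, one reaction per ddNTP."""
    length_to_base = {}
    for dd_base in "ACGT":
        for fragment in chain_termination_fragments(new_strand, dd_base):
            # Each fragment length identifies the base that terminated it.
            length_to_base[len(fragment)] = dd_base
    # Reading fragments from shortest to longest gives the sequence.
    return "".join(length_to_base[n] for n in sorted(length_to_base))


fragments_a = chain_termination_fragments("GATTACA", "A")
recovered = read_sequence("GATTACA")
```

Sorting by fragment length plays the role that size separation plays in the laboratory method.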
Automated Sanger DNA Sequencing
Next-Generation DNA Sequencing (NGS)
(1) sequencing techniques—thousands of identical DNA fragments are sequenced simultaneously.
Uses (2) templates attached to a solid substrate.
Makes genomic sequencing faster and cheaper.
* Avoids need to insert individual DNA fragments into vectors.
Sequencing By Synthesis
Identifies each nucleotide as (1).
Synthesis uses a modified fluorescent nucleotide that stops the reaction because the (2).
* Nucleotide does not (3) until it is incorporated into growing DNA strand.
Whole-Genome Shotgun Sequencing
Four stage process:
* (1)—generates clones of portions of genome.
* (2)—determines sequences of genome fragments in vector.
* (3)—computer analysis joins overlap regions to form contig.
* (4)—proofreading ensures all reads of the same sequence are identical.
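The step that joins overlapping regions into a contig can be illustrated with a small greedy merge. This is only a sketch: real assemblers use more robust algorithms and tolerate sequencing errors, and the function names and the `min_len` parameter here are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads, min_len=3):
    """Greedily merge the pair of reads with the longest exact overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


contigs = assemble(["ATGGCG", "GCGTGC", "TGCAAT"])
```

Here three short reads that overlap by three bases each collapse into a single contig.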
Next-Generation Genomic Sequencing
Depth of coverage and breadth of coverage improved.
* Coverage—(1) sequenced in a genome.
– 18-100 times is (2).
– (3)—high average number of reads.
Great depth of coverage increases accuracy.
* Decreases the chance that sequence data generated is from (4).
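Depth and breadth of coverage can be made concrete with a small calculation. The sketch below assumes reads are already aligned and described as hypothetical (start, length) pairs. It treats depth as the average number of reads covering a position and breadth as the fraction of positions covered at least once, which is a common usage but an assumption here.

```python
def coverage_stats(genome_length, reads):
    """Return (mean depth, breadth) for reads given as (start, length) pairs.

    Positions are 0-based; reads running past the genome end are clipped.
    """
    depth = [0] * genome_length
    for start, length in reads:
        for pos in range(start, min(start + length, genome_length)):
            depth[pos] += 1
    mean_depth = sum(depth) / genome_length
    breadth = sum(1 for d in depth if d > 0) / genome_length
    return mean_depth, breadth


mean_depth, breadth = coverage_stats(10, [(0, 5), (3, 5), (4, 4)])
```

With higher depth, each base is supported by more independent reads, so a single erroneous read is easier to outvote.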
Single-Cell Genomic Sequencing
Femtograms of DNA from a cell are (1).
(2)
* Occurs at single temperature and uses the DNA polymerase from bacteriophage phi29 to synthesize new DNA.
Metagenomics Provides Access to Uncultured Microbes
Metagenomics
* Study of microbial genomes based on (1).
* Used to learn more about the diversity and metabolic potential of microbial communities.
– Each DNA fragment comes from a collection of genomes found in a (2).
* Established the Genomic Encyclopedia of Bacteria and Archaea project.
– Improves reference databases by sequencing genomes of a wide variety of cultured microorganisms.
Bioinformatics
(1)
* Bioinformatics combines biology, mathematics, computer science, and statistics to convert raw nucleotide data into (2).
Further examination carried out using (3).
Genome Annotation
Process that locates genes in the genome map.
Identifies (1).
* Must be (2) that is not interrupted by a (3).
* Must have (4) at the 5′ end and (5) at the 3′ end.
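A deliberately simplified ORF scan is sketched below. It assumes ATG as the only start codon, scans only the three forward reading frames, and does not check for the flanking signals at the 5′ and 3′ ends that real annotation requires. `find_orfs` and `min_codons` are hypothetical names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches.

    end is exclusive and includes the stop codon; min_codons counts the
    codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


orfs = find_orfs("CCATGAAATTTTAGCC")
```

Annotation pipelines set the minimum length much higher than this toy default to avoid calling short chance ORFs.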
Bioinformatics—ORFs and BLAST
ORFs that appear to encode protein are called (1).
(2)
* Base-by-base comparison of 2+ gene sequences.
* Assign tentative function of gene or protein structure.
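The base-by-base comparison can be illustrated by computing percent identity between two sequences that are already aligned. This is not how BLAST itself works internally. It is a minimal sketch that assumes equal-length aligned strings with `-` marking gaps, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(seq_a, seq_b):
    """Percent of compared (non-gap) positions where the two bases match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gap positions
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


identity = percent_identity("ATGC", "ATGA")
```

A high identity to a gene of known function is what supports a tentative functional assignment.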
Gene/ORF Terminology
(1)—genes from different organisms with similar ORFs.
(2)—two or more genes with very similar nucleotide sequences, likely from a duplication event.
(3)—naming standards for proteins and their motifs based on similarities among orthologous proteins.
Proteins that don’t align with known amino acid sequences fall into two categories.
* (4)—match known sequences in databases, but don’t have an assigned function (yet).
* (5)—products of genes unique to that organism.
Functional Genomics Links Genes to Phenotypes
Functional Genomics
Puts genomic information in a (1).
Provides information on:
* (2)
* (3)
* (4)
DNA Microarray Analysis
Determines which genes are expressed at a specific time.
Arrays are (1) -> organized as a grid.
Each DNA spot (2) represents a single gene.
* (2) is usually a PCR product generated from a (3).
Analysis of Gene Expression Using Microarrays
Based on hybridization between the (1) and the (2) from the microbe of interest.
* Reverse transcriptase converts mRNA to cDNA.
* cDNAs are (3) and incubated with the microarray.
* (4) is washed off.
* The microarray is scanned with (5), which indicates that hybridization has occurred.
RNA-Seq Method for Transcriptome Analysis
NGS method quantifying mRNA levels by measuring the “reads” matching each gene.
Cellular mRNA is converted to cDNA by (1).
cDNAs are then (2).
Sequence data can be used in two ways:
* If sequenced genome exists, the cDNAs are (3).
* If no reference genome, nucleotide sequence is (4).
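When a reference exists, quantifying mRNA levels amounts to counting the reads that match each gene. The sketch below uses exact substring matching as a stand-in for a real read aligner. The gene names, sequences and the `count_reads_per_gene` helper are all hypothetical.

```python
def count_reads_per_gene(reads, genes):
    """Count reads that occur exactly within each gene sequence.

    genes maps a gene name to its sequence; a read is credited to the
    first gene that contains it, and unmatched reads are ignored.
    """
    counts = {name: 0 for name in genes}
    for read in reads:
        for name, seq in genes.items():
            if read in seq:
                counts[name] += 1
                break
    return counts


genes = {"geneA": "ATGGCGTACCTGA", "geneB": "ATGTTTCCCGGGTAA"}
reads = ["GGCGTA", "TTTCCC", "CCCGGG", "AAAAAA"]
counts = count_reads_per_gene(reads, genes)
```

Real analyses also normalize these counts for gene length and total read number before comparing genes or samples.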
Hierarchical Cluster Analysis
Organization of transcriptomic experiments.
Groups genes according to level of expression.
E.g.
* Upregulated—red
* Downregulated—green
* Unchanged—black
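The color convention above can be expressed as a rule on the log2 ratio of expression between two conditions. This sketch covers only the color assignment, not the clustering itself. The twofold cutoff (`threshold=1.0`) and the function name are assumptions for illustration.

```python
import math


def expression_color(treated, control, threshold=1.0):
    """Map a treated/control expression ratio to a heat-map color."""
    log2_fold_change = math.log2(treated / control)
    if log2_fold_change >= threshold:
        return "red"      # upregulated
    if log2_fold_change <= -threshold:
        return "green"    # downregulated
    return "black"        # unchanged


colors = [expression_color(8, 2), expression_color(1, 4), expression_color(5, 4)]
```

Genes whose colors change together across experiments are the ones a cluster analysis would place near each other.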
Metatranscriptomics
Extraction of RNA (1), followed by sequencing and comparison to known sequences.
Describes transcriptome of (2).
* Yields transcripts that map to multiple reference genomes as well as some that fail to align with any reference genome.
* Novel transcripts represent newly discovered genes.
Systems Biology: Making and Testing Complex Predictions
(1) with the molecular interactions that become pathways for catabolism, anabolism, regulation, behavior, environmental responses, etc.
* (2) of cells.
* May be important in studying (3).
Comparative Genomics
(1) inferred by studying similar nucleotide and amino acid sequences among organisms.
Comparisons of genomes of strains within species and among species.
* (2)—genes that are no longer functional.
Comparative genomics is the tool by which virologists and evolutionary biologists have tracked the development of new strains of SARS-CoV-2.
Comparative Genomics In Microbial Groups
HGT Impacts Microbial Evolution
Core genome—set of genes found (1).
* Often thought of as the (2).
Pan-genome—(3) in a strain of a species.
* More recently acquired genes that enable (4).
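Given gene lists for several strains, the core genome and pan-genome can be computed as a set intersection and a set union. This is a minimal sketch: the strain names and gene lists are made up for illustration, and real comparisons must first decide which genes count as "the same" across strains.

```python
def core_and_pan_genome(strain_genes):
    """Return (core, pan) gene sets from a mapping of strain -> gene list.

    Assumes at least one strain is supplied.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    core = set.intersection(*gene_sets)  # genes shared by every strain
    pan = set.union(*gene_sets)          # genes found in any strain
    return core, pan


strains = {
    "strain1": ["dnaA", "recA", "stx"],
    "strain2": ["dnaA", "recA", "bla"],
    "strain3": ["dnaA", "recA"],
}
core, pan = core_and_pan_genome(strains)
accessory = pan - core
```

The genes left after subtracting the core from the pan-genome are the ones present in only some strains.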