SU4 Flashcards

Linkage and genome-wide association studies (68 cards)

1
Q

What are genetic markers?

A

They are known, heritable DNA sequence variants used to track inheritance through families or populations.

2
Q

Name three key characteristics of an ideal genetic marker.

A

They should be polymorphic (exist in multiple forms), have known positions on chromosomes (mapped), and usually not affect the phenotype (neutral).

3
Q

What type of genetic marker consists of short tandem repeats with a variable copy number, such as (CA)n repeats?

A

Microsatellites.

4
Q

Which type of genetic marker is a single base change in the DNA sequence?

A

A Single Nucleotide Polymorphism (SNP).

5
Q

What is the primary goal of using genetic markers in family studies?

A

To find the location of disease genes by tracking which markers co-segregate with the disease in families.

6
Q

The principle that markers physically close to a disease gene are often inherited together is known as _____.

A

Linkage.

7
Q

What type of map shows the positions of DNA markers along chromosomes and reflects recombination frequency?

A

A genetic map.

8
Q

What type of map shows the actual position of genes or markers along the DNA, measured in base pairs?

A

A physical map.

9
Q

What is dbSNP?

A

It is a public database of single nucleotide polymorphisms (SNPs) and other small genetic variants.

10
Q

In linkage analysis, what is the fundamental concept regarding the inheritance of alleles?

A

Alleles (of genes or markers) at loci that are physically close together on a chromosome tend to be inherited together, because recombination rarely separates them.

11
Q

What is a haplotype?

A

A set of alleles on a single chromosome that are inherited together through meiosis because they are not readily disrupted by recombination.

12
Q

What is the term for the probability that recombination will occur between two loci?

A

The recombination fraction, denoted by θ (theta).

13
Q

If two loci are very far apart on a chromosome, what is their expected recombination fraction (θ)?

A

The expected recombination fraction is 0.5, indicating they are unlinked.

14
Q

If two loci are very close together on a chromosome, what is their expected recombination fraction (θ)?

A

The expected recombination fraction is close to 0, indicating tight (near-complete) linkage.

15
Q

What is the unit of genetic map distance, where 1 unit corresponds to a 1% chance of recombination between two loci?

A

A centiMorgan (cM).

16
Q

If two loci are 1 centiMorgan (cM) apart, what is their recombination fraction (θ)?

A

Their recombination fraction is 0.01.

17
Q

What is an ‘informative marker’ in linkage analysis?

A

A marker that provides clear information about which allele was inherited from which parent, typically because the parent is heterozygous for that marker.

18
Q

What statistical value is calculated to determine the significance of linkage between a marker and a disease locus?

A

The LOD score (Logarithm of the Odds).

19
Q

The LOD score is the logarithm of the ratio of two probabilities. What are these two probabilities?

A

The likelihood of the data if the marker is linked at a specific θ, and the likelihood of the data if the marker is unlinked (θ = 0.5).
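As a rough illustration of that ratio, the sketch below computes a LOD score for the simplest textbook case: phase-known, fully informative meioses, where the likelihood at a given θ is θ^k (1 − θ)^(n − k) for k recombinants out of n meioses. The function name `lod_score` and the example counts are hypothetical, and real pedigree analysis (unknown phase, reduced penetrance) needs dedicated software.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10( L(theta) / L(0.5) ), assuming phase-known, informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants

    if theta == 0.0:
        # Complete linkage cannot explain any observed recombinant.
        if recombinants > 0:
            return -math.inf
        log_linked = 0.0
    else:
        log_linked = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1.0 - theta))

    # Under no linkage (theta = 0.5) every meiosis has probability 0.5.
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


if __name__ == "__main__":
    # Hypothetical family data: 10 informative meioses, no recombinants.
    print(round(lod_score(0, 10, 0.0), 2))  # 3.01
```

In this idealised setting, ten non-recombinant meioses are just enough to pass the conventional threshold of +3.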

20
Q

What LOD score is generally considered the threshold for significant evidence of linkage?

A

A LOD score of +3 or greater.

21
Q

In LOD score analysis, what does θ_max represent?

A

It represents the best estimate of the recombination fraction between the marker and disease loci, found where the LOD score is at its maximum (Z_max).
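One way to picture this is to evaluate the LOD score over a grid of θ values and keep the θ with the highest score. The sketch below does that for phase-known, fully informative meioses; the helper names and the counts (2 recombinants in 20 meioses) are hypothetical and chosen only for illustration.

```python
import math


def lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score at theta (0 < theta <= 0.5), assuming phase-known meioses."""
    linked = (recombinants * math.log10(theta)
              + (meioses - recombinants) * math.log10(1.0 - theta))
    unlinked = meioses * math.log10(0.5)
    return linked - unlinked


def maximise_lod(recombinants: int, meioses: int, points: int = 1000):
    """Return (theta_max, Z_max) from a grid search over 0 < theta <= 0.5."""
    # Grid of theta values: 0.001, 0.002, ..., 0.5 when points = 1000.
    grid = [i / points for i in range(1, points // 2 + 1)]
    theta_max = max(grid, key=lambda t: lod(recombinants, meioses, t))
    return theta_max, lod(recombinants, meioses, theta_max)


if __name__ == "__main__":
    theta_max, z_max = maximise_lod(recombinants=2, meioses=20)
    print(theta_max, round(z_max, 3))  # 0.1 3.197
```

In this simple case the grid search lands on the observed proportion of recombinants (2/20 = 0.1), which is what the maximum-likelihood estimate would be expected to give.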

22
Q

Linkage analysis that assumes a particular mode of inheritance (e.g., autosomal dominant) is known as what type of method?

A

A parametric method.

23
Q

What is locus heterogeneity?

A

A phenomenon where the same disease phenotype in different families is caused by mutations in different genes.

24
Q

What does Whole Exome Sequencing (WES) specifically sequence?

A

It sequences only the protein-coding regions (exons) of the genome, which constitute about 1% of the total genome.

25
What is the primary rationale for using WES to find disease mutations?
Approximately 85% of known disease-causing mutations are found within the exons.
26
List one key advantage of WES over whole genome sequencing.
WES is faster, cheaper, and more efficient for identifying mutations in protein-coding regions.
27
For which type of disorders is WES particularly useful for gene identification?
Rare monogenic disorders.
28
What is the first step in the bioinformatics analysis of WES data to find a causative mutation?
To identify rare or deleterious variants within the sequenced exons.
29
How do polygenic (multifactorial) disorders differ from single-gene (monogenic) disorders in terms of genetic cause?
Polygenic disorders are caused by many common variants, each with a small effect, whereas monogenic disorders are caused by a rare variant in a single gene.
30
What concept describes how, in polygenic disorders, disease occurs only when the combined genetic and environmental risk exceeds a certain point?
The liability threshold model.
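A small numerical sketch may help here. It assumes, as the model is usually presented, that liability follows a standard normal distribution in the population (an assumption not stated on the card), so the disease prevalence is simply the area of the curve above the threshold. The function names are hypothetical.

```python
from statistics import NormalDist

# Assumption: liability is standard normal (mean 0, SD 1) in the population.
LIABILITY = NormalDist(mu=0.0, sigma=1.0)


def prevalence_from_threshold(threshold: float) -> float:
    """Fraction of the population whose liability exceeds the threshold."""
    return 1.0 - LIABILITY.cdf(threshold)


def threshold_from_prevalence(prevalence: float) -> float:
    """Liability threshold implied by a given population prevalence."""
    return LIABILITY.inv_cdf(1.0 - prevalence)


if __name__ == "__main__":
    # A disease affecting 1% of people implies a threshold about
    # 2.33 standard deviations above the mean liability.
    print(round(threshold_from_prevalence(0.01), 2))  # 2.33
```

Relatives of affected individuals are thought to have a liability distribution shifted towards the threshold, which is why more of them exceed it.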
31
What does heritability (h²) measure?
It measures the proportion of the total phenotypic variance in a population that is due to genetic factors.
32
In family studies, what is the risk ratio (λ)?
It is the risk of disease to a relative of an affected individual divided by the risk in the general population.
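The arithmetic is a single division, sketched below with made-up numbers (a 5% risk to siblings against a 0.5% population risk). The function name and the figures are hypothetical and do not refer to any particular disease.

```python
def risk_ratio(relative_risk: float, population_risk: float) -> float:
    """lambda = risk to a relative of an affected person / population risk."""
    if population_risk <= 0:
        raise ValueError("population risk must be positive")
    return relative_risk / population_risk


if __name__ == "__main__":
    # Hypothetical figures: sibling risk 5%, population risk 0.5%.
    sibling_lambda = risk_ratio(relative_risk=0.05, population_risk=0.005)
    print(f"{sibling_lambda:.1f}")  # 10.0
```

A value well above 1 suggests familial clustering, although shared environment can contribute to it as well as shared genes.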
33
In twin studies, what does 'concordance' mean?
It means that both twins in a pair have the disease or trait.
34
If the concordance rate for a disease is significantly higher in monozygotic (MZ) twins than in dizygotic (DZ) twins, what does this suggest?
It suggests a strong genetic component to the disease.
35
Which type of study design is considered the best for separating genetic influences from environmental ones?
Adoption studies, particularly those involving siblings separated at birth.
36
Why does traditional linkage analysis often fail for common, complex diseases?
Because these diseases lack a clear mode of inheritance, have reduced penetrance, and are affected by genetic heterogeneity.
37
How is 'association' in genetics defined differently from 'linkage'?
Association is a statistical relationship between an allele and a phenotype in a population, whereas linkage is a physical relationship between loci on a chromosome.
38
Which study design is typically used for linkage analysis?
Family-based studies using pedigrees.
39
Which study design is typically used for association studies?
Population-based studies, such as case-control studies.
40
Linkage analysis is best for detecting rare variants with _____ effect sizes, while association studies are best for common variants with _____ effect sizes.
high; small.
41
What statistical measure is used to quantify the strength of association in a case-control study?
The odds ratio (OR).
42
If an allele has an odds ratio (OR) greater than 1 for a disease, what does this indicate?
It indicates the allele is associated with an increased risk of the disease.
43
If an odds ratio (OR) is equal to 1, what does this imply?
It implies there is no association between the allele and the disease.
44
What is the critical distinction to make when a statistical association is found between an allele and a disease?
Association does not equal causation; the allele may be a marker in proximity to the true causal variant, or the result could be a false positive.
45
What is Linkage Disequilibrium (LD)?
It is the non-random association of alleles at nearby loci on a chromosome, meaning they are inherited together more often than expected by chance.
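"More often than expected by chance" can be made concrete with two commonly used LD statistics, which are not named on the card: D, the difference between the observed haplotype frequency and the product of the allele frequencies, and r², a scaled version between 0 and 1. The sketch below computes both from haplotype counts; the function name and counts are hypothetical.

```python
def ld_statistics(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) for two biallelic loci from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total            # observed frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total    # frequency of allele B at locus 2

    # D is zero when alleles at the two loci are independent.
    d = p_AB - p_A * p_B
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denominator


if __name__ == "__main__":
    # Only A-B and a-b haplotypes are seen: complete LD.
    print(ld_statistics(50, 0, 0, 50))    # (0.25, 1.0)
    # All four haplotypes equally common: no LD.
    print(ld_statistics(25, 25, 25, 25))  # (0.0, 0.0)
```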
46
How does linkage disequilibrium (LD) arise in a population?
It arises because individuals share chromosome segments inherited from distant common ancestors that have not yet been broken up by recombination.
47
What was the primary goal of the International HapMap Project?
To create a comprehensive map of human genetic variation and define the patterns of linkage disequilibrium across the genome.
48
The HapMap project revealed that the human genome is organised into stretches of DNA with high LD, known as _____.
Haplotype blocks.
49
What is a 'tag SNP'?
A single SNP that can be used as a marker to represent the variation within an entire haplotype block due to high LD.
50
How did the HapMap project enable Genome-Wide Association Studies (GWAS)?
It showed that genotyping a smaller number of tag SNPs could capture most common variation, making genome-wide scans cost-effective.
51
What is the typical study design for a GWAS?
A case-control study comparing allele frequencies of hundreds of thousands of SNPs between affected individuals and unaffected controls.
52
Why is a strict statistical threshold needed in GWAS?
To correct for the multiple testing problem, as testing millions of SNPs increases the risk of false positives.
53
What is the conventional p-value threshold for genome-wide significance in a GWAS?
A p-value less than 5×10⁻⁸.
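This figure is usually explained as a Bonferroni-style correction: a conventional significance level of 0.05 divided by roughly one million independent common-variant tests. The sketch below shows that arithmetic; the count of one million is the usual approximation rather than an exact number, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Assumption: about one million independent tests across the genome.
    threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
    print(f"{threshold:.1e}")  # 5.0e-08
```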
54
What type of plot is used to visualise the results of a GWAS?
A Manhattan plot.
55
On a Manhattan plot, what does the y-axis represent?
It represents the strength of association for each SNP, plotted as −log₁₀(p-value).
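The transformation means that smaller p-values plot higher up. As a quick sketch (the function name and p-values are hypothetical), the genome-wide significance threshold of 5×10⁻⁸ corresponds to a height of about 7.3 on the y-axis.

```python
import math


def manhattan_height(p_value: float) -> float:
    """y-axis value on a Manhattan plot: -log10 of the p-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


if __name__ == "__main__":
    print(round(manhattan_height(5e-8), 2))  # 7.3
    print(round(manhattan_height(0.01), 2))  # 2.0
```

SNPs whose points rise above the line at about 7.3 would be called genome-wide significant.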
56
What is a potential source of confounding in GWAS that can lead to false associations?
Differences in ancestry or hidden relatedness between cases and controls (population stratification).
57
What is the term for the observation that GWAS-identified SNPs explain only a small fraction of the heritability estimated from family studies?
The 'missing heritability' problem.
58
List two potential reasons for the 'missing heritability' problem.
Strict statistical thresholds miss many true small-effect SNPs, and standard GWAS designs do not capture the contribution of rare variants.
59
What occurs when the effect of a genetic variant on disease risk is modified by an environmental factor?
A gene-environment interaction.
60
The LIPC gene provides an example of gene-environment interaction where the effect of a genotype on HDL cholesterol levels is dependent on _____.
Dietary fat intake.
61
What is epigenetics?
The study of heritable changes in gene expression that do not involve alterations to the underlying DNA sequence itself.
62
Name two primary mechanisms of epigenetic modification.
DNA methylation and histone modification.
63
How can environmental factors like diet or stress influence disease risk via epigenetic mechanisms?
They can alter the epigenome (e.g., change methylation patterns), which in turn influences the expression of disease-related genes.
64
What long-term health effects were observed in individuals exposed in utero to the Dutch Hunger Winter (Hongerwinter) famine?
An increased risk of metabolic diseases, such as obesity and cardiovascular disease, in later life.
65
What is the difference between parametric and non-parametric linkage analyses?
Parametric linkage analyses require a specific genetic model that specifies certain key parameters: the mode of inheritance, the disease gene frequency, and the penetrance of disease genotypes. Non-parametric linkage analyses do not require any genetic model to be stipulated and have been used to analyse segregation patterns in complex diseases.
66
Under what circumstances are parametric and non-parametric linkage analyses applied to study human genetic disease?
Parametric linkage analyses have been very successful in mapping genes for Mendelian disorders and for Mendelian subsets of complex diseases. Non-parametric analyses are used for complex diseases where the inheritance model is unknown; one example, affected sib-pair analysis, relies simply on analysing affected siblings in multiple families.
67
What is meant by the odds ratio in case-control studies?
The odds ratio is the odds of being affected if you carry a particular genetic variant divided by the odds of being affected if you do not carry that variant.
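A minimal sketch of that calculation from a 2×2 table of counts follows. The function name and the counts are hypothetical, and a real analysis would also report a confidence interval.

```python
def odds_ratio(cases_with: int, controls_with: int,
               cases_without: int, controls_without: int) -> float:
    """Odds of disease in variant carriers divided by odds in non-carriers."""
    if 0 in (controls_with, cases_without, controls_without):
        raise ValueError("zero cell: the odds ratio is undefined or infinite")
    odds_carriers = cases_with / controls_with            # affected : unaffected
    odds_non_carriers = cases_without / controls_without
    return odds_carriers / odds_non_carriers


if __name__ == "__main__":
    # Hypothetical study: 30 of 100 cases and 10 of 100 controls carry the variant.
    print(round(odds_ratio(30, 10, 70, 90), 2))  # 3.86
```

An odds ratio above 1, as here, would suggest the variant is associated with increased risk; a value of 1 would suggest no association.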
68
Strategies to identify the genes that underlie single gene disorders have often relied on first obtaining a sub-chromosomal location (positional cloning) for the disease gene. Describe two approaches that have been taken to identify sub-chromosomal locations for these disorders.
Linkage analysis: testing markers from across all the chromosomes to see whether alleles at any marker locus tend to co-segregate with the disease in families. Cytogenetic analysis: looking for disease-associated chromosomal rearrangements, of which translocations and inversions have been the most fruitful.