Define genetic association
Genetic association is the presence of an allele at a higher frequency in unrelated subjects with a particular trait, compared to those who do not have the trait.
How do we determine whether variants in the genome are associated with a disease?
If we replace the word "trait" with "disease" in the definition of genetic association, this is how we determine whether variants in the genome are associated with a disease
With disease = cases
Without disease = controls
In a case-control study, when is a gene associated with the disease?
A gene is associated with the disease when the allele is found at a higher frequency in cases than in controls
What are the 4 major rules of case-control studies?
Using a simple flow map, how do we identify regions that are responsible for causing disease?
On image
How do we carry it out in practice?
Why do we need reliable genetic markers?
• Individuals in a population are genetically far more diverse than individuals in a single family.
What are genetic markers?
• Genetic markers are alleles that we can genotype and assess whether they are associated with disease
Define association
• Association means the marker lies <100 kb from a causal variant
What is the ideal genetic marker?
What is an SNP?
How might an SNP arise?
On image
Where are SNPs found?
• Gene (coding region)
No amino acid change (synonymous)
Amino acid change (non-synonymous)
New stop codon (nonsense)
• Gene (non-coding region)
Promoter – mRNA and protein level changed
Terminator - mRNA and protein level changed
Splice site – Altered mRNA, altered protein
• Intergenic region (98% of genome)
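The three coding-region categories above can be illustrated with a short Python sketch. It builds the standard genetic code and labels a single-base change within one codon. The function name `classify_coding_snp` and the example codons are hypothetical, chosen only for illustration; the sketch ignores non-coding and intergenic SNPs.

```python
# Standard genetic code, packed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Hypothetical helper: label the effect of a SNP inside one codon."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # no amino acid change
    if alt_aa == "*":
        return "nonsense"        # new stop codon
    return "non-synonymous"      # amino acid change


# Illustrative codon changes (not taken from any real gene).
print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAA", "GTA"))  # Glu -> Val
print(classify_coding_snp("TAC", "TAA"))  # Tyr -> stop
```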
What is dbSNP?
The Single Nucleotide Polymorphism Database
Allows us to find information about SNPs
What is the minor allele in dbSNP?
The less common of the alleles at a SNP. dbSNP lets us see which allele this is for each SNP
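As a minimal sketch of what "less common allele" means, the minor allele frequency can be worked out from genotype counts. The function name and the counts below are made up for illustration, and this only shows the arithmetic, not how dbSNP itself derives its figures.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return (minor_allele, frequency) from counts of AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)   # each person carries two alleles
    freq_a = (2 * n_aa + n_ab) / total_alleles
    freq_b = 1 - freq_a
    return ("A", freq_a) if freq_a <= freq_b else ("B", freq_b)


# Hypothetical sample of 200 people: 10 AA, 80 AB, 110 BB.
allele, maf = minor_allele_frequency(n_aa=10, n_ab=80, n_bb=110)
print(allele, maf)  # A is the minor allele here, at a frequency of 0.25
```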
Why are SNPs chosen?
What is GWAS?
Genome Wide Association Study (GWAS)
• Recruit large numbers of cases and controls
• Genotype markers across the whole genome
SNP Microarrays – see separate session
• Look for association between disease and alleles of each marker – chi-squared test
• Positive association is at p < 5 x 10^-8 (multiple testing correction)
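The chi-squared step for a single marker can be sketched in plain Python. This assumes a simple 2x2 table of allele counts (allele A or B, in cases or controls) and a test with 1 degree of freedom; the counts are invented for illustration. The 5 x 10^-8 threshold is often explained as 0.05 divided by roughly one million independent tests across the genome.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # threshold quoted in the notes


def allelic_chi_squared(case_a, case_b, control_a, control_b):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 allele-count table."""
    observed = [[case_a, case_b], [control_a, control_b]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: allele A is more frequent in cases (30%) than controls (20%).
chi2, p = allelic_chi_squared(case_a=600, case_b=1400, control_a=400, control_b=1600)
print(chi2, p, p < GENOME_WIDE_THRESHOLD)
```

In a real GWAS this test is repeated for every marker, which is why the threshold has to be so strict.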
What does a GWAS give us in terms of results?
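A p-value for each marker: a measure of confidence that the association is real rather than due to chance
Smaller p-values (larger values of -log10 of the p-value) mean a more significant association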
Refer to table
How do we plot the results of a GWAS?
What is the Manhattan plot?
The Manhattan plot is a simple way to visualise the markers across the genome that are associated with the disease. The y-axis is -log10 of the p-value, so a marker associated with disease at a p-value of 1 x 10^-9 has a y-axis value of 9. The x-axis is the location on the chromosome. Each chromosome is shown in a different colour in the plot, and chromosome locations are given as the number of bases from the start of the chromosome sequence.
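The y-axis transformation can be checked with a few lines of Python. The marker names, positions and p-values below are invented for illustration; only the -log10 arithmetic and the 5 x 10^-8 threshold come from the notes.

```python
import math


def manhattan_y(p_value):
    """Height of a marker on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)


# Hypothetical markers: (name, chromosome, position in bases, p-value).
markers = [
    ("snp_1", 1, 1_200_000, 0.03),
    ("snp_2", 6, 32_500_000, 1e-9),
    ("snp_3", 16, 53_800_000, 4e-8),
]

threshold_height = manhattan_y(5e-8)  # about 7.3 on the y-axis

for name, chrom, pos, p in markers:
    y = manhattan_y(p)
    flag = "genome-wide significant" if y > threshold_height else "not significant"
    print(f"{name} chr{chrom}:{pos} y={y:.2f} {flag}")
```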
What did the Wellcome Trust Case Control Consortium (WTCCC), the first large genome-wide association study, in 2007, look at?
• It looked at several diseases
What were the results?
On image
Have a look at the regional association plot, what does the red identify?
The red SNP has a high significance and is responsible for the peak, although the region around it covers a few genes
What is a meta-analysis?
A way to combine different studies:
• Difficult to do large studies (>1K cases/controls)
• Easier to combine smaller studies
Pre-experiment – Consortium
Post-experiment – Meta-analysis
Meta-analysis allows the statistical combination of results from multiple studies
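One common way to combine results statistically is a fixed-effect, inverse-variance weighted meta-analysis, sketched below. This is an assumption about method for illustration only (the notes do not say which approach is used), and the effect sizes and standard errors are invented.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Inverse-variance weighted pooling of per-study effect estimates."""
    weights = [1 / se ** 2 for se in std_errors]   # precise studies count for more
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal p-value
    return pooled, pooled_se, p_value


# Hypothetical log odds ratios for one SNP from three small studies.
effects = [0.20, 0.25, 0.15]
std_errors = [0.10, 0.12, 0.09]

pooled, pooled_se, p = fixed_effect_meta(effects, std_errors)
print(pooled, pooled_se, p)
```

The pooled standard error is smaller than that of any single study, which is why combining smaller studies can reach a significance that none of them reaches alone.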
What are the problems with GWAS?
• GWAS has identified associations that are statistically strong and reproducible
• However, their contribution to the genetic component of disease is estimated to be low (<5%)
• Possible answers:
Many common SNPs of very small effect
Rare SNPs
Copy Number Variation
Epigenetic variation
What are the medical implications of obesity?
On image
Why is obesity strongly genetic?
• Twin studies: 70-80% of body shape is genetically determined
• Adoption studies: 30-40%
• Family studies: 40-60%