Abstract
Twenty years after the end of the Human Genome Project and fifteen after the Wellcome Trust Case Control Consortium, genome-wide association studies (GWAS) are now routinely applied to discover links between genetic variation and traits and diseases, in studies involving millions of individuals. The thousands of discovered genetic associations have accelerated the need for methods and assays that map these associations to the molecules and causal mechanisms that lead to disease development. Studies of the effect of genetic variants on the molecular phenotypes involved in these causal mechanisms, looking at gene expression, protein-protein interactions and metabolites, have also expanded in scale and scope. However, technical constraints and the complexity of biological networks make it extremely challenging to make sense of this wealth of data and to unravel the causal mechanisms that link DNA variation to changes in disease risk.
In this thesis, we investigated how genetic effects propagate through different layers of molecular phenotypes to affect whole-organism traits and diseases. We studied the contribution of genetic variants and environmental effects to whole blood gene expression and plasma protein levels, using relatedness between 67 individuals from the same family and genotypes from 3029 unrelated individuals. Our heritability estimates were similar to those of previous studies and consistent between the two cohorts. However, we observed only a low genetic correlation between gene expression and protein levels, highlighting technical limitations and suggesting the presence of buffering and post-transcriptional effects.
We then studied how the consequences of genetically predicted extreme expression can reveal non-additive effects. We predicted gene expression for 45 tissues in UK Biobank individuals, using eQTLs from the GTEx and DIRECT consortia, and identified individuals with predicted extreme expression. We found 710 associations between extreme expression and anthropometric traits, independent of the additive effect of expression. Most commonly, we observed attenuation of additive effects in the extremes, suggesting the presence of mechanisms that compensate for extreme molecular phenotypes. We also showed how incomplete horizontal pleiotropy between a causal and a non-causal gene can produce artefactual associations with extreme phenotypes, and how this can also cause false negatives for Mendelian randomisation methods.
Finally, we developed PERiGene, a similarity-based method for identifying likely causal genes near GWAS loci using gene characteristics such as tissue-specific expression and interaction with other genes. Our predictions were enriched around GWAS loci, despite PERiGene having no access to local genomic information. PERiGene was also able to identify known risk genes around these loci. We also showed that PERiGene and other similarity-based methods are vulnerable to information biases, and how these biases can be corrected.
We hope these contributions will improve our understanding of how genetic information affects molecular processes to eventually influence human traits.
Date of Award | 2024 |
---|---|
Original language | English |
Supervisor | Andrew Brown (Supervisor) & Ewan Pearson (Supervisor) |