Abstract
Background: Evolutionary forces have shaped the way humans respond to complex diseases. Type 2 diabetes (T2D) is a complex, heterogeneous and polygenic condition. The disease has reached epidemic proportions, affecting individuals across all ethnicities. Since the human migration out of Africa, populations have undergone rapid epidemiological, socio-economic and demographic changes. These changes, coupled with human genetic diversity, have led to phenotypic variation between populations, a large part of which remains unexplored. Phenotypically, South Indians and the Scottish differ in their susceptibility to T2D. South Indians develop diabetes a decade or two earlier and at a lower BMI, but with greater abdominal adiposity and higher levels of dyslipidaemia, than the Scottish population. Since genes regulating metabolic functions have shown some of the strongest responses to adaptive forces, selection is likely to have influenced patterns of T2D prevalence in the two populations.
Aim: The thesis aims to investigate the role of selection pressures affecting various phenotypic traits related to Type 2 Diabetes in a Scottish and a novel South Indian population. The study was extended to explore the evolutionary aspects of Covid-19 related genes in the same cohorts.
Methods: Population differentiation was estimated using pairwise calculation of Weir and Cockerham’s FST. Highly differentiated SNPs were tested for association with clinical phenotypes of Type 2 Diabetes in both study populations: age at onset, anthropometry, blood pressure and lipids. Following downstream analysis, the significant loci were fine-mapped to assess causality. Sex-differentiated analysis and trans-ethnic meta-analysis were also performed. The presence of a selective sweep was tested using both frequency-based and linkage disequilibrium-based methods. The effect of the most probable causal SNPs on gene expression was also studied.
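The pairwise FST calculation named in the Methods can be illustrated for a single biallelic SNP. The sketch below follows the variance-component form of the Weir and Cockerham (1984) estimator; the function name, the argument layout and the example cohort sizes and frequencies are hypothetical and are not taken from the thesis, which does not state which software was used.

```python
def weir_cockerham_fst(n, p, h):
    """Weir and Cockerham (1984) theta for one biallelic SNP.

    n: number of sampled individuals in each population
    p: frequency of the reference allele in each population
    h: observed proportion of heterozygotes in each population
    All names and inputs here are illustrative assumptions.
    """
    r = len(n)                                   # number of populations
    n_total = sum(n)
    n_bar = n_total / r                          # mean sample size
    n_c = (n_total - sum(x * x for x in n) / n_total) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_total
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_total

    core = p_bar * (1 - p_bar) - ((r - 1) / r) * s2
    # a: between-population component
    a = (n_bar / n_c) * (s2 - (core - h_bar / 4) / (n_bar - 1))
    # b: between individuals within populations
    b = (n_bar / (n_bar - 1)) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    # c: between alleles within individuals
    c = h_bar / 2

    denom = a + b + c
    return a / denom if denom > 0 else float("nan")


# Made-up example: two cohorts of 100 individuals each.
fixed_difference = weir_cockerham_fst([100, 100], [1.0, 0.0], [0.0, 0.0])
same_frequency = weir_cockerham_fst([100, 100], [0.5, 0.5], [0.5, 0.5])
```

A SNP fixed for different alleles in the two cohorts gives an estimate of 1, while identical frequencies give a value close to zero (the estimator can be slightly negative). In practice the estimate is usually obtained from an established genetics toolkit and not from hand-written code.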
Results: 4,854 highly differentiated SNPs with FST > 0.25 were used for further analysis. Most of the SNPs had a higher frequency in the Scottish cohort, indicating positive selection in that population. Among the clinical phenotypes studied, a likely causal association was detected with age at onset, BMI and lipids, and only in the Scottish cohort. With respect to age at onset of Type 2 diabetes, rs9273410 (P-value 8.83 × 10⁻², Posterior Probability 0.99) displayed the highest probability of causality. Although it was not statistically significant in the original association analysis, its close proximity to rs9273242 (β = -0.634, SE = 0.148, P-value 1.95 × 10⁻⁵, Posterior Probability 0.93) and its location in a 3′ untranslated region (3′ UTR) gave it the highest priority in fine mapping. Both SNPs mapped to the HLA-DQB1 locus and were found to lower age at onset. rs16891982, found at or near the skin pigmentation gene SLC45A2, was a novel locus and among the most likely causal SNPs (β = 2.436, SE = 0.615, P-value 7.62 × 10⁻⁵, Posterior Probability 0.99). The SNP is an exonic missense variant and was out of Hardy-Weinberg equilibrium (HWE). It was found to be protective, delaying onset by nearly 2.5 years. The G allele of rs16891982 was nearly fixed in the Scottish population, with a frequency of 96.5%, and a selective sweep was observed across the locus.
SNPs within the ADAMTS9-AS2 region were associated with BMI. rs4422297 (β = 0.385, SE = 0.085, P-value 5.92 × 10⁻⁶, Posterior Probability 1.0) was the most probable causal variant. According to various reports, the G allele of rs4422297 influences different adiposity measures in different directions. While it was found to raise BMI, body fat percentage and hip circumference, it was protective against measures of abdominal obesity. The BMI-raising effect allele G of rs4422297 is a derived allele and has a frequency of 30% in the Scottish population. This contradicts the thrifty genotype hypothesis, according to which the ancestral, positively selected allele should have been the one causing obesity and hence influencing T2D.
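As a rough consistency check on association statistics such as those reported for rs4422297, a two-sided P-value can be approximated from an effect estimate and its standard error with a Wald z-test. The sketch below assumes a normal approximation, which the abstract does not state, and the function name is illustrative. Small differences from reported values are expected because the published β and SE are rounded and the original analysis may have used a different test statistic.

```python
import math


def wald_two_sided_p(beta, se):
    """Two-sided P-value for H0: beta = 0 under a normal approximation.

    Illustrative helper only; the test used in the thesis is not stated.
    """
    z = beta / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Effect estimate and standard error reported for rs4422297 (BMI).
p_rs4422297 = wald_two_sided_p(0.385, 0.085)
print(f"approximate P-value: {p_rs4422297:.2e}")
```

For rs4422297 this approximation gives a value of the same order of magnitude as the reported 5.92 × 10⁻⁶.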
The HLA-DQA1/B1 locus was found to influence HDL as well as triglycerides. rs9273415 (β = 0.032, SE = 0.005, P-value 1.75 × 10⁻⁹, Posterior Probability 0.99) and rs9273410 (β = 0.027, SE = 0.005, P-value 4.91 × 10⁻⁶, Posterior Probability 0.95) were protective against dyslipidaemia. Both SNPs had an average frequency of ~55% in the Scottish cohort. rs11216267 in the SIK3 locus (β = -0.02, SE = 0.005, P-value 5.79 × 10⁻⁵) was a risk variant that lowered HDL marginally. The T allele of rs11216267 had a frequency of 65.2% in the Scottish cohort. The HLA locus was most likely influencing lipid levels in the T2D cohort through collider bias, while SIK3 is a known cardiovascular risk gene.
With respect to Covid-19, 28 SNPs within the ACE2 gene on the X chromosome showed evidence of balancing selection. Two of these SNPs (rs233575 and rs2106809) were associated with age at onset of T2D, but only in the South Indian population. No significant association was detected within the Scottish cohort. The LZTFL1 gene, inherited from Neanderthals and implicated in Covid-19, was not found to affect any of the glycaemic traits in either of the two populations. Among all these genes, a selective sweep was observed in the SLC45A2 region, whereas the HLA and ACE2 genes showed evidence of balancing selection.
Conclusion: Evolutionary forces have influenced the phenotypic diversity in T2D between Scotland and South India. Selection typically influences traits before or during the reproductive age of individuals. Since T2D usually begins later in life, selection pressures are most likely to affect the underlying causes of T2D rather than the disease itself. While variants within the HLA locus were shown to lower age at onset, they were protective against dyslipidaemia in the Scottish cohort. The skin pigmentation gene SLC45A2 was also found to be protective, delaying onset. The BMI-raising risk allele in the ADAMTS9-AS2 gene had a lower frequency in the Scottish population. No effect was observed on the T2D traits in the South Indian cohort, most likely due to gene-gene or gene-environment interactions. Functional studies of the most likely causal SNPs will help elucidate the exact mechanism of action of these regions.
Limitations: The study has a few limitations. Genotype imputation in GoDARTS was done using the 1000 Genomes reference panel, while the HRC reference panel, which comprises individuals of European ancestry, was used for the GoSHARE and MDRF cohorts. An in-house South Asian (SAS) reference panel became available only at a later stage. The X chromosome work on the ACE2 gene, implicated in Covid-19, was carried out using SAS-imputed data. This gave improved results, owing to higher imputation quality and a larger number of SNPs passing quality control. The same is yet to be replicated for the rest of the thesis. The results of the work on the Scottish population could not be replicated due to lack of data. However, where possible, the results were compared with reports from available resources such as the UK Biobank, IEU Open GWAS and T2D GWAS portals, and with published studies.
Date of Award | 2023 |
---|---|
Original language | English |
Awarding Institution | |
Sponsors | National Institute for Health and Care Research |
Supervisor | Colin Palmer (Supervisor) |