Quantifying the extent to which index event biases influence large genetic association studies

Hanieh Yaghootkar, Michael P. Bancks, Sam E. Jones, Aaron McDaid, Robin Beaumont, Louise Donnelly, Andrew R. Wood, Archie Campbell, Jessica Tyrrell, Lynne J. Hocking, Marcus A. Tuke, Katherine S. Ruth, Ewan R. Pearson, Anna Murray, Rachel M. Freathy, Patricia B. Munroe, Caroline Hayward, Colin Palmer, Michael N. Weedon, James S. Pankow, Timothy M. Frayling (Lead / Corresponding author), Zoltán Kutalik (Lead / Corresponding author)

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)
142 Downloads (Pure)


As genetic association studies grow to hundreds of thousands of individuals, subtle biases may influence conclusions. One such bias is "index event bias" (IEB), which arises from stratification by, or enrichment for, disease status when testing associations between genetic variants and a disease-associated trait. We aimed to test the extent to which IEB influences some known trait associations in a range of study designs and to provide a statistical framework for assessing future associations. Analysing data from 113,203 non-diabetic UK Biobank participants, we observed three overestimated (BMI-decreasing) associations (near TCF7L2, CDKN2AB and CDKAL1) and one underestimated (BMI-increasing) association (near MTNR1B) among 11 type 2 diabetes risk alleles (at P < 0.05). IEB became even stronger when we tested a type 2 diabetes genetic risk score composed of these 11 variants (-0.010 SD of BMI per allele, P = 5 × 10⁻⁴), a result that was confirmed in four additional independent studies. Similar results emerged when examining the effect of blood pressure-increasing alleles on BMI in normotensive UK Biobank samples. Furthermore, we demonstrated that, under realistic scenarios, common disease alleles would become associated at P < 5 × 10⁻⁸ with disease-related traits through IEB alone if disease prevalence in the sample differs appreciably from the background population prevalence. For example, some hypertension and type 2 diabetes alleles will be associated with BMI in sample sizes of > 500,000 if the prevalence of those diseases differs by > 10% from the background population. In conclusion, IEB may result in false positive or false negative genetic associations in very large studies stratified or strongly enriched for/against disease cases.
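The mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' statistical framework; it is a minimal liability-threshold toy model in which every parameter (allele frequency, effect sizes, threshold, sample size) and every function name is an assumption chosen for illustration. The allele has no true effect on the trait, yet restricting the sample to disease-free individuals should induce a negative allele-trait slope, which is the direction of bias reported for type 2 diabetes risk alleles and BMI.

```python
import random


def simulate_ieb(n=200_000, maf=0.3, beta_g=0.3, beta_z=0.5,
                 threshold=1.1, seed=1):
    """Simulate (genotype, trait, diseased) under a toy liability model.

    All parameter values are illustrative assumptions, not estimates
    from the paper. The trait is generated independently of genotype.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Risk-allele count (0, 1 or 2) from two independent draws.
        g = (rng.random() < maf) + (rng.random() < maf)
        # Standardised trait (e.g. BMI), independent of genotype.
        z = rng.gauss(0.0, 1.0)
        # Disease liability depends on both genotype and trait.
        liability = beta_g * g + beta_z * z + rng.gauss(0.0, 1.0)
        rows.append((g, z, liability > threshold))
    return rows


def slope(pairs):
    """Least-squares slope of trait on allele count."""
    n = len(pairs)
    mean_g = sum(g for g, _ in pairs) / n
    mean_z = sum(z for _, z in pairs) / n
    cov = sum((g - mean_g) * (z - mean_z) for g, z in pairs)
    var = sum((g - mean_g) ** 2 for g, _ in pairs)
    return cov / var


rows = simulate_ieb()
prevalence = sum(d for _, _, d in rows) / len(rows)
slope_all = slope([(g, z) for g, z, _ in rows])
slope_controls = slope([(g, z) for g, z, d in rows if not d])

print(f"disease prevalence:         {prevalence:.3f}")
print(f"slope, full sample:         {slope_all:+.4f}")
print(f"slope, disease-free subset: {slope_controls:+.4f}")
```

In the full sample the slope stays close to zero because the trait was simulated independently of genotype. In the disease-free subset, carriers of the risk allele must on average have a lower trait value to remain below the liability threshold, so the slope turns negative purely through selection.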

Original language: English
Pages (from-to): 1018-1030
Number of pages: 13
Journal: Human Molecular Genetics
Issue number: 5
Early online date: 30 Dec 2016
Publication status: Published - 1 Jun 2017


  • alleles
  • hypertension
  • body mass index procedure
  • diabetes mellitus, type 2
  • blood pressure
  • genetics
  • stratification
  • genetic risk
  • false-positive results
  • tcf7l2 gene
  • biobanks

