Topological data analysis reveals genotype-phenotype relationships in primary ciliary dyskinesia

Amelia Shoemark, Bruna Rubbo, Marie Legendre, Mahmood R. Fassad, Eric G. Haarman, Sunayna Best, Irma C. M. Bon, Joost Brandsma, Pierre-Regis Burgel, Gunnar Carlsson, Siobhan B. Carr, Mary Carroll, Matt Edwards, Estelle Escudier, Isabelle Honoré, David Hunt, Gregory Jouvion, Michel R. Loebinger, Bernard Maitre, Deborah Morris-Rosendahl, Jean-Francois Papon, Camille M. Parsons, Mitali P. Patel, Simon N. Thomas, Guillaume Thouvenin, Woolf T. Walker, Robert Wilson, Claire Hogg, Hannah M. Mitchison, Jane S. Lucas

Research output: Contribution to journal › Article › peer-review


Background: Primary ciliary dyskinesia (PCD) is a heterogeneous inherited disorder caused by mutations in approximately 50 cilia-related genes. PCD genotype-phenotype relationships have mostly arisen from small case series because existing statistical approaches to investigate relationships have been unsuitable for rare diseases.

Methods: We applied a topological data analysis (TDA) approach to investigate genotype-phenotype relationships in PCD. Data from separate training and validation cohorts included 396 genetically defined individuals carrying pathogenic variants in PCD genes. To develop the TDA models, twelve clinical and diagnostic variables were included. TDA-driven hypotheses were subsequently tested using traditional statistics.

Results: Disease severity at diagnosis, measured by FEV1 z-score, was (i) significantly worse in individuals with CCDC39 mutations than in those with mutations in other genes and (ii) better in those with DNAH11 mutations; the latter also reported less neonatal respiratory distress. Patients without neonatal respiratory distress had better preserved FEV1 at diagnosis. Individuals with DNAH5 mutations were phenotypically diverse. Cilia ultrastructure and beat pattern defects correlated closely with specific causative gene groups, confirming that these tests can be used to support a genetic diagnosis.

Conclusions: This large-scale multinational study presents PCD as a syndrome with overlapping symptoms and variation in phenotype according to genotype. TDA modelling confirmed genotype-phenotype relationships reported by smaller studies (e.g. worse FEV1 with CCDC39 mutations) and identified new relationships, including FEV1 preservation with DNAH11 mutations and diversity of severity with DNAH5 mutations.

Original language: English
Journal: European Respiratory Journal
Early online date: 21 Jan 2021
Publication status: E-pub ahead of print - 21 Jan 2021
