Description
Defining a disease (phenotyping) is a critical step in extracting a cohort of patients for epidemiological studies. Various methods are applied to define a cohort of patients from routinely collected electronic healthcare records (EHRs), including clinical code lists (for clinically coded data), rule-based algorithms, machine learning methods, and natural language processing techniques (which use free-text data). These definitions are often shared as supplementary material to publications or in GitHub repositories. This is an inefficient system, and separate phenotype resources have recently been developed worldwide to host clinical code lists and rule-based algorithms. Still, users must download definitions to reuse them in their analyses and modify them to accommodate variations in representation when scaling analyses across conditions and healthcare settings. There was a need for standardised definitions and representations to support reproducibility across conditions and healthcare settings. By examining existing resources and user requirements, we devised desiderata for developing next-generation phenotyping resources. The result is the Health Data Research (HDR) UK Phenotype Library, created by following the desiderata to make phenotype definitions FAIR (Findable, Accessible, Interoperable and Reusable). In parallel, an open-source application programming interface (API) and a client R package were developed to enable programmatic use of the definitions. Using standardised definitions from the Phenotype Library led to the development of an informatics tool enabling epidemiological studies for more than 100 disease conditions.

Period | 5 Sept 2023 |
---|---|
Event title | Research Software Engineering Conference 2023 |
Event type | Conference |
Conference number | 7 |
Location | Swansea, United Kingdom |
Related content
- Research Facilities
- Health Informatics Centre (Facility/equipment: Facility)