A pipeline for harmonising NHS Scotland laboratory data to enable national-level analyses

Chuang Gao, Shahzad Mumtaz, Sophie McCall, Katherine O'Sullivan, Mark McGilchrist, Daniel R Morales, Christopher Hall, Katie Wilde, Charlie Mayor, Pamela Linksted, Kathy Harrison, Christian Cole (Lead / Corresponding author), Emily Jefferson (Lead / Corresponding author)

Research output: Contribution to journal › Article › peer-review

Abstract

OBJECTIVE: Medical laboratory data together with prescribing and hospitalisation records are three of the most widely used electronic health records (EHRs) for data-driven health research. In Scotland, hospitalisation, prescribing and the death register data are available nationally, whereas laboratory data are captured, stored and reported from local health board systems with significant heterogeneity. For researchers or other users of this regionally curated data, working on laboratory datasets across regional cohorts requires effort and time. As part of this study, the Scottish Safe Haven Network have developed an open-source software pipeline to generate a harmonised laboratory dataset.

METHODS: We obtained sample laboratory data from the four regional Safe Havens in Scotland covering people within the SHARE consented cohort. We compared the variables collected by each regional Safe Haven and mapped these to 11 FHIR and 2 Scottish-specific standardised terms (i.e., one to indicate the regional health board and a second to describe the source clinical code description).

RESULTS: We compared the laboratory data and found that 180 test codes covered 98.7% of test records performed across Scotland. Focusing on the 180 test codes, we developed a set of transformations to convert test results captured in different units to the same unit. We included both Read Codes and SNOMED CT to encode the tests within the pipeline.
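As an illustration only, the two ideas described above (measuring how much of the data a set of test codes covers, and converting results reported in different units to a single unit) might be sketched as follows. This is not the published pipeline's code: the test codes (`HGB`, `CREA`, `XYZ`), the field names, the health board labels and the choice of target units are all hypothetical. The factors used are the standard ones for haemoglobin (g/dL to g/L, multiply by 10) and creatinine (mg/dL to µmol/L, multiply by approximately 88.4).

```python
from collections import Counter

# Hypothetical conversion table: (test code, source unit) -> (target unit, multiplier).
# Codes and target units are illustrative assumptions, not the pipeline's mappings.
CONVERSIONS = {
    ("HGB", "g/dL"): ("g/L", 10.0),
    ("HGB", "g/L"): ("g/L", 1.0),
    ("CREA", "mg/dL"): ("umol/L", 88.4),
    ("CREA", "umol/L"): ("umol/L", 1.0),
}


def harmonise(record):
    """Return a copy of the record in the target unit, or None if it is unmapped."""
    key = (record["test_code"], record["unit"])
    if key not in CONVERSIONS:
        return None
    target_unit, factor = CONVERSIONS[key]
    return {**record, "value": record["value"] * factor, "unit": target_unit}


def coverage(records, top_n):
    """Fraction of records accounted for by the top_n most frequent test codes."""
    if not records:
        return 0.0
    counts = Counter(r["test_code"] for r in records)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(records)


# Made-up sample rows standing in for records from different health boards.
records = [
    {"health_board": "A", "test_code": "HGB", "value": 13.5, "unit": "g/dL"},
    {"health_board": "B", "test_code": "HGB", "value": 135.0, "unit": "g/L"},
    {"health_board": "A", "test_code": "CREA", "value": 1.0, "unit": "mg/dL"},
    {"health_board": "C", "test_code": "XYZ", "value": 2.0, "unit": "U"},
]

# Keep only records whose (code, unit) pair has a known conversion.
harmonised = [h for h in map(harmonise, records) if h is not None]
```

In this sketch, unmapped code and unit pairs are dropped rather than guessed at. A real pipeline would more plausibly log them for review.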

CONCLUSION: We validated our harmonisation pipeline by comparing the results across the different regional datasets. The pipeline can be reused by researchers and/or Safe Havens to generate clean, harmonised laboratory data at a national level with minimal effort.

Original language: English
Article number: 104771
Pages (from-to): 104771
Journal: Journal of Biomedical Informatics
Volume: 162
Early online date: 2 Jan 2025
Publication status: Published - Feb 2025

Keywords

  • Harmonisation
  • Medical laboratory data
  • Open-source software pipeline

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
