TY - JOUR
T1 - A pipeline for harmonising NHS Scotland laboratory data to enable national-level analyses
AU - Gao, Chuang
AU - Mumtaz, Shahzad
AU - McCall, Sophie
AU - O'Sullivan, Katherine
AU - McGilchrist, Mark
AU - Morales, Daniel R
AU - Hall, Christopher
AU - Wilde, Katie
AU - Mayor, Charlie
AU - Linksted, Pamela
AU - Harrison, Kathy
AU - Cole, Christian
AU - Jefferson, Emily
N1 - Copyright © 2024. Published by Elsevier Inc.
PY - 2025/2
Y1 - 2025/2
N2 - OBJECTIVE: Medical laboratory data together with prescribing and hospitalisation records are three of the most used electronic health records (EHRs) for data-driven health research. In Scotland, hospitalisation, prescribing and the death register data are available nationally whereas laboratory data is captured, stored and reported from local health board systems with significant heterogeneity. For researchers or other users of this regionally curated data, working on laboratory datasets across regional cohorts requires effort and time. As part of this study, the Scottish Safe Haven Network have developed an open-source software pipeline to generate a harmonised laboratory dataset. METHODS: We obtained sample laboratory data from the four regional Safe Havens in Scotland covering people within the SHARE consented cohort. We compared the variables collected by each regional Safe Haven and mapped these to 11 FHIR and 2 Scottish-specific standardised terms (i.e., one to indicate the regional health board and a second to describe the source clinical code description). RESULTS: We compared the laboratory data and found that 180 test codes covered 98.7% of test records performed across Scotland. Focusing on the 180 test codes, we developed a set of transformations to convert test results captured in different units to the same unit. We included both Read Codes and SNOMED CT to encode the tests within the pipeline. CONCLUSION: We validated our harmonisation pipeline by comparing the results across the different regional datasets. The pipeline can be reused by researchers and/or Safe Havens to generate clean, harmonised laboratory data at a national level with minimal effort.
AB - OBJECTIVE: Medical laboratory data together with prescribing and hospitalisation records are three of the most used electronic health records (EHRs) for data-driven health research. In Scotland, hospitalisation, prescribing and the death register data are available nationally whereas laboratory data is captured, stored and reported from local health board systems with significant heterogeneity. For researchers or other users of this regionally curated data, working on laboratory datasets across regional cohorts requires effort and time. As part of this study, the Scottish Safe Haven Network have developed an open-source software pipeline to generate a harmonised laboratory dataset. METHODS: We obtained sample laboratory data from the four regional Safe Havens in Scotland covering people within the SHARE consented cohort. We compared the variables collected by each regional Safe Haven and mapped these to 11 FHIR and 2 Scottish-specific standardised terms (i.e., one to indicate the regional health board and a second to describe the source clinical code description). RESULTS: We compared the laboratory data and found that 180 test codes covered 98.7% of test records performed across Scotland. Focusing on the 180 test codes, we developed a set of transformations to convert test results captured in different units to the same unit. We included both Read Codes and SNOMED CT to encode the tests within the pipeline. CONCLUSION: We validated our harmonisation pipeline by comparing the results across the different regional datasets. The pipeline can be reused by researchers and/or Safe Havens to generate clean, harmonised laboratory data at a national level with minimal effort.
KW - Harmonisation
KW - Medical laboratory data
KW - Open-source software pipeline
UR - http://www.scopus.com/inward/record.url?scp=85215410858&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2024.104771
DO - 10.1016/j.jbi.2024.104771
M3 - Article
C2 - 39755323
SN - 1532-0464
VL - 162
SP - 104771
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
M1 - 104771
ER -