An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population

Thomas Nind, James Sutherland, Gordon McAllister, Douglas Hardy, Ally Hume, Ruairidh MacLeod, Jacqueline Caldwell, Susan Krueger, Leandro Tramma, Ross Teviotdale, Mohamed Abdelatif, Kenny Gillen, Joe Ward, Donald Scobbie, Ian Baillie, Andrew Brooks, Bianca Prodan, William Kerr, Dominic Sloan-Murphy, Juan Rodriguez Herrera, Dan McManus, Carole Morris, Carol Sinclair, Rob Baxter, Mark Parsons, Andrew Morris, Emily Jefferson

Research output: Contribution to journal › Article

Abstract

Aim: To enable a world-leading research dataset of routinely collected clinical images linked to other routinely collected data from the whole Scottish national population. This includes 30 million different radiological examinations from a population of 5.4 million and over 2 petabytes of data collected since 2010.

Methods: Scotland has a central archive of radiological data used to directly provide clinical care to patients. We have developed an architecture and platform to securely extract a copy of that data, link it to other clinical or social data sets, remove personal data to protect privacy, and make the resulting data available to researchers in a controlled Safe Haven environment.

Results: An extensive software platform has been developed to host, extract and link data from cohorts to answer research questions. The platform has been tested on 5 different test cases and is currently being further enhanced to support 3 exemplar research projects.

Conclusions: The available data covers a range of radiological modalities and scanner types and was collected under different environmental conditions. This “real-world”, heterogeneous data is highly valuable for training algorithms to support clinical decision making, especially for deep learning where large data volumes are required. The resource is now available for international research access. The platform and data can support new health research using Artificial Intelligence and Machine Learning technologies as well as enabling discovery science.
Original language: English
Article number: giaa095
Number of pages: 13
Journal: Giga Science
Volume: 9
Issue number: 10
Early online date: 29 Sep 2020
DOIs: https://doi.org/10.1093/gigascience/giaa095
Publication status: Published - Oct 2020

Keywords

  • Radiology
  • Big Data
  • AI
  • ML

Cite this

Nind, T., Sutherland, J., McAllister, G., Hardy, D., Hume, A., MacLeod, R., Caldwell, J., Krueger, S., Tramma, L., Teviotdale, R., Abdelatif, M., Gillen, K., Ward, J., Scobbie, D., Baillie, I., Brooks, A., Prodan, B., Kerr, W., Sloan-Murphy, D., ... Jefferson, E. (2020). An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population. Giga Science, 9(10), [giaa095]. https://doi.org/10.1093/gigascience/giaa095