Abstract
With machine-learning-based medical research increasing year on year, the demand for large-scale DICOM datasets has never been higher. Because the DICOM standard is complex and flexible in structure, anonymising DICOM metadata is a cumbersome task. The Research Data Management Platform (RDMP) is a free and open-source software application centred on extracting and anonymising data from primary care systems, such as PACS, into reproducible, auditable, and pseudonymous cohorts for research purposes.
Originally designed for tabular data, RDMP has been extended to support the extraction and anonymisation of DICOM tags. Built upon the principles of Extract, Transform & Load (ETL), RDMP lets users construct pipelines that extract data from clinical systems into a central data warehouse. From this central data warehouse, users can explore potential cohorts and extract anonymised data subsets into their research environment. These cohorts and extracted data subsets are repeatable, versionable and reusable through RDMP’s modular ETL pipeline system.
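To make the idea of a repeatable, pseudonymous extract concrete, here is a minimal sketch using only Python's standard library. It is not RDMP's mechanism, which the abstract does not describe. The keyed hash (HMAC-SHA256) is a technique swapped in for illustration, and the names `pseudonymise`, `extract_cohort`, `release_id`, and the sample rows are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, project_key: bytes) -> str:
    """Derive a stable, project-specific pseudonym from an identifier."""
    digest = hmac.new(project_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def extract_cohort(rows, cohort_ids, project_key, keep_columns):
    """Return only cohort rows, with the identifier replaced by a pseudonym."""
    extract = []
    for row in rows:
        if row["patient_id"] not in cohort_ids:
            continue
        out = {"release_id": pseudonymise(row["patient_id"], project_key)}
        # Copy across only the columns approved for release.
        out.update({column: row[column] for column in keep_columns})
        extract.append(out)
    return extract


# Stand-in warehouse rows (assumed data, not real records).
rows = [
    {"patient_id": "P001", "modality": "CT", "study_date": "20190312"},
    {"patient_id": "P002", "modality": "MR", "study_date": "20200105"},
    {"patient_id": "P003", "modality": "CT", "study_date": "20210720"},
]
key = b"project-42-secret"  # hypothetical per-project secret
extract = extract_cohort(rows, {"P001", "P003"}, key, ["modality", "study_date"])
```

Because the pseudonym depends only on the identifier and the project key, re-running the same extraction yields the same output, while a different project key yields unlinkable identifiers.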
The DICOM processing extension for RDMP supports the routing of specified DICOM tags from source images into a tabular format within the central data warehouse. This allows cohorts to be explored and created using any combination of tags from the DICOM standard, alongside traditional cohort-building techniques.
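As a rough illustration of what routing tags into a table can look like, the sketch below uses only Python's standard library. It is not RDMP's implementation: the `route_tags` helper, the `ROUTED_TAGS` selection, the `image_metadata` table, and the in-memory dictionaries standing in for parsed DICOM headers are all assumptions made for this example.

```python
import sqlite3

# Hypothetical selection of DICOM tags to promote into table columns.
ROUTED_TAGS = ["PatientID", "StudyDate", "Modality", "BodyPartExamined"]


def route_tags(conn, headers):
    """Copy the selected tags from each header (a dict) into one table row."""
    columns = ", ".join(f"{tag} TEXT" for tag in ROUTED_TAGS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS image_metadata ({columns})")
    placeholders = ", ".join("?" for _ in ROUTED_TAGS)
    # Tags that were not selected for routing are simply left out.
    rows = [tuple(h.get(tag) for tag in ROUTED_TAGS) for h in headers]
    conn.executemany(f"INSERT INTO image_metadata VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


# Stand-in records; in practice these would be read from DICOM files.
sample_headers = [
    {"PatientID": "P001", "StudyDate": "20190312", "Modality": "CT",
     "BodyPartExamined": "CHEST"},
    {"PatientID": "P002", "StudyDate": "20200105", "Modality": "MR",
     "BodyPartExamined": "HEAD"},
    {"PatientID": "P003", "StudyDate": "20210720", "Modality": "CT",
     "BodyPartExamined": "HEAD", "StationName": "SCANNER1"},
]

conn = sqlite3.connect(":memory:")
inserted = route_tags(conn, sample_headers)

# Cohort exploration: patients with a CT study on or after 1 Jan 2020.
cohort = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT PatientID FROM image_metadata "
        "WHERE Modality = ? AND StudyDate >= ? ORDER BY PatientID",
        ("CT", "20200101"),
    )
]
print(cohort)
```

Once the tags sit in ordinary columns, cohort criteria on imaging metadata can be expressed as plain SQL and combined with any other tabular data held in the same warehouse.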
RDMP has been used to facilitate hundreds of research projects within the safe-haven network across several data modalities since 2013. RDMP’s DICOM processing functionality has been shown to process thousands of DICOM files a minute, reducing the turnaround time for DICOM-based research data subsets by several orders of magnitude.
| Original language | English |
|---|---|
| Publication status | Published - 12 Jun 2024 |
| Event | SINAPSE 2024 ASM - University of Stirling, Stirling, United Kingdom. Duration: 12 Jun 2024 → 12 Jun 2024. https://www.sinapse.ac.uk/events/2024-sinapse-asm/ |
Conference
| Conference | SINAPSE 2024 ASM |
|---|---|
| Country/Territory | United Kingdom |
| City | Stirling |
| Period | 12/06/24 → 12/06/24 |
Fingerprint
Dive into the research topics of 'High-Speed Preparation of DICOM Metadata for Research Purposes'. Together they form a unique fingerprint.