Abstract
Background
Early identification of individuals at risk of dementia is essential for preventive care and timely enrolment into disease-modifying interventions. However, most existing prediction approaches rely on invasive, costly, or research-only biomarkers that are not scalable within public healthcare systems. Routinely acquired National Health Service (NHS) brain magnetic resonance imaging (MRI) scans, when linked with electronic health records, represent a widely available and privacy-preserving resource for population-level dementia risk stratification. A key challenge for clinical translation is ensuring that machine-learning predictions are reliable, interpretable, and safe to apply, particularly when models are used years before clinical diagnosis.
Methods
We conducted a retrospective case-control study entirely within a secure NHS Trusted Research Environment using routine T1-weighted brain MRI scans linked to electronic health records from Tayside and Fife, Scotland. The study included 518 participants: 259 individuals who subsequently developed dementia and 259 age- and sex-matched controls. Structural brain features were derived from MRI data and analysed using a support-vector-machine classifier with nested cross-validation to minimise overfitting. Prediction confidence was quantified using distance-from-hyperplane (DFH) calibration, enabling stratification of model outputs by certainty. Primary outcomes were classification accuracy and area under the receiver-operating-characteristic curve (AUC). Secondary analyses examined DFH-stratified performance and the relationship between prediction accuracy and time from scan to first recorded dementia diagnosis.
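The classification step described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn, a linear kernel, a small hypothetical grid for `C`, and synthetic features in place of the MRI-derived ones; none of these choices are taken from the study, and the signed decision score is used here only as a simple stand-in for the DFH measure.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT study data): 518 rows to mirror the cohort
# size, with 20 hypothetical features replacing the structural brain features.
X, y = make_classification(
    n_samples=518, n_features=20, n_informative=5, random_state=0
)

# Inner loop: tunes C using only the training portion of each outer fold.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
# Outer loop: produces predictions for scans the tuned model never saw.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Scaling lives inside the pipeline so it is re-fitted per training fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear"))
search = GridSearchCV(
    pipeline,
    param_grid={"svc__C": [0.01, 0.1, 1.0]},  # hypothetical grid
    cv=inner_cv,
    scoring="roc_auc",
)

# Out-of-fold signed decision scores: the sign gives the predicted class,
# the magnitude grows with distance from the separating hyperplane.
scores = cross_val_predict(
    search, X, y, cv=outer_cv, method="decision_function"
)
predictions = (scores > 0).astype(int)

auc = roc_auc_score(y, scores)
accuracy = accuracy_score(y, predictions)
print(f"nested-CV AUC = {auc:.2f}, accuracy = {accuracy:.2f}")
```

Keeping hyperparameter selection inside the inner loop means the outer-fold scores are never used for tuning, which is the property nested cross-validation is meant to provide. Note that `decision_function` returns an unnormalised score, so magnitudes from different outer folds are only roughly comparable.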
Results
The model predicted future dementia up to five years before the first recorded NHS diagnosis with an AUC of 0.71, consistent with performance on real-world clinical imaging rather than research-optimised datasets. Model sensitivity increased for scans acquired closer to diagnosis, indicating a stronger predictive signal as disease onset approached. Confidence-based stratification identified a high-confidence subgroup comprising approximately 35% of scans, within which prediction accuracy increased to around 80%. Performance was consistent across heterogeneous routine NHS scanners and imaging protocols, supporting robustness and generalisability to routine clinical acquisitions.
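The idea behind confidence-based stratification can be illustrated with a small sketch. It assumes that confidence is ranked by the absolute value of a signed classifier score and that the high-confidence subgroup is the top fraction of scans by that ranking; the function name, the `coverage` parameter, and the toy numbers are hypothetical, and the study's actual DFH calibration procedure may differ.

```python
def stratify_by_confidence(scores, labels, coverage):
    """Compare overall accuracy with accuracy in the most confident subgroup.

    scores   -- signed classifier outputs (positive means predicted case)
    labels   -- true classes, 1 for case and 0 for control
    coverage -- fraction of scans to keep as the high-confidence subgroup
    Returns (overall_accuracy, subgroup_accuracy, subgroup_size).
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    n = len(scores)
    k = max(1, round(coverage * n))

    def is_correct(i):
        return (scores[i] > 0) == bool(labels[i])

    # Rank scans from most to least confident by absolute score.
    ranked = sorted(range(n), key=lambda i: abs(scores[i]), reverse=True)
    confident = ranked[:k]

    overall = sum(is_correct(i) for i in range(n)) / n
    subgroup = sum(is_correct(i) for i in confident) / k
    return overall, subgroup, k


# Toy scores and labels, invented purely for illustration.
toy_scores = [2.1, -1.8, 0.2, -0.1, 1.5, -0.3, 0.4, -2.4, 0.1, -0.2]
toy_labels = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

overall, subgroup, size = stratify_by_confidence(toy_scores, toy_labels, 0.4)
print(overall, subgroup, size)  # 0.6 1.0 4
```

In this toy case the errors all fall among the low-magnitude scores, so restricting to the most confident 40% raises accuracy. The trade-off is coverage: the remaining scans receive no high-certainty prediction.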
Conclusion
Routinely collected NHS brain MRI data can be used to predict future dementia several years before clinical diagnosis. Incorporating confidence calibration transforms a conventional machine-learning classifier into a safety-aware and clinically interpretable framework by enabling selective use of high-certainty predictions. This approach supports scalable early detection, population-level risk stratification, and targeted recruitment into preventive or disease-modifying clinical trials, with clear potential for integration into public health systems.
| Original language | English |
|---|---|
| Journal | Alzheimer's Research and Therapy |
| Publication status | Accepted/In press - 8 May 2026 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Dementia risk prediction
- Magnetic resonance imaging (MRI)
- Machine learning
- Early detection
- Population health
Fingerprint
Dive into the research topics of 'Predicting future dementia from routine clinical MRI and linked healthcare data'. Together they form a unique fingerprint.
Projects
- 1 Finished
- MICA: InterdisciPlInary Collaboration for EfficienT and Effective Use of Clinical Images in Big Data Health Care RESearch: PICTURES (Programme Grant) (Joint with Universities of Edinburgh and Abertay)
  Doney, A. (Investigator), Jefferson, E. (Investigator), Palmer, C. (Investigator), Steele, D. (Investigator), Trucco, M. (Investigator) & Wang, H. (Investigator)
  1/08/19 → 28/02/25
  Project: Research
Research output
- 1 Software
- Dementia Cohort Identification and SVM Classifier Pipeline for article ‘Machine learning-based prediction of future dementia using routine clinical MRI brain scans and healthcare data’
  Reel, P. S. (Creator), Al-Wasity, S. (Creator), Edwards, C. (Creator), Reel, S. (Creator), Mansouri-Benssassi, E. (Creator), Suveges, S. (Creator), Mookiah, M. R. K. (Creator), Krueger, S. (Creator), Trucco, M. (Creator), Jefferson, E. (Creator), Doney, A. (Creator) & Steele, D. (Creator), 29 Oct 2025
  Research output: Non-textual form › Software
Open Access