Lay summary
Trusted Research Environments (TREs) are secure computing environments providing access to sensitive or personal data, such as electronic healthcare records (EHRs), for approved research purposes. With the increasing application of Artificial Intelligence (AI) to patient data, new challenges arise in protecting individuals' data from disclosure risks. Here, we present a comprehensive guide to the types of disclosure and privacy risks that different AI methods pose. In many cases, risks can be minimised or mitigated by following specific strategies. Preventing the release of personal and sensitive data must involve everyone throughout the project lifecycle: the data controller, the model development team, the model owner, and the product users.
Abstract
There is substantial demand to use real-world data to inform improvements across the whole patient care cycle. EHRs and other types of personal data are crucial to these developments, and because of the size and complexity of the data, AI techniques are increasingly being investigated. An early review, together with our experience of working with TREs during the GRAIMatter and SACRO projects and within the SDC Reboot Community Interest Group, shows an urgent need among TREs to better understand where the AI disclosure risks lie and what the corresponding mitigation strategies are.
We present a taxonomy of AI risks and associated mitigations to help TREs and project leads meet their responsibilities for ensuring data privacy. The main objective is to provide an efficient, comprehensive, agnostic, and scalable guide for assessing the privacy risk of AI projects that use sensitive data in TREs and for applying mitigations. This TRE-relevant approach groups AI models according to the type of output they produce, rather than the algorithm used to train the model. It is a counterpart to the Statbarn taxonomy for ‘traditional outputs’, following an equivalent process.
Thus, when faced with a project proposal with a release request for a new type of AI model, TRE staff need only agree with the project team which of a small number of groups the model in question falls into. The rest of the disclosure control process of risk assessment and mitigation flows from there. For example, instance-based models store training data directly. There are two possible mitigations for this risk: 1) anonymise the dataset used to train the model, and remove vectors where possible; 2) release such risky models only via a Model Query Control (MQC) system with controlled access for trusted users, mitigating the likelihood of uncontrolled data leakage. Other groups of models, however, have associated risks that can only be estimated by simulating attacks. SACRO-ML and similar tools can help to demonstrate that no individuals’ data can be identified either fully or partially.
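As a concrete illustration of why instance-based models are risky, the minimal sketch below (using scikit-learn, which is not named in the abstract and is chosen here purely for illustration, with an arbitrary built-in dataset) shows that a fitted support vector machine retains verbatim training records as its support vectors:

```python
# Minimal sketch: a fitted SVM is an instance-based model that stores
# verbatim training records, so releasing the model object releases
# those records. Dataset and model choice are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
model = SVC(kernel="rbf").fit(X, y)

# support_vectors_ holds exact rows copied from the training data:
# anyone given the serialised model can read these records back out.
print(model.support_vectors_.shape)                          # (n_support, 30)
print((model.support_vectors_[0] == X).all(axis=1).any())    # True: verbatim row
```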
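For models whose risk must be estimated empirically, the abstract points to SACRO-ML for simulating attacks. Rather than reproduce its API here, the sketch below shows the underlying idea that such tools automate: a simple confidence-threshold membership inference test, with the dataset, model, and all variable names chosen for illustration only:

```python
# Minimal sketch of the kind of membership-inference test that tools
# such as SACRO-ML automate: if a model's confidence separates training
# records from unseen records, an attacker can infer who was in the data.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# Deliberately overfitted target model (the worst case for privacy).
target = RandomForestClassifier(n_estimators=100, min_samples_leaf=1).fit(X_in, y_in)

# Attack signal: the model's confidence in the true label for each record.
conf_in = target.predict_proba(X_in)[np.arange(len(y_in)), y_in]
conf_out = target.predict_proba(X_out)[np.arange(len(y_out)), y_out]

# AUC of the question "was this record in the training set?" — 0.5 means
# the model leaks nothing; values well above 0.5 indicate disclosure risk.
membership = np.concatenate([np.ones(len(conf_in)), np.zeros(len(conf_out))])
print(roc_auc_score(membership, np.concatenate([conf_in, conf_out])))
```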
| Original language | English |
|---|---|
| Publication status | Published - 15 Oct 2025 |
| Event | HDR UK Conference 2025, Scottish Exhibition Centre, Glasgow, United Kingdom |
| Duration | 15 Oct 2025 → 16 Oct 2025 |
| Internet address | https://www.hdruk.ac.uk/about/hdr-uk-conference/ |
Conference
| Conference | HDR UK Conference 2025 |
|---|---|
| Country/Territory | United Kingdom |
| City | Glasgow |
| Period | 15/10/25 → 16/10/25 |
| Internet address | https://www.hdruk.ac.uk/about/hdr-uk-conference/ |
Keywords
- TRE
- RELEASE-AI
- AI
- artificial intelligence
- Machine learning
- Sensitive data
- framework
ASJC Scopus subject areas
- Computer Science (all)
- Health Informatics
Projects
- RELEASE-AI: Protecting Sensitive Data Across The AI Lifecycle Disclosure Risks and Mitigations in Trusted Research Environments (Active)
  Crespi Boixader, A., Li, S., Liley, J., Ward, L., Cole, C. & Smith, J., 23 Sept 2025. Research output: Contribution to conference › Poster. Open Access.
- SACRO: Semi-Automated Checking of Research Outputs
  Cole, C., Smith, J., Albashir, M., Bacon, S., Butler Cole, B., Caldwell, J., Crespi Boixader, A., Green, E., Jefferson, E., Jones, Y., Krueger, S., Liley, J., McNeill, A., O'Sullivan, K., Oldfield, K., Preen, R., Robinson, L., Rogers, S., Stokes, P., Tilbrok, A. & 1 others, 6 Nov 2023. Research output: Book/Report › Other report.
- GRAIMatter: Guidelines and Resources for AI Model Access from TrusTEd Research environments
  Jefferson, E., Cole, C., Crespi Boixader, A., Rogers, S., Roche, M., Ritchie, F., Smith, J., Tava, F., Daly, A., Beggs, J. & Chuter, A., 25 Aug 2022, Conference Proceedings for International Population Data Linkage Conference 2022. 3 ed. Vol. 7. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution.
  Open Access.