Federating governance, access and infrastructure to support researcher use of synthetic data

Katherine O’Sullivan, Jackie Caldwell, Christian Cole, Kathy Harrison, Charlie Mayor, Antonietta Chaliou, Jaroslaw Dymiter, Stacey Dawson, Diane Brown, Katie Wilde

Research output: Contribution to journal › Conference article › peer-review

Abstract

Synthetic data has the potential to improve the efficiency of data analysis for researchers. However, there is no standard approach to synthetic data governance, access controls or infrastructure requirements, and researchers may face inconsistencies in how they can access or use synthetic data across trusted research environments (TREs). We present a federated solution adopted by the Scottish Safe Haven Network to address these barriers and facilitate researcher use of synthetic data.

We documented and evaluated existing governance pathways, access controls and infrastructure design for non-synthetic data across the Network, recognising uniformity and establishing equivalence using the Five Safes framework, ISO 27001 standards and the SATRE TRE specification. We also interviewed current and potential users of our trusted research environments to identify common use cases for accessing synthetic data. We then mapped researcher requirements against the documented equivalencies, validating the results with current and prospective users.

We identified several use cases: undertaking feasibility studies, understanding dataset structure and format, and writing analysis code whilst waiting for the project-specific data to be provided. By mapping the use cases onto existing governance and access processes and infrastructure designs, we were able to agree on a standard application process, access control mechanism and infrastructure platform across the Network, providing a consistent process for researchers.

A federated approach to synthetic data access will improve the speed at which research can be conducted as well as improving the transparency and consistency of data governance and access across organisations, ultimately improving the experience for researchers using TREs.
Original language: English
Article number: 119
Number of pages: 2
Journal: International Journal of Population Data Science
Volume: 9
Issue number: 5
DOIs
Publication status: Published - 10 Sept 2024
Event: International Population Data Linkage Conference - Chicago Fairmont, Chicago, United States
Duration: 15 Sept 2024 to 18 Sept 2024
https://ipdln.org/2024-conference/

ASJC Scopus subject areas

  • Demography
  • Information Systems
  • Health Informatics
  • Information Systems and Management
