Pre-conference workshop on Monday 8 June.

Description

DARE UK (Data and Analytics Research Environments UK) has recently awarded three project catalysts to explore next-generation capabilities for the use of free-text data within Trusted Research Environments (TREs).

These projects aim to develop early-stage prototypes that test new ideas to support secure data research and analytics, focusing on healthcare. DARE UK also funds a community working group called SAFETEXT, which will develop guidelines and protocols for the safe and responsible use of de-identified and synthetic healthcare free-text data in AI development.

In this full-day workshop, the project representatives will present how the needs and opportunities of the healthcare data research community and the public have shaped the work. The workshop will also give the wider community an opportunity to provide feedback on the outcomes and to outline future collaboration and expectations.

Preliminary agenda

Time Session
09:30–10:00 Registration
10:00–10:05 Welcome
10:05–10:30 Introduction from HDR UK and DARE UK
10:30–10:55 STAR-TRE: Safe and trustworthy assessment of risk for sensitive free-text access
10:55–11:10 Coffee break
11:10–11:35 FORTRESS: Federated generation of free-text data
11:35–12:00 TRExt: TRE Text Analytics
12:00–13:00 Lunch
13:00–13:15 SAFETEXT working group review
13:15–14:00 Break-out discussions: protocols for text de-identification and metadata
14:00–14:15 Coffee break
14:15–14:45 Break-out discussions: protocols for synthetic text and metadata
14:45–15:00 Feedback and summary

Projects involved

STAR-TRE (Safe and Trustworthy Assessment of Risk in TREs for Sensitive Free-Text Access) will address one of the most significant gaps in secure data research: the safe use of sensitive free-text data, such as clinical notes and social care records.

By developing scalable, language-model-enabled tools and transparent risk assessment methods, the project aims to help researchers and data custodians understand when and how free-text data can be used responsibly, without undermining privacy or public trust.

FORTRESS-TeHR (Federated, Open and Reliable TREs for Synthetic Textual Healthcare Records) will explore how synthetic clinical text can be generated and validated for safe use in TREs. Combining differential privacy with strong public and regulatory engagement, the project will test whether synthetic free-text data can meaningfully support research and federated learning while reducing privacy risks.

TRExt (TRE Text Analytics) will explore approaches to enable TREs to convert sensitive unstructured text into structured, analysable formats suitable for federated analytics. Using only anonymised and synthetic data, the project will build reusable pipelines that allow researchers to analyse text safely, opening up new possibilities across health, justice, and social research.

SAFETEXT (Community-led Protocols for the Safe and Responsible Use of De-identified and Synthetic Healthcare Text for AI Development) is a working group that will establish best practices to ensure trust and transparency while developing AI technologies that rely on learning from de-identified and synthetic healthcare free-text data.

Workshop organisers

  • Franz Gruber, University of Edinburgh
  • Goran Nenadic, University of Manchester
  • Yamiko Msosa, King’s College London
  • Jaya Chaturvedi, King’s College London