Free-text healthcare data and Trusted Research Environments (TREs)

Pre-conference workshop on Monday 8 June.

Description

DARE UK (Data and Analytics Research Environments UK) has recently awarded three project catalysts to explore next-generation capabilities for the use of free-text data within Trusted Research Environments (TREs).

These projects aim to develop early-stage prototypes that test new ideas to support secure data research and analytics, focusing on healthcare. DARE UK also funds a community working group called SAFETEXT, which will develop guidelines and protocols for the safe and responsible use of de-identified and synthetic healthcare free-text data in AI development.

In this full-day workshop, the project representatives will present how the needs and opportunities of the healthcare data research community and the public have shaped the work. The workshop will also provide an opportunity for the wider community to give feedback on the outcomes, and outline future collaboration and expectations.

Agenda

Time	Session
09:30–10:00	Registration
10:00–10:05	Welcome
10:05–10:30	Introduction from HDR UK and DARE UK – Michelle Amugi, HDR UK
10:30–10:55	STAR-TRE: Safe and trustworthy assessment of risk for sensitive free-text access – Dr Arlene Casey and Franz Gruber, University of Edinburgh
10:55–11:10	Coffee break
11:10–11:35	FORTRESS: Federated generation of free-text data – Dr Warren Del-Pinto and Prof Goran Nenadic, University of Manchester
11:35–12:00	TRExt: TRE Text Analytics – Dr Grazziela Figueredo, University of Nottingham
12:00–13:00	Lunch
13:00–13:15	SAFETEXT working group review
13:15–14:00	Break-out discussions: protocols for text de-identification and metadata
14:00–14:15	Reports from the break-out groups
14:15–14:45	Break-out discussions: protocols for synthetic text and metadata
14:45–15:00	Reports from the break-out groups and next steps

Projects involved

STAR-TRE (Safe and Trustworthy Assessment of Risk in TREs for Sensitive Free-Text Access) will address one of the most significant gaps in secure data research: the safe use of sensitive free-text data, such as clinical notes and social care records.

By developing scalable, language-model-enabled tools and transparent risk assessment methods, the project aims to help researchers and data custodians understand when and how free-text can be used responsibly — without undermining privacy or public trust.

FORTRESS-TeHR (Federated, Open and Reliable TREs for Synthetic Textual Healthcare Records) will explore how synthetic clinical text can be generated and validated for safe use in TREs. Combining differential privacy with strong public and regulatory engagement, the project will test whether synthetic free-text data can meaningfully support research and federated learning while reducing privacy risks.

TRExt (TRE Text Analytics) will explore approaches to enable TREs to convert sensitive unstructured text into structured, analysable formats suitable for federated analytics. Using only anonymised and synthetic data, the project will build reusable pipelines that allow researchers to analyse text safely, opening up new possibilities across health, justice, and social research.

SAFETEXT (Community-led Protocols for the Safe and Responsible Use of De-identified and Synthetic Healthcare Text for AI Development) is a working group that will establish best practices to ensure trust and transparency while developing AI technologies that rely on learning from de-identified and synthetic healthcare free-text data.

Workshop organisers

Franz Gruber, University of Edinburgh
Goran Nenadic, University of Manchester
Yamiko Msosa, King’s College London
Jaya Chaturvedi, King’s College London

Submission	~~27 February 2026~~ (extended to 13 March 2026)
Notification	17 April 2026
Early registration ends	~~4 May 2026~~ (extended to 8 May 2026)
Conference	8–10 June 2026