FORMA:
Formal Reasoning and Modelling Agent

An Application to Clinical Trials

Cyrus Z Zhou1, Yufei Jin2, Yilin Xu3, Chieh-Ju Chao1,4, Monica S Lam1
1 Stanford University 2 UCLA 3 JHU 4 Mayo Clinic
Stanford-OVAL
Project Website · Last updated: Nov 2025

News & Updates

System overview or key result figure

Abstract

Effective patient-trial matching lies at the heart of clinical research. Despite spending approximately $1.9 billion annually on recruitment, nearly 80% of clinical trials still fail to meet their enrollment targets. Errors in patient-trial matching are high-stakes on both recall and precision: false negatives deny patients from potentially life-saving options, while false positives steer patients toward infeasible or unsafe trials. Yet current pipelines leveraging embedding-based retrieval are optimized to minimize aggregate retrieval loss rather than eliminate mismatches for each patient-trial pair, making them structurally ill-suited to this safety-critical task.

We present TrialFORMA (FOmal Reasoning and Modeling Agent), a system for patient–trial matching that grounds patient records and eligibility criteria in a shared formal SMT (Satisfiability Modulo Theories) representation. Unlike recent approaches that merely minimize error rates, TrialFORMA is designed around an exactness objective, aiming—within its formal representation—to eliminate both false negatives and false positives.

Attempting to represent both patient records and trial eligibility criteria in SMT confronts us with several challenges: (1) imperfect semantic parsing from natural language into SMT predicates; (2) substantial variation in concept names and surface expressions across trials and patient records; and (3) incomplete clinical ontologies that omit relevant concepts and relations.

To address these challenges, we develop a family of agentic workflows that (1) semantically parse natural language into SMT; (2) perform concept canonicalization and expression schematization; and (3) bridge missing links in the underlying clinical ontologies.

Preliminary experiments on standard patient–trial matching benchmarks show that TrialFORMA achieves near-perfect recall at high precision, while providing clause-level auditability of eligibility decisions. These results establish symbolic, logic-based matching as an accurate and transparent foundation for high-stake clinical eligibility workflows.

Key capabilities

Results

We evaluate TRIAL-FORMA on a held-out set of patient–trial pairs with expert labels. Our system improves strict eligibility classification compared to a strong embedding-based baseline while providing interpretable reasoning traces.

Recall@K SIGIR TREC2022
TrialGPT (embedding-based) ## ##
TRIAL-FORMA (ours) ## ##

TrialFORMA can deliver real-time matching with almost no LLM inference costs

Resources

Paper: arXiv:xxxx.xxxxx

Dataset: SIGIR 2016 corpus , TREC Clinical Trials 2021 corpus, TREC Clinical Trials 2022 corpus

Contact: zikai@stanford.edu

BibTeX

@software{forma_clinical_trial_matching,
        title = {FORMA: Formal Reasoning and Modelling Agent},
        subtitle = {An Application to Clinical Trials},
        author = {Cyrus Zhou, Yufei Jin, Yilin Xu, Chieh-Ju Chao, Monica S Lam},
        year = {2025},
        url = {https://github.com/stanford-oval/trial-forma}
      }