Algorithmic Foundations for Medical AI in the Real World

Workshop at the International Conference on Machine Learning

ICML 2026 — Date & Venue TBD

About

Artificial intelligence (AI), including predictive and generative foundation models, is being implemented in clinical settings worldwide at an unprecedented rate. As of May 2025, at least 377 healthcare systems in the U.S. alone have piloted or adopted 70 generative AI tools for clinical decision support, patient communication, documentation, claims processing, and healthcare administration. Globally, 48% of clinicians surveyed across 109 countries report using AI for work.

This rapid adoption is driven by the strong technical performance of new AI models across several largely synthetic medical benchmarks. However, concerns about coherence, accuracy, hallucinations, and bias persist. The real-world clinical performance of AI at the healthcare-system level remains poorly understood, and algorithmic strategies to evaluate and improve deployed medical AI models remain underdeveloped.

This workshop addresses that gap by focusing on the algorithmic frontier of post-deployment improvement for medical AI. Moving beyond conventional supervised or reinforcement fine-tuning, we explore an emerging design space where deployment-time diagnostics, from human expert critique to real-world performance shifts, are used as signals for model improvement.

We emphasize medical applications because improvement must operate under unusually stringent constraints—including high-risk decision-making, delayed or noisy feedback, heterogeneous populations, and safety and equity requirements—which make post-deployment updates both critical and challenging. While grounded in healthcare, the algorithms discussed in this workshop are general and will be broadly relevant to ICML researchers interested in deploying their models into high-stakes environments.

Topics

We invite contributions on the following topics at the intersection of algorithmic research and deployed medical AI.

Model & Data-Efficient Learning

Parameter-efficient fine-tuning, model editing, synthetic data generation, curriculum learning, lifelong learning, and other meta-learning strategies.

Test-Time Scaling

Self-consistency decoding, verification-guided reasoning, and other methods to allocate computation by risk, uncertainty, or need in resource-constrained healthcare environments.
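As an illustration of the first item, here is a minimal sketch of self-consistency decoding: draw several stochastic answers to the same question and return the majority answer together with its vote share. The `sample_answer` callable and the toy sampler are hypothetical stand-ins for a model call, not part of any workshop API, and the diagnosis labels are made up for the example.

```python
import random
from collections import Counter
from typing import Callable


def self_consistency(sample_answer: Callable[[], str], n_samples: int = 5) -> tuple[str, float]:
    """Draw n_samples answers and return the majority answer with its vote share."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes = Counter(sample_answer() for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count / n_samples


def make_toy_sampler(seed: int = 0) -> Callable[[], str]:
    """Hypothetical stand-in for a stochastic LLM call (assumption, not a real API)."""
    rng = random.Random(seed)
    return lambda: rng.choices(["pneumonia", "bronchitis"], weights=[0.7, 0.3])[0]


if __name__ == "__main__":
    answer, agreement = self_consistency(make_toy_sampler(), n_samples=15)
    print(answer, round(agreement, 2))
```

The vote share can serve as a crude uncertainty signal. One way to allocate computation by risk would be to raise `n_samples` for higher-risk cases or when agreement is low.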

Self-Evolving Agents

Autonomously improving systems capable of adaptive prompt or data sequencing, self-critique or revision, generation of new tasks from failures, updating of long-term memory modules, or self-modification of tools or agent architecture.

Human-AI Collaboration

Human feedback loops, clinical interpretability, and abstention strategies for safe and trustworthy AI-assisted clinical decision-making.

Deployment Science

Generalization under distribution shift, domain adaptation, and post-deployment monitoring, as well as distillation and quantization for edge and low-resource settings.

Call for Papers

We invite submissions that advance the algorithmic foundations of deployed medical AI. To encourage both reports of current progress and forward-looking discussion, we welcome two types of submissions:

Original Research Papers

Up to 5 pages (excluding references & appendix)

Novel contributions to post-deployment improvement, test-time scaling, self-evolving agents, human-AI collaboration, or deployment science for medical AI.

Perspective Pieces

Up to 2 pages

Forward-looking position papers that outline challenges, propose research agendas, or spark community discussion on medical AI deployment.

All accepted papers will be non-archival. Submissions will undergo one round of double-blind peer review via OpenReview.

Important Dates

Submission Deadline: TBD
Notification of Acceptance: TBD
Camera-Ready Deadline: TBD
Workshop Date: TBD (ICML 2026)

Interactive Challenge

Virtual Clinic

A key barrier to post-deployment medical AI research is the lack of safe, interactive environments where algorithms can be tested under realistic uncertainty. Static benchmarks fail to capture the sequential, partially observed, and high-stakes nature of clinical decision making.

The Virtual Clinic is a public clinical world model designed specifically for this workshop—an experimental sandbox to study algorithmic adaptation under delayed and incomplete feedback. The environment consists of LLM-based simulated patient agents powered by realistic underlying health records generated using Synthea. Participants deploy frontier LLMs that interact with these simulated patient agents through API-driven multi-turn conversations.

We will select up to 20 teams to participate. Each team will receive a fixed budget of frontier LLM queries (1,000 per team; all compute costs covered) to conduct experiments within the simulation. Teams can modify system prompts, questioning strategies, context management, and tool-calling policies to identify algorithms that improve performance under partial observability.
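Because the query budget is fixed, teams may want to track spending explicitly while experimenting. The sketch below is one hypothetical way to do that: a thin wrapper that counts calls and refuses to exceed a limit. The `query_fn` callable, the class names, and the default of 1,000 are illustrative assumptions. The actual Virtual Clinic API is not specified here.

```python
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when the query budget has been used up."""


class BudgetedClient:
    """Wrap a query function (hypothetical) and enforce a fixed call budget."""

    def __init__(self, query_fn: Callable[[str], str], budget: int = 1000):
        if budget < 0:
            raise ValueError("budget must be non-negative")
        self._query_fn = query_fn
        self.remaining = budget

    def query(self, prompt: str) -> str:
        # Refuse before calling the model so a rejected call never costs a query.
        if self.remaining <= 0:
            raise BudgetExceeded("query budget exhausted")
        self.remaining -= 1
        return self._query_fn(prompt)


if __name__ == "__main__":
    # Toy stand-in for a model call; a real client would go here.
    client = BudgetedClient(lambda prompt: f"echo: {prompt}", budget=2)
    print(client.query("What brings you in today?"))
    print(client.remaining)
```

Checking `remaining` before each multi-turn interview makes it possible to decide how many turns a given strategy can afford.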

All teams will submit short papers (up to 2 pages) using a provided template to describe their findings. Teams will also present posters at the workshop, and up to 3 outstanding submissions will be invited to present highlights during the workshop program.

Diagnosis Prediction

Interview a simulated patient to propose a diagnosis. Ground truth labels are derived from the patient's Synthea-generated longitudinal health record.

Treatment Prediction

Interview a simulated patient to predict the treatment plan that will be administered.

Event Prediction

Interview a simulated patient to estimate the probability of a specific clinical event (e.g., hospitalization or complication) within a fixed time horizon.

Schedule

This one-day workshop will feature 6 invited talks, 3 contributed talks, 1 panel discussion, 1 Virtual Clinic highlights session, and 2 poster sessions.

Morning

08:00 – 08:15  Opening Remarks
08:15 – 10:00  Invited Talks (3 speakers)
10:00 – 10:45  Poster Session I
10:45 – 12:00  Contributed Talks (3 speakers)
12:00 – 13:00  Lunch Break

Afternoon

13:00 – 14:45  Invited Talks (3 speakers)
14:45 – 15:15  Virtual Clinic Highlights
15:15 – 16:00  Global Health Panel
16:00 – 16:45  Poster Session II
16:45 – 17:00  Closing Remarks

Invited Speakers & Panelists

Experts with real-world experience deploying AI at healthcare system scale.

Karandeep Singh

UC San Diego Health

Chief Health AI Officer and Associate Professor

Bilal Mateen

PATH; University of Birmingham

Chief AI Officer; Professor

Robert Korom

Penda Health

Chief Medical Officer

Kim Branson

GlaxoSmithKline

SVP and Global Head of AI and Machine Learning

Maryam Mustafa

Lahore University of Management Sciences

Professor; Founder, Awaaz-e-Sehat

Melissa Miles

Gates Foundation

Senior Program Officer, AI and Innovations in Primary Health Care

Lorenzo Righetto

Nature Health

Senior Editor

Workshop Organizers

Ayush Noori

University of Oxford

ayush.noori@sjc.ox.ac.uk

PhD student, Rhodes Scholar, and Encode: AI for Science Fellow at the University of Oxford. His research advances AI for diagnosis and treatment of neurological disorders. Published over 40 papers, including 14 as first or co-first author. Served on the organizing committee of the Machine Learning for Health Symposium for two years.

Marinka Zitnik

Harvard University

marinka@hms.harvard.edu

Associate Professor of Biomedical Informatics at Harvard with appointments at the Broad Institute and Kempner Institute. Her research focuses on AI for medicine and therapeutic discovery. Co-organized 20+ international workshops at NeurIPS, ICML, and ICLR, including the NeurIPS AI for Science series and workshops on foundation models for biology.

Emily Alsentzer

Stanford University

ealsentzer@stanford.edu

Assistant Professor of Biomedical Data Science and Computer Science at Stanford University. Her research uses machine learning to augment clinical decision making and broaden access to healthcare. Founding organizer of the Symposium on AI for Learning Health Systems (SAIL) and former General Chair of the Machine Learning for Health Symposium.

David A. Clifton

University of Oxford

david.clifton@eng.ox.ac.uk

Royal Academy of Engineering Chair of Clinical Machine Learning and NIHR Research Professor at the University of Oxford. His research focuses on real-world deployment of in-hospital AI systems and translation into low- and middle-income countries. Co-organized workshops at ICML 2022, AAAI 2025, and ICCV 2025, among others.

Marie-Laure Charpignon

Kaiser Permanente, UC Berkeley, Boston Children's Hospital

mariecharpignon@berkeley.edu

Postdoctoral fellow in computational health informatics. Received her PhD in Social & Engineering Systems and Statistics from MIT. Her research focuses on causal inference and network science in public health. Co-organized the ICLR 2021 AI for Public Health workshop and co-chairs the KDD epiDAMIK workshop.

Lucas Vittor

Harvard Medical School

lucas_vittor@hms.harvard.edu

Research Associate in Biomedical Informatics at Harvard Medical School. Designs and deploys pipelines for large language models and multi-agent AI systems. Previously an early engineer at Mutt Data, building ML architectures for industry leaders. Serves as the research engineer supporting the Virtual Clinic.

Special Session

Global Health Panel

Real-world medical AI faces unique opportunities and challenges in low- and middle-income countries (LMICs). For example, LMIC health AI deployments often encounter extreme distribution shift; must operate with intermittent connectivity and limited compute, necessitating distillation, quantization, or low-bitwidth inference; and often rely on sparse or noisy human feedback when clinical expertise is limited. Recognizing these differences, our workshop is committed to a global perspective. This commitment is reflected in a diverse organizing committee and speaker list drawn from five continents.

This panel will feature leaders in global health delivery and AI deployment who will outline priority global health application areas for AI, such as autonomous triage, maternal health risk prediction, and AI-assisted clinical documentation. We will actively invite frontline implementation partners from LMICs to participate, ensuring that technical discussions remain grounded in the realities of the Global South.

Gates Foundation

Pioneering AI for primary healthcare in Africa

University of Global Health Equity

Global health education

Partners In Health

Quality healthcare in low-resource settings

Nature Health

Research impacting health policy and practice

FAQs

For any questions not covered here, please contact marinka@hms.harvard.edu or ayush.noori@sjc.ox.ac.uk.