Hospital administrative operations represent one of the most demanding components of modern healthcare infrastructure. Large hospitals must process over 10,000 administrative requests daily — spanning patient intake, appointment scheduling, insurance verification, and inter-departmental coordination — yet existing benchmarks focus narrowly on patient-physician dialogues or isolated subtasks, leaving administrative complexity largely unstudied.
We present H-AdminSim, a multi-agent simulation framework for modeling realistic hospital administrative workflows at scale. H-AdminSim synthesizes care level-specific patient profiles across primary, secondary, and tertiary hospitals using 194 disease-symptom pairs spanning 9 internal medicine departments, then simulates multi-turn interactions between LLM-driven administrative staff agents and first-visit outpatient agents. Optional FHIR R5 integration enables standardized interoperability with real hospital information systems. We evaluate 5 diverse LLM backends using detailed rubric-based scoring, establishing H-AdminSim as a principled testbed for systematic evaluation of LLM-driven administrative automation.
A complete pipeline covering the full first-visit outpatient journey, from patient data synthesis through multi-agent administrative workflow simulation to quantitative evaluation; the first framework to model end-to-end first-visit hospital administrative operations at scale.
Care level-specific patient profile synthesis using 194 disease-symptom pairs across 9 internal medicine departments and 3 hospital care levels, with configurable patient demographics and preference types.
Optional FHIR R5-compatible output for deployment in real hospital information systems, with multi-backend LLM support (GPT, Gemini, open-source via vLLM) and concurrent simulation capabilities.
Beyond the built-in 194 disease-symptom pairs, users can supply their own patient data and disease profiles, enabling simulation in custom hospital settings and specialty domains outside the default 9 internal medicine departments.
Users can configure simulation-level parameters such as the simulation period, number of patients, and scheduling constraints, enabling controlled experiments and stress-testing under diverse operational conditions.
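As a rough illustration of this kind of controlled experiment, the sketch below groups the three parameters named above (simulation period, number of patients, and a scheduling constraint) into one immutable settings object and varies only the patient load. The class `SimulationConfig` and all of its field names are hypothetical and chosen for illustration; they are not taken from the H-AdminSim API.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class SimulationConfig:
    """Hypothetical settings container; field names are illustrative only."""

    start_date: date        # first simulated day
    days: int               # length of the simulation period
    num_patients: int       # first-visit outpatients to generate
    slot_minutes: int = 30  # assumed scheduling constraint: appointment slot length

    def __post_init__(self) -> None:
        # Reject settings that cannot describe a meaningful run.
        if self.days <= 0 or self.num_patients <= 0 or self.slot_minutes <= 0:
            raise ValueError("days, num_patients and slot_minutes must be positive")


base = SimulationConfig(start_date=date(2026, 1, 5), days=7, num_patients=50)

# Stress test: hold every other setting fixed and scale only the patient load.
sweep = [replace(base, num_patients=n) for n in (50, 200, 800)]
```

Keeping the configuration frozen means each run in the sweep differs from the baseline only in the field that was explicitly replaced, which makes results easier to compare.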
H-AdminSim generates realistic patient profiles tailored to each care level of the healthcare institution. A hierarchical synthesis pipeline constructs a virtual hospital setting — including time system, departments, physicians, and patient profiles — using 194 disease-symptom pairs spanning 9 internal medicine departments.
In addition to the built-in default dataset, user-defined disease-symptom pairs and department data are also supported, enabling simulation in custom hospital settings or specialty domains.
The framework simulates two core administrative tasks — patient intake and appointment scheduling — between LLM-driven staff agents and first-visit outpatient agents through dialogue-based interactions. A virtual time flow is implemented to support time-dependent tasks such as scheduling, enabling realistic simulation of temporal workflows.
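One way to picture a virtual time flow is a simulated clock that only moves when the simulation advances it, so that scheduling decisions depend on simulated time rather than wall-clock time. The sketch below is a minimal illustration under that assumption; `VirtualClock` and `bookable` are hypothetical names and do not describe how H-AdminSim implements its time system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class VirtualClock:
    """Hypothetical simulated clock; not part of the H-AdminSim API."""

    now: datetime

    def advance(self, minutes: int) -> datetime:
        """Move simulated time forward and return the new time."""
        if minutes < 0:
            raise ValueError("virtual time cannot move backwards")
        self.now += timedelta(minutes=minutes)
        return self.now


def bookable(slot_starts: list[datetime], clock: VirtualClock) -> list[datetime]:
    """Return slots that start strictly after the current virtual time, earliest first."""
    return sorted(s for s in slot_starts if s > clock.now)


clock = VirtualClock(now=datetime(2026, 1, 5, 9, 0))
slots = [
    datetime(2026, 1, 5, 9, 30),
    datetime(2026, 1, 5, 10, 0),
    datetime(2026, 1, 5, 11, 0),
]

clock.advance(45)  # e.g. an intake dialogue consumes 45 simulated minutes
remaining = bookable(slots, clock)  # the 09:30 slot has already passed
```

Because the clock is advanced explicitly, the same sequence of events always yields the same set of bookable slots, which is what makes time-dependent tasks reproducible across runs.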
H-AdminSim is model-agnostic and supports multiple LLM backends, enabling systematic cross-model benchmarking. Different models can be assigned per agent role, with support for GPT and Gemini series models as well as open-source models via vLLM.
Each simulation run is evaluated using detailed rubrics that assess LLM performance across key administrative dimensions, enabling reproducible and standardized comparison across models and hospital care levels.
Evaluates the staff agent's accuracy in department assignment, patient information extraction, and data structuring, as well as the patient agent's fidelity to its assigned patient profile.
Correctness of scheduled appointments relative to patient preference type and available FHIR Slot resources, evaluated separately for tool-calling and pure LLM reasoning-based strategies.
Multi-turn dialogue assessed by human evaluators across four criteria: naturalness, response appropriateness, dialogue flow, and overall quality (avg. 4.11 / 5).
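To make the scheduling criterion above concrete, the sketch below checks a booked appointment against a list of FHIR-style Slot resources. It relies only on the standard Slot fields `resourceType`, `id`, `status`, and `start`. The preference label `"earliest"` and the functions `free_slots` and `is_correct_booking` are assumptions made for illustration; the actual preference types and rubric logic in H-AdminSim may differ.

```python
from datetime import datetime


def free_slots(slots: list[dict]) -> list[dict]:
    """Keep only Slot resources whose status is 'free'."""
    return [
        s for s in slots
        if s.get("resourceType") == "Slot" and s.get("status") == "free"
    ]


def is_correct_booking(booked_id: str, slots: list[dict], preference: str = "earliest") -> bool:
    """Check a booking against the available slots for one hypothetical preference type."""
    candidates = free_slots(slots)
    if not candidates:
        return False  # nothing was bookable, so any booking is wrong
    if preference == "earliest":
        expected = min(candidates, key=lambda s: datetime.fromisoformat(s["start"]))
        return booked_id == expected["id"]
    raise ValueError(f"unsupported preference type: {preference}")


slots = [
    {"resourceType": "Slot", "id": "slot-1", "status": "busy",
     "start": "2026-01-05T09:00:00+09:00"},
    {"resourceType": "Slot", "id": "slot-2", "status": "free",
     "start": "2026-01-05T10:00:00+09:00"},
    {"resourceType": "Slot", "id": "slot-3", "status": "free",
     "start": "2026-01-05T09:30:00+09:00"},
]

ok = is_correct_booking("slot-3", slots)       # earliest free slot
too_late = is_correct_booking("slot-2", slots)  # free, but not the earliest
taken = is_correct_booking("slot-1", slots)     # earliest overall, but busy
```

The same check applies regardless of whether the staff agent chose the slot through tool calling or through pure LLM reasoning, since only the final booking is compared with the available slots.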
H-AdminSim is under active development. Future releases aim to extend the simulation scope within the MEdWorld platform, including a dedicated patient simulator with configurable personas and support for follow-up visit workflows beyond the current first-visit scope.
H-AdminSim is available as a Python package on PyPI. Install it with pip to start simulating hospital administrative workflows immediately.
| Property | Value |
|---|---|
| Package | h-adminsim |
| Version | 1.2.2 |
| Python | ≥3.11, <3.13 |
| License | Apache 2.0 |
| Author | Jun-Min Lee |
Install the package and all dependencies with a single command.
pip install h-adminsim
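Since the package declares support for Python ≥3.11 and <3.13, it can help to confirm the interpreter version before installing. The helper below is a small convenience sketch, not part of the package; `python_supported` is a hypothetical name.

```python
import sys


def python_supported(version=sys.version_info) -> bool:
    """Return True if the (major, minor) version is >= 3.11 and < 3.13."""
    return (3, 11) <= tuple(version[:2]) < (3, 13)


if not python_supported():
    print("This interpreter is outside the range listed for h-adminsim (>=3.11, <3.13).")
```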
Try H-AdminSim interactively. Configure staff and patient counts, provide an OpenAI API key, and watch the multi-agent administrative simulation unfold in real time.
@inproceedings{lee2026hadminsimmultiagentsimulatorrealistic,
title = {H-AdminSim: A Multi-Agent Simulator for Realistic Hospital Administrative Workflows with FHIR Integration},
author = {Jun-Min Lee and Meong Hi Son and Edward Choi},
booktitle = {Proceedings of the Conference on Health, Inference, and Learning (CHIL)},
year = {2026},
eprint = {2602.05407},
archivePrefix = {arXiv},
primaryClass = {cs.AI},
url = {https://arxiv.org/abs/2602.05407},
}