MedicalOS: An LLM Agent based Operating System for Digital Healthcare
- URL: http://arxiv.org/abs/2509.11507v1
- Date: Mon, 15 Sep 2025 01:43:20 GMT
- Title: MedicalOS: An LLM Agent based Operating System for Digital Healthcare
- Authors: Jared Zhu, Junde Wu,
- Abstract summary: We present MedicalOS, a unified agent-based operational system designed as a domain-specific abstraction layer for healthcare. It translates human instructions into pre-defined digital healthcare commands, such as patient inquiry, history retrieval, exam management, report generation, referrals, and treatment planning. We empirically validate MedicalOS on 214 patient cases across 22 specialties, demonstrating high diagnostic accuracy and confidence.
- Score: 6.848506601405531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decades of advances in digital health technologies, such as electronic health records, have largely streamlined routine clinical processes. Yet most of these systems are still hard to learn and use: clinicians often face the burden of managing multiple tools, repeating manual actions for each patient, navigating complicated UI trees to locate functions, and spending significant time on administration instead of caring for patients. The recent rise of large language model (LLM) based agents demonstrates exceptional capability in coding and computer operation, revealing the potential for humans to interact with operating systems and software not by direct manipulation, but by instructing agents through natural language. This shift highlights the need for an abstraction layer, an agent-computer interface, that translates human language into machine-executable commands. Digital healthcare, however, requires more domain-specific abstractions that strictly follow trusted clinical guidelines and procedural standards to ensure safety, transparency, and compliance. To address this need, we present MedicalOS, a unified agent-based operational system designed as such a domain-specific abstraction layer for healthcare. It translates human instructions into pre-defined digital healthcare commands, such as patient inquiry, history retrieval, exam management, report generation, referrals, and treatment planning, which we wrap as off-the-shelf tools using machine languages (e.g., Python, APIs, MCP, Linux). We empirically validate MedicalOS on 214 patient cases across 22 specialties, demonstrating high diagnostic accuracy and confidence, clinically sound examination requests, and consistent generation of structured reports and medication recommendations. These results highlight MedicalOS as a trustworthy and scalable foundation for advancing workflow automation in clinical practice.
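The abstract describes an abstraction layer that maps free-form instructions onto a fixed set of pre-defined commands. The following is a minimal sketch of that idea only, not the paper's implementation: the names `REGISTRY`, `route`, and `execute`, the two handler functions, and the keyword matching (a stand-in for the LLM agent) are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical command handlers. In a real system these would wrap
# EHR queries, report templates, and so on; here they return strings.
def history_retrieval(patient_id: str) -> str:
    return f"history for {patient_id}"

def report_generation(patient_id: str) -> str:
    return f"report for {patient_id}"

# The closed set of commands the layer is allowed to run.
# Anything outside this registry is rejected, never improvised.
REGISTRY: Dict[str, Callable[[str], str]] = {
    "history_retrieval": history_retrieval,
    "report_generation": report_generation,
}

# Assumption: simple keyword matching stands in for the LLM agent
# that would choose a command from the instruction.
KEYWORDS: Dict[str, str] = {
    "history": "history_retrieval",
    "report": "report_generation",
}

def route(instruction: str) -> str:
    """Map a natural-language instruction to a registered command name."""
    text = instruction.lower()
    for keyword, command_name in KEYWORDS.items():
        if keyword in text:
            return command_name
    raise ValueError(f"no pre-defined command matches: {instruction!r}")

def execute(instruction: str, patient_id: str) -> str:
    """Route the instruction, then run only the matching registered command."""
    command_name = route(instruction)
    handler = REGISTRY[command_name]
    return handler(patient_id)

if __name__ == "__main__":
    print(execute("Pull the history for this patient", "P-001"))
```

The design point the sketch tries to capture is that the agent selects from a closed registry instead of emitting arbitrary actions, which is one plausible way to keep behavior auditable; how MedicalOS enforces clinical guidelines in practice is not specified in the text above.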
Related papers
- A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms [0.03262230127283451]
We introduce a model-driven engineering (MDE) framework designed specifically for healthcare AI. The framework relies on formal metamodels, domain-specific languages, and automated transformations to move from high-level specifications to running software. We evaluate this approach in a multi-center cancer immunotherapy study.
arXiv Detail & Related papers (2025-10-10T12:00:12Z) - A co-evolving agentic AI system for medical imaging analysis [14.925000849408683]
"TissueLab" is a co-evolving agentic AI system that allows researchers to ask direct questions, automatically plan and generate explainable results, and conduct real-time analyses. By standardizing inputs, outputs, and capabilities of diverse tools, the system determines when and how to invoke them to address research and clinical questions. TissueLab achieves state-of-the-art performance compared with end-to-end vision-language models (VLMs) and other agentic AI systems such as GPT-5.
arXiv Detail & Related papers (2025-09-24T16:15:28Z) - Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios [46.729092855387165]
We study the choice of the backbone LLM for medical AI agents, which is the foundation for the agent's overall reasoning and action generation. Our findings demonstrate o1's ability to enhance diagnostic accuracy and consistency, paving the way for smarter, more responsive AI tools.
arXiv Detail & Related papers (2024-11-16T18:19:53Z) - Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z) - Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLMs) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z) - Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology [0.6397820821509177]
We introduce an alternative approach to multimodal medical AI that utilizes the generalist capabilities of a large language model (LLM) as a central reasoning engine.
This engine autonomously coordinates and deploys a set of specialized medical AI tools.
We show that the system has a high capability in employing appropriate tools (97%), drawing correct conclusions (93.6%), and providing complete (94%) and helpful (89.2%) recommendations for individual patient cases.
arXiv Detail & Related papers (2024-04-06T15:50:19Z) - Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models [29.05425041393475]
Generative Large Language Models (LLMs) hold significant promise in healthcare.
This study assessed the potential of LLMs to function as autonomous agents in a simulated tertiary care medical center.
arXiv Detail & Related papers (2024-01-05T15:09:57Z) - Redefining Digital Health Interfaces with Large Language Models [69.02059202720073]
Large Language Models (LLMs) have emerged as general-purpose models with the ability to process complex information.
We show how LLMs can provide a novel interface between clinicians and digital technologies.
We develop a new prognostic tool using automated machine learning.
arXiv Detail & Related papers (2023-10-05T14:18:40Z) - HEAR4Health: A blueprint for making computer audition a staple of modern healthcare [89.8799665638295]
Recent years have seen a rapid increase in digital medicine research in an attempt to transform traditional healthcare systems.
Computer audition can be seen to be lagging behind, at least in terms of commercial interest.
We categorise the advances needed in four key pillars: Hear, corresponding to the cornerstone technologies needed to analyse auditory signals in real-life conditions; Earlier, for the advances needed in computational and data efficiency; Attentively, for accounting for individual differences and handling the longitudinal nature of medical data.
arXiv Detail & Related papers (2023-01-25T09:25:08Z) - BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z) - Self-Attention Enhanced Patient Journey Understanding in Healthcare System [43.11457142941327]
MusaNet is designed to learn representations of patient journeys, each of which is typically a long sequence of activities.
MusaNet is trained in an end-to-end manner using training data derived from EHRs.
Results demonstrate that the proposed MusaNet produces higher-quality representations than state-of-the-art baseline methods.
arXiv Detail & Related papers (2020-06-15T10:32:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.