Related papers: Dr.Copilot: A Multi-Agent Prompt Optimized Assistant for Improving Patient-Doctor Communication in Romanian

URL: http://arxiv.org/abs/2507.11299v2
Date: Sun, 20 Jul 2025 15:15:56 GMT
Title: Dr.Copilot: A Multi-Agent Prompt Optimized Assistant for Improving Patient-Doctor Communication in Romanian
Authors: Andrei Niculae, Adrian Cosma, Cosmin Dumitrache, Emilian Rǎdoi,
Abstract summary: Dr. Copilot is a multi-agent large language model (LLM) system that supports Romanian-speaking doctors. Rather than assessing medical correctness, Dr. Copilot provides feedback along 17 interpretable axes. Empirical evaluations and live deployment with 41 doctors show measurable improvements in user reviews and response quality.
Score: 3.3311266423308252
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Text-based telemedicine has become increasingly common, yet the quality of medical advice in doctor-patient interactions is often judged more on how the advice is communicated than on its clinical accuracy. To address this, we introduce Dr. Copilot, a multi-agent large language model (LLM) system that supports Romanian-speaking doctors by evaluating and enhancing the presentation quality of their written responses. Rather than assessing medical correctness, Dr. Copilot provides feedback along 17 interpretable axes. The system comprises three LLM agents with prompts automatically optimized via DSPy. Designed with low-resource Romanian data and deployed using open-weight models, it delivers specific, real-time feedback to doctors within a telemedicine platform. Empirical evaluations and live deployment with 41 doctors show measurable improvements in user reviews and response quality, marking one of the first real-world deployments of LLMs in Romanian medical settings.

Related papers

DoPI: Doctor-like Proactive Interrogation LLM for Traditional Chinese Medicine [2.650034302431857]
Current large language models (LLMs) exhibit notable limitations in medical applications. We propose DoPI, a novel LLM system specifically designed for the Traditional Chinese Medicine (TCM) domain. The guidance model conducts multi-turn dialogues with patients and dynamically generates questions based on a knowledge graph. The expert model leverages deep TCM expertise to provide final diagnoses and treatment plans.
arXiv Detail & Related papers (2025-07-07T11:04:03Z)
3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark [0.29987253996125257]
3MDBench is an open-source framework for simulating and evaluating LVLM-driven telemedical consultations. Multimodal dialogue with internal reasoning improves F1 score by 6.5% over non-dialogue settings. Injecting predictions from a diagnostic convolutional network into the LVLM's context boosts F1 by up to 20%.
arXiv Detail & Related papers (2025-03-26T07:32:05Z)
Natural Language-Assisted Multi-modal Medication Recommendation [97.07805345563348]
We introduce the Natural Language-Assisted Multi-modal Medication Recommendation (NLA-MMR). The NLA-MMR is a multi-modal alignment framework designed to learn knowledge from the patient view and medication view jointly. In this vein, we employ pretrained language models (PLMs) to extract in-domain knowledge regarding patients and medications.
arXiv Detail & Related papers (2025-01-13T09:51:50Z)
IMAS: A Comprehensive Agentic Approach to Rural Healthcare Delivery [0.0]
This paper proposes an advanced agentic medical assistant system designed to improve healthcare delivery in rural areas. The system is composed of five crucial components: translation, medical complexity assessment, expert network integration, and response simplification. Evaluation results using the MedQA, PubMedQA, and JAMA datasets demonstrate that this integrated approach significantly enhances the effectiveness of rural healthcare workers.
arXiv Detail & Related papers (2024-10-13T23:07:11Z)
Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia [0.0]
This paper proposes a solution using a localized large language model (LLM) to transcribe, translate, and summarize doctor-patient conversations. We utilize the Whisper model for transcription and GPT-3 to summarize the transcripts into the ePuskesmas medical records format. This innovation addresses challenges like overcrowded facilities and the administrative burden on healthcare providers in Indonesia.
arXiv Detail & Related papers (2024-09-25T16:13:42Z)
RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules. We develop a medical dialogue dataset comprising rule-based communications between patients and physicians. Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z)
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation [96.22329536480976]
We introduce the construction of a Healthcare Copilot designed for medical consultation. The proposed Healthcare Copilot comprises three main components: 1) the Dialogue component, responsible for effective and safe patient interactions; 2) the Memory component, storing both current conversation data and historical patient information; and 3) the Processing component, summarizing the entire dialogue and generating reports. To evaluate the proposed Healthcare Copilot, we implement an auto-evaluation scheme using ChatGPT for two roles: as a virtual patient engaging in dialogue with the copilot, and as an evaluator to assess the quality of the dialogue.
arXiv Detail & Related papers (2024-02-20T22:26:35Z)
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between Doctor as player and NPCs. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset [26.504409173684653]
We introduce "ChatCoach", a human-AI cooperative framework designed to assist medical learners in practicing their communication skills during patient consultations. ChatCoach differentiates itself from conventional dialogue systems by offering a simulated environment where medical learners can practice dialogues with a patient agent, while a coach agent provides immediate, structured feedback. We have developed a dataset specifically for evaluating Large Language Models (LLMs) within the ChatCoach framework on communicative medical coaching tasks.
arXiv Detail & Related papers (2024-02-08T10:32:06Z)
Enhancing Summarization Performance through Transformer-Based Prompt Engineering in Automated Medical Reporting [0.49478969093606673]
A two-shot prompting approach in combination with scope and domain context outperforms other methods. The automated reports are approximately twice as long as the human references.
arXiv Detail & Related papers (2023-11-22T09:51:53Z)
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain. ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF. We analyze possible biases through prompting ChiMed-GPT to perform attitude scales regarding discrimination of patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records [60.35217378132709]
Large language models (LLMs) can follow natural language instructions with human-level fluency. Evaluating LLMs on realistic text generation tasks for healthcare remains challenging. We introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data.
arXiv Detail & Related papers (2023-08-27T12:24:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.