Related papers: Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations

URL: http://arxiv.org/abs/2501.17860v1
Date: Wed, 29 Jan 2025 18:58:48 GMT
Title: Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations
Authors: Zijie Liu, Xinyu Zhao, Jie Peng, Zhuangdi Zhu, Qingyu Chen, Xia Hu, Tianlong Chen
Abstract summary: We introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards. We also explore dialogue-based fine-tuning, which transforms static datasets into conversational formats to better capture iterative reasoning processes. Experiments show that dialogue-tuned models outperform traditional methods, with improvements of 9.64% in multi-round reasoning scenarios and 6.18% in accuracy in a noisy environment.
Score: 74.83732294523402
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current medical AI systems often fail to replicate real-world clinical reasoning, as they are predominantly trained and evaluated on static text and question-answer tasks. These tuning methods and benchmarks overlook critical aspects like evidence-based reasoning and handling distracting information. To bridge this gap, we introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards. Moreover, we explore dialogue-based fine-tuning, which transforms static datasets into conversational formats to better capture iterative reasoning processes. Experiments show that dialogue-tuned models outperform traditional methods, with improvements of 9.64% in multi-round reasoning scenarios and 6.18% in accuracy in a noisy environment. Our findings highlight dialogue tuning as a promising approach for advancing clinically aligned and robust medical AI systems.

Related papers

Aligning Spoken Dialogue Models from User Interactions [55.192134724622235]
We propose a novel preference alignment framework to improve spoken dialogue models on real-time conversations from user interactions. We create a dataset of more than 150,000 preference pairs from raw multi-turn speech conversations annotated with AI feedback. Our findings shed light on the importance of a well-calibrated balance among various dynamics, crucial for natural real-time speech dialogue systems.
arXiv Detail & Related papers (2025-06-26T16:45:20Z)
DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue [14.95390953068765]
Large language models (LLMs) have demonstrated excellent capabilities in the field of biomedical question answering, but their application in real-world clinical consultations still faces core challenges. We propose DoctorAgent-RL, a reinforcement learning (RL)-based multi-agent collaborative framework that models medical consultations as a dynamic decision-making process under uncertainty. Our approach shows immense practical value by reducing misdiagnosis risks in time-pressured settings, freeing clinicians for complex cases, and pioneering a strategy to optimize medical resource allocation and alleviate workforce shortages.
arXiv Detail & Related papers (2025-05-26T07:48:14Z)
TRUST: An LLM-Based Dialogue System for Trauma Understanding and Structured Assessments [8.618945530676614]
This study aims to bridge the gap in mental healthcare accessibility by developing an LLM-powered dialogue system that replicates clinician behavior. We introduce TRUST, a framework of cooperative LLM modules capable of conducting formal diagnostic interviews and assessments for PTSD. We develop a patient simulation approach based on real-life interview transcripts to replace time-consuming and costly manual testing by clinicians.
arXiv Detail & Related papers (2025-04-30T17:58:06Z)
EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records [11.013242961199204]
We propose EMRModel, a novel approach that integrates LoRA-based fine-tuning with code-style prompt design. We construct a high-quality, realistically grounded dataset of medical consultation dialogues with detailed annotations. Experimental results show EMRModel achieves an F1 score of 88.1%, improving by 49.5% over standard pre-trained models.
arXiv Detail & Related papers (2025-04-23T06:17:55Z)
3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark [0.29987253996125257]
Large Vision-Language Models (LVLMs) are being explored for applications in telemedicine, yet their ability to engage with diverse patient behaviors remains underexplored. We introduce 3MDBench, an open-source evaluation framework designed to assess LLM-driven medical consultations. The benchmark integrates textual and image-based patient data across 34 common diagnoses, mirroring real-world telemedicine interactions.
arXiv Detail & Related papers (2025-03-26T07:32:05Z)
MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues [41.23757609484281]
Speech recognition errors can significantly degrade the performance of downstream tasks like summarization. We propose MEDSAGE, an approach for generating synthetic samples for data augmentation using Large Language Models. LLMs can effectively model ASR noise, and incorporating this noisy data into the training process significantly improves the robustness and accuracy of medical dialogue summarization systems.
arXiv Detail & Related papers (2024-08-26T17:04:00Z)
Synthetic Patient-Physician Dialogue Generation from Clinical Notes Using LLM [27.33193944412666]
Medical dialogue systems (MDS) enhance patient-physician communication, improve healthcare accessibility, and reduce costs. However, acquiring suitable data to train these systems poses significant challenges. Our approach, SynDial, uses a single LLM iteratively with zero-shot prompting and a feedback loop to generate high-quality synthetic dialogues.
arXiv Detail & Related papers (2024-08-12T16:49:22Z)
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions. VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information. We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between Doctor as player and NPCs. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
Show from Tell: Audio-Visual Modelling in Clinical Settings [58.88175583465277]
We consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations without human expert annotation. A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose. The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference.
arXiv Detail & Related papers (2023-10-25T08:55:48Z)
PlugMed: Improving Specificity in Patient-Centered Medical Dialogue Generation using In-Context Learning [20.437165038293426]
Patient-centered medical dialogue systems strive to offer diagnostic interpretation services to users with limited medical knowledge. It is difficult for large language models (LLMs) to guarantee the specificity of responses in spite of their promising performance. Inspired by in-context learning, we propose PlugMed, a Plug-and-Play Medical Dialogue System.
arXiv Detail & Related papers (2023-05-19T08:18:24Z)
Semi-Supervised Variational Reasoning for Medical Dialogue Generation [70.838542865384]
Two key characteristics are relevant for medical dialogue generation: patient states and physician actions. We propose an end-to-end variational reasoning approach to medical dialogue generation. A physician policy network composed of an action-classifier and two reasoning detectors is proposed for augmented reasoning ability.
arXiv Detail & Related papers (2021-05-13T04:14:35Z)
Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data. Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.