Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset
- URL: http://arxiv.org/abs/2402.05547v2
- Date: Sat, 8 Jun 2024 16:36:56 GMT
- Title: Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset
- Authors: Hengguan Huang, Songtao Wang, Hongfu Liu, Hao Wang, Ye Wang
- Abstract summary: We introduce "ChatCoach", a human-AI cooperative framework designed to assist medical learners in practicing their communication skills during patient consultations.
ChatCoach differentiates itself from conventional dialogue systems by offering a simulated environment where medical learners can practice dialogues with a patient agent, while a coach agent provides immediate, structured feedback.
We have developed a dataset specifically for evaluating Large Language Models (LLMs) within the ChatCoach framework on communicative medical coaching tasks.
- Score: 26.504409173684653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional applications of natural language processing (NLP) in healthcare have predominantly focused on patient-centered services, enhancing patient interactions and care delivery, such as through medical dialogue systems. However, the potential of NLP to benefit inexperienced doctors, particularly in areas such as communicative medical coaching, remains largely unexplored. We introduce "ChatCoach", a human-AI cooperative framework designed to assist medical learners in practicing their communication skills during patient consultations. ChatCoach (our data and code are available online: https://github.com/zerowst/Chatcoach) differentiates itself from conventional dialogue systems by offering a simulated environment where medical learners can practice dialogues with a patient agent, while a coach agent provides immediate, structured feedback. This is facilitated by our proposed Generalized Chain-of-Thought (GCoT) approach, which fosters the generation of structured feedback and enhances the utilization of external knowledge sources. Additionally, we have developed a dataset specifically for evaluating Large Language Models (LLMs) within the ChatCoach framework on communicative medical coaching tasks. Our empirical results validate the effectiveness of ChatCoach.
Related papers
- RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules.
We develop a medical dialogue dataset comprising rule-based communications between patients and physicians.
Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z)
- Synthetic Patients: Simulating Difficult Conversations with Multimodal Generative AI for Medical Education [0.0]
Effective patient-centered communication is a core competency for physicians.
Both seasoned providers and medical trainees report decreased confidence in leading conversations on sensitive topics.
We present a novel educational tool designed to facilitate interactive, real-time simulations of difficult conversations in a video-based format.
arXiv Detail & Related papers (2024-05-30T11:02:08Z)
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment [72.96949760114575]
We propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA).
GOMA formulates verbal communication as a planning problem that minimizes the misalignment between parts of agents' mental states that are relevant to the goals.
We evaluate our approach against strong baselines in two challenging environments, Overcooked (a multiplayer game) and VirtualHome (a household simulator).
arXiv Detail & Related papers (2024-03-17T03:52:52Z)
- Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation [96.22329536480976]
We introduce the construction of a Healthcare Copilot designed for medical consultation.
The proposed Healthcare Copilot comprises three main components: 1) the Dialogue component, responsible for effective and safe patient interactions; 2) the Memory component, storing both current conversation data and historical patient information; and 3) the Processing component, summarizing the entire dialogue and generating reports.
To evaluate the proposed Healthcare Copilot, we implement an auto-evaluation scheme using ChatGPT for two roles: as a virtual patient engaging in dialogue with the copilot, and as an evaluator to assess the quality of the dialogue.
arXiv Detail & Related papers (2024-02-20T22:26:35Z)
- Validating a virtual human and automated feedback system for training doctor-patient communication skills [3.0354760313198796]
We present the development and validation of a scalable, easily accessible, digital tool known as the Standardized Online Patient for Health Interaction Education (SOPHIE).
We found that participants who underwent SOPHIE performed significantly better than the control in overall communication, aggregate scores, empowering the patient, and showing empathy.
One day, we hope that SOPHIE will help make communication training resources more accessible by providing a scalable option to supplement existing resources.
arXiv Detail & Related papers (2023-06-27T05:23:08Z)
- PlugMed: Improving Specificity in Patient-Centered Medical Dialogue Generation using In-Context Learning [20.437165038293426]
Patient-centered medical dialogue systems strive to offer diagnostic interpretation services to users with limited medical knowledge.
It is difficult for large language models (LLMs) to guarantee the specificity of their responses, despite their promising performance.
Inspired by in-context learning, we propose PlugMed, a Plug-and-Play Medical Dialogue System.
arXiv Detail & Related papers (2023-05-19T08:18:24Z)
- A Flexible Schema-Guided Dialogue Management Framework: From Friendly Peer to Virtual Standardized Cancer Patient [2.1530718840070784]
We describe a general-purpose schema-guided dialogue management framework used to develop SOPHIE, a virtual standardized cancer patient.
Our agent is judged to produce responses that are natural, emotionally appropriate, and consistent with her role as a cancer patient.
arXiv Detail & Related papers (2022-07-15T03:52:00Z)
- Domain-specific Language Pre-training for Dialogue Comprehension on Clinical Inquiry-Answering Conversations [28.567701055153385]
Recent developments in natural language processing suggest that large-scale pre-trained language backbones could be leveraged for machine comprehension and information extraction tasks.
Yet, due to the gap between pre-training and downstream clinical domains, it remains challenging to exploit the generic backbones for domain-specific applications.
We propose domain-specific language pre-training to improve performance on downstream tasks such as dialogue comprehension.
arXiv Detail & Related papers (2022-06-06T08:45:03Z)
- A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets [70.32630628211803]
We propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction.
A new large medical dialogue dataset with multi-level fine-grained annotations is introduced.
We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies.
arXiv Detail & Related papers (2022-04-19T16:43:21Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release MedDG, a large-scale, high-quality medical dialogue dataset covering 12 types of common gastrointestinal diseases.
We propose two medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines struggle on both tasks, performing poorly on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)