Related papers: Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback

Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback

URL: http://arxiv.org/abs/2401.05695v1
Date: Thu, 11 Jan 2024 06:42:45 GMT
Title: Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback
Authors: Chengfeng Dou, Zhi Jin, Wenpin Jiao, Haiyan Zhao, Yongqiang Zhao, Zhenwei Tao
Abstract summary: We propose an approach called preference learning from process feedback. PLPF integrates the doctor's diagnostic logic into LLMs. We show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%.
Score: 20.73076574974894
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The use of large language models in medical dialogue generation has garnered significant attention, with a focus on improving response quality and fluency. While previous studies have made progress in optimizing model performance for single-round medical Q&A tasks, there is a need to enhance the model's capability for multi-round conversations to avoid logical inconsistencies. To address this, we propose an approach called preference learning from process feedback~(PLPF), which integrates the doctor's diagnostic logic into LLMs. PLPF involves rule modeling, preference data generation, and preference alignment to train the model to adhere to the diagnostic process. Experimental results using Standardized Patient Testing show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%, outperforming traditional reinforcement learning from human feedback. Additionally, PLPF demonstrates effectiveness in both multi-round and single-round dialogue tasks, showcasing its potential for improving medical dialogue generation.

Related papers

DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue [14.95390953068765]
Large language models (LLMs) have demonstrated excellent capabilities in the field of biomedical question answering, but their application in real-world clinical consultations still faces core challenges.<n>We propose Ours, a reinforcement learning (RL)-based multi-agent collaborative framework that models medical consultations as a dynamic decision-making process under uncertainty.<n>Our approach shows immense practical value by reducing misdiagnosis risks in time-pressured settings, freeing clinicians for complex cases, and pioneering a strategy to optimize medical resource allocation and alleviate workforce shortages.
arXiv Detail & Related papers (2025-05-26T07:48:14Z)
EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records [11.013242961199204]
We propose EMRModel, a novel approach that integrates LoRA-based fine-tuning with code-style prompt design. We construct a high-quality, realistically grounded dataset of medical consultation dialogues with detailed annotations. Experimental results show EMRModel achieves an F1 score of 88.1%, improving by49.5% over standard pre-trained models.
arXiv Detail & Related papers (2025-04-23T06:17:55Z)
ProMRVL-CAD: Proactive Dialogue System with Multi-Round Vision-Language Interactions for Computer-Aided Diagnosis [0.7430974817507225]
We develop an LLM-based dialogue system, namely proactive multi-round vision-language interactions for computer-aided diagnosis (ProMRVL-CAD) The proposed ProMRVL-CAD system allows proactive dialogue to provide patients with constant and reliable medical access via an integration of knowledge graph into a recommendation system.
arXiv Detail & Related papers (2025-02-15T01:14:23Z)
Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation [0.0]
Large language models (LLMs) have shown impressive capabilities in natural language processing tasks, including dialogue generation. This research aims to conduct a novel comparative analysis of two prominent techniques, fine-tuning with LoRA and the Retrieval-Augmented Generation framework.
arXiv Detail & Related papers (2025-02-04T11:50:40Z)
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations [74.83732294523402]
We introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards. We also explore dialogue-based fine-tuning, which transforms static datasets into conversational formats to better capture iterative reasoning processes. Experiments show that dialogue-tuned models outperform traditional methods, with improvements of $9.64%$ in multi-round reasoning scenarios and $6.18%$ in accuracy in a noisy environment.
arXiv Detail & Related papers (2025-01-29T18:58:48Z)
Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment [22.983780823136925]
This research examines the use of Reinforcement Learning from AI Feedback (RLAIF) techniques to improve healthcare dialogue models. We argue that the primary challenges in current RLAIF research for healthcare are the limitations of automated evaluation methods. We present a new evaluation framework based on standardized patient examinations.
arXiv Detail & Related papers (2024-10-05T10:29:19Z)
RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules. We develop a medical dialogue dataset comprising rule-based communications between patients and physicians. Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z)
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions. VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information. We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
Customizing General-Purpose Foundation Models for Medical Report Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks. We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z)
PlugMed: Improving Specificity in Patient-Centered Medical Dialogue Generation using In-Context Learning [20.437165038293426]
The patient-centered medical dialogue systems strive to offer diagnostic interpretation services to users who are less knowledgeable about medical knowledge. It is difficult for the large language models (LLMs) to guarantee the specificity of responses in spite of its promising performance. Inspired by in-context learning, we propose PlugMed, a Plug-and-Play Medical Dialogue System.
arXiv Detail & Related papers (2023-05-19T08:18:24Z)
Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM) Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing [5.022185333260402]
Diagnostic Reasoning Benchmarks, DR.BENCH, is a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. DR.BENCH is the first clinical suite of tasks designed to be a natural language generation framework to evaluate pre-trained language models.
arXiv Detail & Related papers (2022-09-29T16:05:53Z)
An Evaluation of Generative Pre-Training Model-based Therapy Chatbot for Caregivers [5.2116528363639985]
Generative-based approaches, such as the OpenAI GPT models, could allow for more dynamic conversations in therapy contexts. We built a chatbots using the GPT-2 model and fine-tuned it with 306 therapy session transcripts between family caregivers of individuals with dementia and therapists conducting Problem Solving Therapy. Results showed that the fine-tuned model created more non-word outputs than the pre-trained model.
arXiv Detail & Related papers (2021-07-28T01:01:08Z)
Semi-Supervised Variational Reasoning for Medical Dialogue Generation [70.838542865384]
Two key characteristics are relevant for medical dialogue generation: patient states and physician actions. We propose an end-to-end variational reasoning approach to medical dialogue generation. A physician policy network composed of an action-classifier and two reasoning detectors is proposed for augmented reasoning ability.
arXiv Detail & Related papers (2021-05-13T04:14:35Z)
MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG. We propose two kinds of medical dialogue tasks based on MedDG dataset. One is the next entity prediction and the other is the doctor response generation. Experimental results show that the pre-train language models and other baselines struggle on both tasks with poor performance in our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.