DISC-MedLLM: Bridging General Large Language Models and Real-World
Medical Consultation
- URL: http://arxiv.org/abs/2308.14346v1
- Date: Mon, 28 Aug 2023 06:41:49 GMT
- Authors: Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong,
Jiajie Peng, Xuanjing Huang, Zhongyu Wei
- Abstract summary: We propose DISC-MedLLM to provide accurate and truthful medical responses in end-to-end conversational healthcare services.
We employ three strategies: utilizing medical knowledge-graphs, reconstructing real-world dialogues, and incorporating human-guided preference rephrasing.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose DISC-MedLLM, a comprehensive solution that leverages Large
Language Models (LLMs) to provide accurate and truthful medical responses in
end-to-end conversational healthcare services. To construct high-quality
Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing
medical knowledge-graphs, reconstructing real-world dialogues, and
incorporating human-guided preference rephrasing. These datasets are
instrumental in training DISC-MedLLM, surpassing existing medical LLMs in both
single-turn and multi-turn consultation scenarios. Extensive experimental
results demonstrate the effectiveness of the proposed model in bridging the gap
between general language models and real-world medical consultation.
Additionally, we release the constructed dataset and model weights to further
contribute to research and development. Further details and resources can be
found at https://github.com/FudanDISC/DISC-MedLLM
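
The first of the three SFT data-construction strategies above (mining medical knowledge graphs for training samples) can be illustrated with a minimal, hypothetical sketch. The relation names, question/answer templates, and field names below are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch: turning medical knowledge-graph triples into
# single-turn instruction/output pairs for SFT. Templates and relation
# names are illustrative, not taken from the DISC-MedLLM pipeline.

def triple_to_sft_sample(head: str, relation: str, tail: str) -> dict:
    """Convert a (head, relation, tail) KG triple into one SFT sample."""
    templates = {
        "has_symptom": (
            "What are common symptoms of {h}?",
            "A common symptom of {h} is {t}.",
        ),
        "treated_by": (
            "How is {h} usually treated?",
            "{h} is commonly treated with {t}.",
        ),
    }
    question_tpl, answer_tpl = templates[relation]
    return {
        "instruction": question_tpl.format(h=head, t=tail),
        "output": answer_tpl.format(h=head, t=tail),
    }

sample = triple_to_sft_sample("influenza", "has_symptom", "fever")
print(sample["instruction"])  # What are common symptoms of influenza?
```

In practice such template-generated pairs would then be rephrased (the paper's human-guided preference rephrasing step) so the supervised targets read like natural doctor responses rather than slot-filled templates.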
Related papers
- MedThink: Inducing Medical Large-scale Visual Language Models to Hallucinate Less by Thinking More [20.59298361626719]
Large Vision Language Models (LVLMs) are applied to multimodal medical generative tasks.
LVLMs suffer from significant model hallucination issues.
In this paper, we introduce a method that mimics human cognitive processes to construct fine-grained instruction pairs.
arXiv Detail & Related papers (2024-06-17T12:03:32Z)
- AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor, as the player, and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
- MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries [16.101969130235055]
We introduce the Multimodal Medical Codemixed Question Summarization (MMCQS) dataset.
This dataset combines Hindi-English codemixed medical queries with visual aids.
Our dataset, code, and pre-trained models will be made publicly available.
arXiv Detail & Related papers (2024-01-03T07:58:25Z)
- CLIPSyntel: CLIP and LLM Synergy for Multimodal Question Summarization in Healthcare [16.033112094191395]
We introduce the Multimodal Medical Question Summarization (MMQS) dataset.
This dataset pairs medical queries with visual aids, facilitating a richer and more nuanced understanding of patient needs.
We also propose a framework, consisting of four modules that identify medical disorders, generate relevant context, filter medical concepts, and craft visually aware summaries.
arXiv Detail & Related papers (2023-12-16T03:02:05Z)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases by prompting ChiMed-GPT to complete attitude scales regarding discrimination against patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
- Customizing General-Purpose Foundation Models for Medical Report Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
- Two heads are better than one: Enhancing medical representations by pre-training over structured and unstructured electronic health records [23.379185792773875]
We propose a unified deep learning-based medical pre-trained language model, named UMM-PLM, to automatically learn representative features from multimodal EHRs.
We first developed parallel unimodal information representation modules to capture unimodal-specific characteristics, where unimodal representations were learned from each data source separately.
A cross-modal module was further introduced to model the interactions between different modalities.
arXiv Detail & Related papers (2022-01-25T06:14:49Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release MedDG, a large-scale, high-quality medical dialogue dataset covering 12 types of common gastrointestinal diseases.
We propose two medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines struggle on both tasks, performing poorly on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.