ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model
Meta-AI (LLaMA) Using Medical Domain Knowledge
- URL: http://arxiv.org/abs/2303.14070v5
- Date: Sat, 24 Jun 2023 15:26:44 GMT
- Authors: Yunxiang Li, Zihan Li, Kai Zhang, Ruilong Dan, Steve Jiang, You Zhang
- Abstract summary: The aim of this research was to create a specialized language model with enhanced accuracy in medical advice.
We achieved this by adapting and refining the Large Language Model Meta-AI (LLaMA) using a large dataset of 100,000 patient-doctor dialogues.
The fine-tuning of the model with real-world patient-doctor interactions significantly improved the model's ability to understand patient needs and provide informed advice.
- Score: 8.584905227066034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The primary aim of this research was to address the limitations observed in
the medical knowledge of prevalent large language models (LLMs) such as
ChatGPT, by creating a specialized language model with enhanced accuracy in
medical advice. We achieved this by adapting and refining the Large Language
Model Meta-AI (LLaMA) using a large dataset of 100,000 patient-doctor dialogues
sourced from a widely used online medical consultation platform. These
conversations were cleaned and anonymized to respect privacy concerns. In
addition to the model refinement, we incorporated a self-directed information
retrieval mechanism, allowing the model to access and utilize real-time
information from online sources like Wikipedia and data from curated offline
medical databases. The fine-tuning of the model with real-world patient-doctor
interactions significantly improved the model's ability to understand patient
needs and provide informed advice. By equipping the model with self-directed
information retrieval from reliable online and offline sources, we observed
substantial improvements in the accuracy of its responses. Our proposed
ChatDoctor represents a significant advancement in medical LLMs, demonstrating
a marked improvement in understanding patient inquiries and providing
accurate advice. Given the high stakes and low error tolerance in the medical
field, such enhancements in providing accurate and reliable information are not
only beneficial but essential.
Related papers
- Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models.
Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
- Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation [1.922611370494431]
This study evaluates the performance of large language models (LLMs) as medical agents in Portuguese.
The InternLM2 model, with initial training on medical data, presented the best overall performance.
DrBode models, derived from ChatBode, exhibited a phenomenon of catastrophic forgetting of acquired medical knowledge.
arXiv Detail & Related papers (2024-09-30T19:10:03Z)
- STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering [58.79671189792399]
STLLaVA-Med is designed to train a policy model capable of auto-generating medical visual instruction data.
We validate the efficacy and data efficiency of STLLaVA-Med across three major medical Visual Question Answering (VQA) benchmarks.
arXiv Detail & Related papers (2024-06-28T15:01:23Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models [3.0874677990361246]
Large Language Models (LLMs) have shown impressive capabilities in generating human-like responses.
We propose MedInsight: a novel retrieval framework that augments LLM inputs with relevant background information.
Experiments on the MTSamples dataset validate MedInsight's effectiveness in generating contextually appropriate medical responses.
arXiv Detail & Related papers (2024-03-13T15:20:30Z)
- OncoGPT: A Medical Conversational Model Tailored with Oncology Domain Expertise on a Large Language Model Meta-AI (LLaMA) [6.486978719354015]
There is limited research on Large Language Models (LLMs) specifically addressing oncology-related queries.
We performed an extensive data collection of online question-answer interactions centered around oncology.
We observed a substantial enhancement in the model's understanding of genuine patient inquiries.
arXiv Detail & Related papers (2024-02-26T18:33:13Z)
- Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
- README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP [9.432205523734707]
We introduce a new task of automatically generating lay definitions, aiming to simplify medical terms into patient-friendly lay language.
We first created the dataset, an extensive collection of over 50,000 unique (medical term, lay definition) pairs and 300,000 mentions.
We have also engineered a data-centric Human-AI pipeline that synergizes data filtering, augmentation, and selection to improve data quality.
arXiv Detail & Related papers (2023-12-24T23:01:00Z)
- MKA: A Scalable Medical Knowledge Assisted Mechanism for Generative Models on Medical Conversation Tasks [3.9571320117430866]
The mechanism aims to assist general neural generative models to achieve better performance on the medical conversation task.
The medical-specific knowledge graph is designed within the mechanism, which contains 6 types of medical-related information.
The evaluation results demonstrate that models combined with our mechanism outperform original methods in multiple automatic evaluation metrics.
arXiv Detail & Related papers (2023-12-05T04:55:54Z)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG.
We propose two kinds of medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines perform poorly on both tasks in our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)