PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
- URL: http://arxiv.org/abs/2405.19266v2
- Date: Mon, 3 Jun 2024 15:27:10 GMT
- Title: PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
- Authors: Dingkang Yang, Jinjie Wei, Dongling Xiao, Shunli Wang, Tong Wu, Gang Li, Mingcheng Li, Shuaibing Wang, Jiawei Chen, Yue Jiang, Qingyao Xu, Ke Li, Peng Zhai, Lihua Zhang
- Abstract summary: This paper builds PedCorpus, a high-quality dataset of over 300,000 multi-task instructions from pediatric textbooks, guidelines, and knowledge graph resources to fulfil diverse diagnostic demands.
Upon well-designed PedCorpus, we propose PediatricsGPT, the first Chinese pediatric LLM assistant built on a systematic and robust training pipeline.
- Score: 22.175201525690493
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing intelligent pediatric consultation systems offers promising prospects for improving diagnostic efficiency, especially in China, where healthcare resources are scarce. Despite recent advances in Large Language Models (LLMs) for Chinese medicine, their performance is sub-optimal in pediatric applications due to inadequate instruction data and vulnerable training procedures. To address the above issues, this paper builds PedCorpus, a high-quality dataset of over 300,000 multi-task instructions from pediatric textbooks, guidelines, and knowledge graph resources to fulfil diverse diagnostic demands. Upon well-designed PedCorpus, we propose PediatricsGPT, the first Chinese pediatric LLM assistant built on a systematic and robust training pipeline. In the continuous pre-training phase, we introduce a hybrid instruction pre-training mechanism to mitigate the internal-injected knowledge inconsistency of LLMs for medical domain adaptation. Immediately, the full-parameter Supervised Fine-Tuning (SFT) is utilized to incorporate the general medical knowledge schema into the models. After that, we devise a direct following preference optimization to enhance the generation of pediatrician-like humanistic responses. In the parameter-efficient secondary SFT phase, a mixture of universal-specific experts strategy is presented to resolve the competency conflict between medical generalist and pediatric expertise mastery. Extensive results based on the metrics, GPT-4, and doctor evaluations on distinct doctor downstream tasks show that PediatricsGPT consistently outperforms previous Chinese medical LLMs. Our model and dataset will be open-source for community development.
Related papers
- A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations [5.265452667976959]
Large Language Models (LLMs) have demonstrated surprising performance across various natural language processing tasks.
This survey systematically explores how to train medical LLMs based on general LLMs.
arXiv Detail & Related papers (2024-06-14T02:42:20Z)
- MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z)
- AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between Doctor as player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
- Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
- Generative Large Language Models are autonomous practitioners of evidence-based medicine [27.229179922424063]
Evidence-based medicine (EBM) is fundamental to modern clinical practice, requiring clinicians to continually update their knowledge and apply the best clinical evidence in patient care.
The practice of EBM faces challenges due to rapid advancements in medical research, leading to information overload for clinicians.
The integration of artificial intelligence (AI), specifically Generative Large Language Models (LLMs), offers a promising solution towards managing this complexity.
arXiv Detail & Related papers (2024-01-05T15:09:57Z)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases through prompting ChiMed-GPT to perform attitude scales regarding discrimination of patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
- Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review [16.008511195589925]
Large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning.
This paper provides a comprehensive review on the applications and implications of LLMs in medicine.
arXiv Detail & Related papers (2023-11-03T13:51:36Z)
- Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue [4.558040877516838]
We introduce Zhongjing, the first Chinese medical Large Language Model (LLM) that implements an entire training pipeline from continuous pre-training and SFT to Reinforcement Learning from Human Feedback (RLHF).
We construct a Chinese multi-turn medical dialogue dataset of 70,000 authentic doctor-patient dialogues, CMtMedQA, which significantly enhances the model's capability for complex dialogue and proactive inquiry initiation.
arXiv Detail & Related papers (2023-08-07T12:56:13Z)
- Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knowledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
- Self-Supervised Knowledge Assimilation for Expert-Layman Text Style Transfer [63.72621204057025]
Expert-layman text style transfer technologies have the potential to improve communication between scientific communities and the general public.
High-quality information produced by experts is often filled with difficult jargon laypeople struggle to understand.
This is a particularly notable issue in the medical domain, where laypeople are often confused by medical text online.
arXiv Detail & Related papers (2021-10-06T17:57:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.