BianCang: A Traditional Chinese Medicine Large Language Model
- URL: http://arxiv.org/abs/2411.11027v1
- Date: Sun, 17 Nov 2024 10:17:01 GMT
- Title: BianCang: A Traditional Chinese Medicine Large Language Model
- Authors: Sibo Wei, Xueping Peng, Yi-fei Wang, Jiasheng Si, Weiyu Zhang, Wenpeng Lu, Xiaoming Wu, Yinglong Wang
- Abstract summary: BianCang is a TCM-specific large language model (LLM) that first injects domain-specific knowledge and then aligns it through targeted stimulation.
We constructed pre-training corpora, instruction-aligned datasets based on real hospital records, and the ChP-TCM dataset derived from the Pharmacopoeia of the People's Republic of China.
We compiled extensive TCM and medical corpora for continuous pre-training and supervised fine-tuning, building a comprehensive dataset to refine the model's understanding of TCM.
- Score: 22.582027277167047
- Abstract: The rise of large language models (LLMs) has driven significant progress in medical applications, including traditional Chinese medicine (TCM). However, current medical LLMs struggle with TCM diagnosis and syndrome differentiation due to substantial differences between TCM and modern medical theory, and the scarcity of specialized, high-quality corpora. This paper addresses these challenges by proposing BianCang, a TCM-specific LLM, using a two-stage training process that first injects domain-specific knowledge and then aligns it through targeted stimulation. To enhance diagnostic and differentiation capabilities, we constructed pre-training corpora, instruction-aligned datasets based on real hospital records, and the ChP-TCM dataset derived from the Pharmacopoeia of the People's Republic of China. We compiled extensive TCM and medical corpora for continuous pre-training and supervised fine-tuning, building a comprehensive dataset to refine the model's understanding of TCM. Evaluations across 11 test sets involving 29 models and 4 tasks demonstrate the effectiveness of BianCang, offering valuable insights for future research. Code, datasets, and models are available at https://github.com/QLU-NLP/BianCang.
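The two-stage recipe described in the abstract (continuous pre-training to inject TCM knowledge, then supervised fine-tuning to align it) can be pictured with a minimal sketch using the Hugging Face Trainer API. The base model name, data paths, and hyperparameters below are illustrative placeholders, not BianCang's actual configuration.

```python
# Minimal sketch of a two-stage recipe: (1) continuous pre-training on raw
# TCM text, (2) supervised fine-tuning on instruction data. Model name,
# paths, and hyperparameters are illustrative placeholders only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "Qwen/Qwen2-7B"  # assumed base; the paper's actual base may differ
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)
collator = DataCollatorForLanguageModeling(tok, mlm=False)

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=2048)

def train_stage(data_file, output_dir, epochs):
    ds = load_dataset("json", data_files=data_file)["train"]
    ds = ds.map(tokenize, batched=True, remove_columns=ds.column_names)
    Trainer(model=model,
            args=TrainingArguments(output_dir, num_train_epochs=epochs,
                                   per_device_train_batch_size=2),
            train_dataset=ds, data_collator=collator).train()

# Stage 1: inject domain knowledge via plain causal-LM loss on domain text.
train_stage("tcm_corpus.jsonl", "ckpt_cpt", epochs=1)
# Stage 2: align via SFT on instruction/response pairs rendered as text.
train_stage("tcm_instructions.jsonl", "ckpt_sft", epochs=2)
```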
Related papers
- Intelligent Understanding of Large Language Models in Traditional Chinese Medicine Based on Prompt Engineering Framework [3.990633038739491]
We propose TCM-Prompt, a framework that integrates various pre-trained language models (PLMs), templates, tokenization, and verbalization methods.
We conducted experiments on disease classification, syndrome identification, herbal medicine recommendation, and general NLP tasks.
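As a rough illustration of the template-plus-verbalizer idea (a minimal sketch, not the TCM-Prompt implementation), a masked-LM prompt can turn syndrome identification into predicting a label word at a [MASK] slot; the model name, template, and label words below are assumptions for illustration.

```python
# Sketch of prompt-based classification with a template and verbalizer: the
# PLM fills a [MASK] slot and the predicted label word maps back to a class.
# Model, template, and label words are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")

# Verbalizer: map a label token to a class (hypothetical two-way example).
verbalizer = {"寒": "cold-type syndrome", "热": "heat-type syndrome"}

def classify(symptoms: str) -> str:
    # Template: "Symptoms: <...>. Syndrome nature: [MASK]."
    prompt = f"症状:{symptoms}。证候性质:{tok.mask_token}。"
    inputs = tok(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_pos]
    # Score only the verbalizer's label tokens and take the best one.
    ids = {w: tok.convert_tokens_to_ids(w) for w in verbalizer}
    best = max(ids, key=lambda w: logits[ids[w]].item())
    return verbalizer[best]

print(classify("畏寒肢冷,喜温"))  # "aversion to cold, cold limbs, likes warmth"
```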
arXiv Detail & Related papers (2024-10-25T10:24:30Z)
- LCMDC: Large-scale Chinese Medical Dialogue Corpora for Automatic Triage and Medical Consultation [2.04367431902848]
The COVID-19 pandemic underscored major deficiencies in traditional healthcare systems.
Existing studies face two main challenges.
First, large-scale, publicly available, domain-specific medical datasets are scarce due to privacy concerns.
Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations.
arXiv Detail & Related papers (2024-09-27T00:01:32Z)
- TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models [22.76485170022542]
We introduce a new medical question-answering (QA) dataset containing a large body of manually curated instructions for solving Traditional Chinese Medicine examination tasks.
Our TCMD collects a large number of questions across diverse domains, each annotated with its medical subject.
arXiv Detail & Related papers (2024-06-07T13:48:15Z)
- MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models [56.36916128631784]
We introduce MedBench, a comprehensive benchmark for the Chinese medical domain.
This benchmark is composed of four key components: the Chinese Medical Licensing Examination, the Resident Standardization Training Examination, the Doctor In-Charge Qualification Examination, and real-world clinic cases.
We perform extensive experiments and conduct an in-depth analysis from diverse perspectives, culminating in several key findings.
arXiv Detail & Related papers (2023-12-20T07:01:49Z)
- RoKEPG: RoBERTa and Knowledge Enhancement for Prescription Generation of Traditional Chinese Medicine [2.1098688291287475]
We propose a RoBERTa and Knowledge Enhancement model for Prescription Generation of Traditional Chinese Medicine (RoKEPG).
RoKEPG is guided to generate TCM prescriptions by introducing four classes of TCM knowledge through the attention mask matrix.
Experimental results on the publicly available TCM prescription dataset show that RoKEPG improves the F1 metric by about 2% over the baseline model.
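The attention-mask idea can be pictured with a toy sketch: knowledge segments are appended to the input, text tokens may attend to every knowledge block, and knowledge blocks attend only within themselves. This is a generic illustration of knowledge injection via masking, not RoKEPG's exact construction.

```python
# Toy construction of a block-structured attention mask for knowledge
# injection: 1 = attention allowed, 0 = blocked; rows attend to columns.
# A generic illustration, not RoKEPG's exact mask.
import torch

def build_mask(n_text: int, knowledge_sizes: list) -> torch.Tensor:
    total = n_text + sum(knowledge_sizes)
    mask = torch.zeros(total, total)
    mask[:n_text, :n_text] = 1            # text attends to text
    start = n_text
    for k in knowledge_sizes:             # one block per knowledge class
        mask[start:start + k, start:start + k] = 1  # knowledge attends within itself
        mask[:n_text, start:start + k] = 1          # text attends to knowledge
        start += k
    return mask

# Example: 8 prescription-text tokens plus four knowledge segments of 3
# tokens each, mirroring the "four classes of knowledge" at toy scale.
print(build_mask(8, [3, 3, 3, 3]).shape)  # torch.Size([20, 20])
```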
arXiv Detail & Related papers (2023-11-29T01:59:38Z)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases by prompting ChiMed-GPT to complete attitude scales regarding discrimination against patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
- TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine [11.537289359051975]
We propose TCMDA (TCM Domain Adaptation), an approach for efficient pre-training with a domain-specific corpus.
Specifically, we first construct a large TCM-specific corpus, TCM-Corpus-1B, by identifying domain keywords and retrieving relevant text from a general corpus.
TCMDA then leverages LoRA, which freezes the pre-trained model's weights and uses rank decomposition matrices to efficiently train specific dense layers, for both pre-training and fine-tuning.
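A minimal sketch of this kind of LoRA setup using the peft library follows; the base model name, rank, and target modules are placeholders, not TCM-GPT's configuration.

```python
# Sketch of LoRA adaptation as described: base weights stay frozen and only
# low-rank decomposition matrices are trained. Model name, rank, and target
# modules are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model
config = LoraConfig(
    r=8,                        # rank of the decomposition matrices A and B
    lora_alpha=16,              # scaling factor for the LoRA update
    target_modules=["c_attn"],  # attention projection in GPT-2; varies by model
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)  # freezes base weights automatically
model.print_trainable_parameters()    # only the A/B matrices are trainable
# The wrapped model can then be passed to a standard Trainer for the
# domain-adaptation pre-training and fine-tuning stages described above.
```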
arXiv Detail & Related papers (2023-11-03T08:54:50Z)
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine [62.39105735933138]
Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
arXiv Detail & Related papers (2023-04-27T18:29:05Z)
- Multi-Task Learning for Post-transplant Cause of Death Analysis: A Case Study on Liver Transplant [65.85767739748901]
Post-transplant cause of death (CoD) analysis provides a powerful tool for clinical decision making.
Traditional methods like the Model for End-stage Liver Disease (MELD) score and conventional machine learning (ML) methods are limited in CoD analysis.
We propose a novel framework called CoD-MTL that leverages multi-task learning to jointly model the semantic relationships between various CoD prediction tasks.
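The multi-task idea can be sketched as a shared encoder with one prediction head per CoD task, trained on a joint loss; the layer sizes and task names below are illustrative, not the CoD-MTL architecture.

```python
# Toy sketch of multi-task learning: one shared encoder with one prediction
# head per cause-of-death task, trained jointly on a summed loss. Sizes and
# task names are illustrative assumptions.
import torch
import torch.nn as nn

class MultiTaskCoD(nn.Module):
    def __init__(self, n_features: int, task_classes: dict):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.heads = nn.ModuleDict(
            {task: nn.Linear(64, n) for task, n in task_classes.items()})

    def forward(self, x):
        h = self.encoder(x)                  # shared representation
        return {task: head(h) for task, head in self.heads.items()}

model = MultiTaskCoD(32, {"cod_30d": 2, "cod_1yr": 2, "graft_failure": 2})
x = torch.randn(4, 32)                       # a batch of 4 patient vectors
labels = {t: torch.randint(0, 2, (4,)) for t in model.heads}
loss = sum(nn.functional.cross_entropy(out, labels[t])
           for t, out in model(x).items())   # joint loss across tasks
loss.backward()
```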
arXiv Detail & Related papers (2023-03-30T01:31:49Z)
- Competence-based Multimodal Curriculum Learning for Medical Report Generation [98.10763792453925]
We propose a Competence-based Multimodal Curriculum Learning framework (CMCL) to alleviate data bias and make the best use of available data.
Specifically, CMCL simulates the learning process of radiologists and optimizes the model in a step-by-step manner.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that CMCL can be incorporated into existing models to improve their performance.
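Competence-based curriculum learning can be sketched as follows: at each step the model samples only from the easiest fraction of the data, and that fraction grows with a competence schedule. The schedule and difficulty measure below are common choices from the curriculum-learning literature, not necessarily CMCL's.

```python
# Hedged sketch of competence-based curriculum learning: training batches
# are drawn only from examples whose difficulty rank falls within the
# model's current competence, which grows over time.
import random

def competence(t: int, total: int, c0: float = 0.1) -> float:
    # Square-root schedule from c0 up to 1.0, a common CL formulation.
    return min(1.0, (c0**2 + (1 - c0**2) * t / total) ** 0.5)

def curriculum_batches(examples, difficulty, steps, batch_size=4):
    # Sort once by difficulty; at step t, sample only from the easiest
    # competence(t) fraction of the data.
    ranked = sorted(examples, key=difficulty)
    for t in range(steps):
        cutoff = max(batch_size, int(competence(t, steps) * len(ranked)))
        yield random.sample(ranked[:cutoff], batch_size)

# Toy usage: "difficulty" of a report is simply its length.
data = [f"report-{i}" * (i + 1) for i in range(100)]
for batch in curriculum_batches(data, difficulty=len, steps=5):
    pass  # train the report-generation model on `batch` here
```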
arXiv Detail & Related papers (2022-06-24T08:16:01Z)
- Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning [70.71564065885542]
We propose the medical visual language BERT (Medical-VLBERT) model to identify abnormalities in COVID-19 scans.
This model adopts an alternate learning strategy with two procedures: knowledge pretraining and transferring.
For automatic medical report generation on COVID-19 cases, we constructed a dataset of 368 medical findings in Chinese and 1,104 chest CT scans.
arXiv Detail & Related papers (2021-08-11T07:12:57Z)