Knowledge-tuning Large Language Models with Structured Medical Knowledge
Bases for Reliable Response Generation in Chinese
- URL: http://arxiv.org/abs/2309.04175v1
- Date: Fri, 8 Sep 2023 07:42:57 GMT
- Title: Knowledge-tuning Large Language Models with Structured Medical Knowledge
Bases for Reliable Response Generation in Chinese
- Authors: Haochun Wang, Sendong Zhao, Zewen Qiang, Zijian Li, Nuwa Xi, Yanrui
Du, MuZhen Cai, Haoqiang Guo, Yuhan Chen, Haoming Xu, Bing Qin, Ting Liu
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable success in diverse natural language processing (NLP) tasks in general domains.
We propose knowledge-tuning, which leverages structured medical knowledge bases for the LLMs to grasp domain knowledge efficiently.
We also release cMedKnowQA, a Chinese medical knowledge question-answering dataset constructed from medical knowledge bases.
- Score: 29.389119917322102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable success in diverse
natural language processing (NLP) tasks in general domains. However, LLMs
sometimes generate responses containing hallucinations about medical facts due to
limited domain knowledge. Such shortcomings pose potential risks in the
utilization of LLMs within medical contexts. To address this challenge, we
propose knowledge-tuning, which leverages structured medical knowledge bases
for the LLMs to grasp domain knowledge efficiently and facilitate reliable
response generation. We also release cMedKnowQA, a Chinese medical knowledge
question-answering dataset constructed from medical knowledge bases to assess
the medical knowledge proficiency of LLMs. Experimental results show that the
LLMs knowledge-tuned with cMedKnowQA exhibit higher accuracy in response
generation than those trained with vanilla instruction-tuning, offering a new
and reliable way for the domain adaptation of LLMs.
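The abstract does not spell out the knowledge-tuning data pipeline, but the core idea of turning structured medical knowledge-base entries into question-answer pairs that an LLM can be tuned on can be sketched as below. This is a minimal illustration under assumed field names and question templates (KBEntry, TEMPLATES, and kb_to_qa are all hypothetical), not the cMedKnowQA construction procedure itself.

```python
# Illustrative sketch only: one plausible way to turn structured medical
# knowledge-base entries into instruction-style QA pairs for knowledge-tuning.
# Field names and question templates are assumptions, not the paper's pipeline.
import json
from dataclasses import dataclass


@dataclass
class KBEntry:
    entity: str     # e.g. a disease name
    attribute: str  # e.g. "symptom", "treatment", "cause"
    value: str      # the structured fact stored in the knowledge base


# Hypothetical question templates keyed by attribute type.
TEMPLATES = {
    "symptom": "What are the common symptoms of {entity}?",
    "treatment": "How is {entity} usually treated?",
    "cause": "What causes {entity}?",
}


def kb_to_qa(entries):
    """Convert knowledge-base entries into instruction-tuning examples."""
    examples = []
    for e in entries:
        template = TEMPLATES.get(e.attribute)
        if template is None:
            continue  # skip attributes we have no template for
        examples.append({
            "instruction": template.format(entity=e.entity),
            "input": "",
            # The reference answer is grounded directly in the KB value,
            # which is what ties the tuned model's output to verified facts.
            "output": e.value,
        })
    return examples


if __name__ == "__main__":
    kb = [KBEntry("influenza", "symptom",
                  "fever, cough, sore throat, and muscle aches")]
    print(json.dumps(kb_to_qa(kb), ensure_ascii=False, indent=2))
```

Grounding each reference answer in a knowledge-base value, rather than relying on free-form instruction data alone, is the property the abstract credits for more reliable responses than vanilla instruction-tuning.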
Related papers
- Reliable and diverse evaluation of LLM medical knowledge mastery [6.825565574784612]
We propose a novel framework that generates reliable and diverse test samples to evaluate medical-specific LLMs.
We use our proposed framework to systematically investigate the mastery of medical factual knowledge of 12 well-known LLMs.
arXiv Detail & Related papers (2024-09-22T03:13:38Z)
- MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering [5.065947993017158]
Large Language Models (LLMs) have demonstrated an impressive ability to encode knowledge during pre-training on large text corpora.
We examine the capability of LLMs to exhibit medical knowledge recall by constructing a novel dataset derived from systematic reviews.
arXiv Detail & Related papers (2024-06-09T16:33:28Z)
- KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs.
KnowTuning generates more facts with a lower factual error rate under fine-grained fact evaluation.
arXiv Detail & Related papers (2024-02-17T02:54:32Z)
- EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge [28.409333447902693]
Large language models (LLMs) achieve remarkable performance in both comprehension and generation.
In this work, we focus on the particular disease of epilepsy in the Japanese language and introduce a customized LLM termed EpilepsyLLM.
The datasets contain basic knowledge about the disease, common treatment methods and drugs, and important notes for daily life and work.
arXiv Detail & Related papers (2024-01-11T13:39:00Z)
- Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-based Retrofitting [51.7049140329611]
This paper proposes Knowledge Graph-based Retrofitting (KGR) to mitigate factual hallucination during the reasoning process.
Experiments show that KGR can significantly improve the performance of LLMs on factual QA benchmarks.
arXiv Detail & Related papers (2023-11-22T11:08:38Z)
- A Survey of Large Language Models in Medicine: Progress, Application, and Challenge [85.09998659355038]
Large language models (LLMs) have received substantial attention due to their capabilities for understanding and generating human language.
This review aims to provide a detailed overview of the development and deployment of LLMs in medicine.
arXiv Detail & Related papers (2023-11-09T02:55:58Z)
- PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain [24.411904114158673]
We rebuild the Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark into a large-scale prompt-tuning benchmark, PromptCBLUE.
Our benchmark is a suitable test-bed and an online platform for evaluating Chinese LLMs' multi-task capabilities on a wide range of biomedical tasks.
arXiv Detail & Related papers (2023-10-22T02:20:38Z)
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs [54.22416829200613]
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that the current methods for knowledge editing using raw documents are not effective in yielding satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
- Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and to check its own outputs (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
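The self-verification framework in the last entry is described only at a high level here; the sketch below illustrates the general two-pass idea (draft an extraction, then prompt the same model for supporting evidence). The call_llm parameter and both prompts are placeholders for illustration, not the interface or prompts used in that paper.

```python
# Minimal sketch of a self-verification loop: the model first answers a
# question about a clinical note, then is asked to quote the evidence that
# supports its own answer. `call_llm` stands in for any text-in/text-out LLM.
from typing import Callable


def extract_with_self_verification(
    note: str,
    question: str,
    call_llm: Callable[[str], str],
) -> dict:
    """Two-pass extraction: draft an answer, then request provenance for it."""
    # Pass 1: draft extraction from the clinical note.
    draft = call_llm(
        f"Clinical note:\n{note}\n\nQuestion: {question}\nAnswer concisely:"
    )
    # Pass 2: ask the same model to check its answer against the source note
    # and quote the sentence that supports it (or flag it as unsupported).
    verdict = call_llm(
        "You previously answered a question about a clinical note.\n"
        f"Note:\n{note}\n\nQuestion: {question}\nYour answer: {draft}\n"
        "Quote the exact sentence from the note that supports this answer, "
        "or reply UNSUPPORTED if no such sentence exists."
    )
    return {
        "answer": draft,
        "evidence": verdict,
        "supported": "UNSUPPORTED" not in verdict,
    }
```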
This list is automatically generated from the titles and abstracts of the papers on this site.