PMC-LLaMA: Towards Building Open-source Language Models for Medicine
- URL: http://arxiv.org/abs/2304.14454v3
- Date: Fri, 25 Aug 2023 14:08:38 GMT
- Title: PMC-LLaMA: Towards Building Open-source Language Models for Medicine
- Authors: Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
- Abstract summary: Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding.
LLMs struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge.
We describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
- Score: 62.39105735933138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Large Language Models (LLMs) have showcased remarkable capabilities
in natural language understanding. While demonstrating proficiency in everyday
conversations and question-answering situations, these models frequently
struggle in domains that require precision, such as medical applications, due
to their lack of domain-specific knowledge. In this paper, we describe the
procedure for building a powerful, open-source language model specifically
designed for medical applications, termed PMC-LLaMA. Our contributions are
threefold: (i) we systematically investigate the process of adapting a
general-purpose foundation language model to the medical domain; this involves
data-centric knowledge injection through the integration of 4.8M biomedical
academic papers and 30K medical textbooks, as well as comprehensive fine-tuning
for alignment with domain-specific instructions; (ii) we contribute a
large-scale, comprehensive dataset for instruction tuning, encompassing medical
question-answering (QA), rationales for reasoning, and conversational
dialogues, and comprising a total of 202M tokens; (iii) we conduct thorough
ablation studies to demonstrate the effectiveness of each proposed component.
When evaluated on various public medical question-answering benchmarks, our
lightweight PMC-LLaMA, which consists of only 13 billion parameters, exhibits
superior performance, even surpassing ChatGPT. All models, code, and datasets
can be found at https://github.com/chaoyi-wu/PMC-LLaMA.
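The adaptation recipe described above, continued pretraining on biomedical corpora followed by instruction tuning on QA, rationale, and dialogue data, can be approximated with standard tooling. The following is a minimal sketch using Hugging Face Transformers; the base checkpoint, data file names, instruction template, and hyperparameters are illustrative assumptions and are not taken from the PMC-LLaMA release.

```python
# Minimal sketch of the two-stage adaptation described in the abstract:
# (1) data-centric knowledge injection on biomedical text, (2) instruction tuning.
# Model ID, file names, template, and hyperparameters below are illustrative
# assumptions, not the actual PMC-LLaMA training configuration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "huggyllama/llama-13b"  # assumed general-purpose base checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

def tokenize(batch):
    # Plain causal-LM tokenization; truncation keeps the sketch simple.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

# Stage 1 corpus: biomedical papers and textbooks, one text field per record.
corpus = load_dataset("json", data_files="biomedical_corpus.jsonl")["train"]
corpus = corpus.map(tokenize, batched=True, remove_columns=corpus.column_names)

def format_instruction(example):
    # Hypothetical prompt template for QA / rationale / dialogue records.
    return {"text": f"### Instruction:\n{example['instruction']}\n"
                    f"### Response:\n{example['response']}"}

# Stage 2 corpus: instruction-tuning records for domain alignment.
instructions = load_dataset("json", data_files="medical_instructions.jsonl")["train"]
instructions = instructions.map(format_instruction)
instructions = instructions.map(tokenize, batched=True,
                                remove_columns=instructions.column_names)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

for stage, data, out_dir in [("pretrain", corpus, "pmc_llama_stage1"),
                             ("instruct", instructions, "pmc_llama_stage2")]:
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=16,
                               learning_rate=2e-5, bf16=True,
                               logging_steps=50),
        train_dataset=data,
        data_collator=collator,
    )
    trainer.train()
```

The two loop iterations mirror contributions (i) and (ii) above: corpus-level knowledge injection followed by alignment with domain-specific instructions. For the authors' actual training configuration, refer to the repository linked above.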
Related papers
- A Survey of Medical Vision-and-Language Applications and Their Techniques [48.268198631277315]
Medical vision-and-language models (MVLMs) have attracted substantial interest due to their capability to offer a natural language interface for interpreting complex medical data.
Here, we provide a comprehensive overview of MVLMs and the various medical tasks to which they have been applied.
We also examine the datasets used for these tasks and compare the performance of different models based on standardized evaluation metrics.
arXiv Detail & Related papers (2024-11-19T03:27:05Z)
- Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation [1.922611370494431]
This study evaluates the performance of large language models (LLMs) as medical agents in Portuguese.
The InternLM2 model, with initial training on medical data, presented the best overall performance.
DrBode models, derived from ChatBode, exhibited catastrophic forgetting of previously acquired medical knowledge.
arXiv Detail & Related papers (2024-09-30T19:10:03Z)
- Towards Evaluating and Building Versatile Large Language Models for Medicine [57.49547766838095]
We present MedS-Bench, a benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts.
MedS-Bench spans 11 high-level clinical tasks, including clinical report summarization, treatment recommendations, diagnosis, named entity recognition, and medical concept explanation.
MedS-Ins, the accompanying instruction-tuning dataset, comprises 58 medically oriented language corpora, totaling 13.5 million samples across 122 tasks.
arXiv Detail & Related papers (2024-08-22T17:01:34Z)
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI [67.09501109871351]
Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals.
GMAI-MMBench is, to date, the most comprehensive general medical AI benchmark, featuring a well-categorized data structure and multiple perceptual granularities.
It is constructed from 284 datasets across 38 medical image modalities, 18 clinical-related tasks, 18 departments, and 4 perceptual granularities in a Visual Question Answering (VQA) format.
arXiv Detail & Related papers (2024-08-06T17:59:21Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation [22.986061896641083]
MedEval is a multi-level, multi-task, and multi-domain medical benchmark to facilitate the development of language models for healthcare.
With 22,779 collected sentences and 21,228 reports, we provide expert annotations at multiple levels, offering a granular potential usage of the data.
arXiv Detail & Related papers (2023-10-21T18:59:41Z)
- A Unified Framework of Medical Information Annotation and Extraction for Chinese Clinical Text [1.4841452489515765]
Current state-of-the-art (SOTA) NLP models are highly integrated with deep learning techniques.
This study presents an engineering framework of medical entity recognition, relation extraction and attribute extraction.
arXiv Detail & Related papers (2022-03-08T03:19:16Z)
- Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources [36.960318276653986]
We introduce a multi-target MRC task for the medical domain, whose goal is to predict answers to medical questions and the corresponding support sentences simultaneously.
We propose the Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models.
Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.
arXiv Detail & Related papers (2020-08-24T11:23:28Z)