Distilling Large Language Models for Biomedical Knowledge Extraction: A
Case Study on Adverse Drug Events
- URL: http://arxiv.org/abs/2307.06439v1
- Date: Wed, 12 Jul 2023 20:08:48 GMT
- Title: Distilling Large Language Models for Biomedical Knowledge Extraction: A
Case Study on Adverse Drug Events
- Authors: Yu Gu, Sheng Zhang, Naoto Usuyama, Yonas Woldesenbet, Cliff Wong,
Praneeth Sanapathi, Mu Wei, Naveen Valluri, Erika Strandberg, Tristan
Naumann, Hoifung Poon
- Abstract summary: We study how large language models (LLMs) can be used to scale biomedical knowledge curation.
We find that substantial gains can be attained over out-of-box LLMs, with additional advantages such as cost, efficiency, and white-box model access.
- Score: 17.73671383380315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs), such as GPT-4, have demonstrated remarkable
capabilities across a wide range of tasks, including health applications. In
this paper, we study how LLMs can be used to scale biomedical knowledge
curation. We find that while LLMs already possess decent competency in
structuring biomedical text, by distillation into a task-specific student model
through self-supervised learning, substantial gains can be attained over
out-of-box LLMs, with additional advantages such as cost, efficiency, and
white-box model access.
We conduct a case study on adverse drug event (ADE) extraction, which is an
important area for improving care. On standard ADE extraction evaluation, a
GPT-3.5 distilled PubMedBERT model attained comparable accuracy as supervised
state-of-the-art models without using any labeled data. Despite being over
1,000 times smaller, the distilled model outperformed its teacher GPT-3.5 by
over 6 absolute points in F1 and GPT-4 by over 5 absolute points.
Ablation studies on distillation model choice (e.g., PubMedBERT vs BioGPT)
and ADE extraction architecture shed light on best practice for biomedical
knowledge extraction. Similar gains were attained by distillation for other
standard biomedical knowledge extraction tasks such as gene-disease
associations and protected health information, further illustrating the promise
of this approach.
Related papers
- Biomedical Large Languages Models Seem not to be Superior to Generalist Models on Unseen Medical Data [3.469567586411153]
Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data.
This study evaluates the performance of biomedically fine-tuned LLMs against their general-purpose counterparts on a variety of clinical tasks.
arXiv Detail & Related papers (2024-08-25T13:36:22Z) - Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources [13.750202656564907]
Adverse event (AE) extraction is crucial for monitoring and analyzing the safety profiles of immunizations.
This study aims to evaluate the effectiveness of large language models (LLMs) and traditional deep learning models in AE extraction.
arXiv Detail & Related papers (2024-06-26T03:56:21Z) - BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers [48.21255861863282]
BMRetriever is a series of dense retrievers for enhancing biomedical retrieval.
BMRetriever exhibits strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger.
arXiv Detail & Related papers (2024-04-29T05:40:08Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - A comparative study of zero-shot inference with large language models
and supervised modeling in breast cancer pathology classification [1.4715634464004446]
Large language models (LLMs) have demonstrated promising transfer learning capability.
LLMs demonstrated the potential to speed up the execution of clinical NLP studies by reducing the need for curating large annotated datasets.
This may result in an increase in the utilization of NLP-based variables and outcomes in observational clinical studies.
arXiv Detail & Related papers (2024-01-25T02:05:31Z) - Diversifying Knowledge Enhancement of Biomedical Language Models using
Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z) - High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models [1.9665865095034865]
We formulate the relation extraction task as binary classifications for large language models.
We designate the main title as the tail entity and explicitly incorporate it into the context.
Longer contents are sliced into text chunks, embedded, and retrieved with additional embedding models.
arXiv Detail & Related papers (2023-12-13T16:43:41Z) - BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z) - Evaluation of ChatGPT Family of Models for Biomedical Reasoning and
Classification [6.163540203358258]
This study investigates the performance of large language models (LLMs) in biomedical tasks beyond question-answering.
Because no patient data can be passed to the OpenAI API public interface, we evaluated model performance with over 10000 samples.
We found that fine-tuning for two fundamental NLP tasks remained the best strategy.
arXiv Detail & Related papers (2023-04-05T15:11:25Z) - SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for
Lightweight Skin Lesion Classification Using Dermoscopic Images [62.60956024215873]
Skin cancer is one of the most common types of malignancy, affecting a large population and causing a heavy economic burden worldwide.
Most studies in skin cancer detection keep pursuing high prediction accuracies without considering the limitation of computing resources on portable devices.
This study specifically proposes a novel method, termed SSD-KD, that unifies diverse knowledge into a generic KD framework for skin diseases classification.
arXiv Detail & Related papers (2022-03-22T06:54:29Z) - Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.