Large Language Model Distilling Medication Recommendation Model
- URL: http://arxiv.org/abs/2402.02803v1
- Date: Mon, 5 Feb 2024 08:25:22 GMT
- Title: Large Language Model Distilling Medication Recommendation Model
- Authors: Qidong Liu, Xian Wu, Xiangyu Zhao, Yuanshao Zhu, Zijian Zhang, Feng Tian, and Yefeng Zheng
- Abstract summary: We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate the high inference cost of LLM-based models, we develop a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
- Score: 61.89754499292561
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recommendation of medication is a vital aspect of intelligent healthcare
systems, as it involves prescribing the most suitable drugs based on a
patient's specific health needs. Unfortunately, many sophisticated models
currently in use tend to overlook the nuanced semantics of medical data,
relying instead almost entirely on identities. Furthermore, these models face
significant
challenges in handling cases involving patients who are visiting the hospital
for the first time, as they lack prior prescription histories to draw upon. To
tackle these issues, we harness the powerful semantic comprehension and
input-agnostic characteristics of Large Language Models (LLMs). Our research
aims to transform existing medication recommendation methodologies using LLMs.
In this paper, we introduce a novel approach called Large Language Model
Distilling Medication Recommendation (LEADER). We begin by creating appropriate
prompt templates that enable LLMs to suggest medications effectively. However,
the straightforward integration of LLMs into recommender systems leads to an
out-of-corpus issue specific to drugs. We handle it by adapting the LLMs with a
novel output layer and a refined tuning loss function. Although LLM-based
models exhibit remarkable capabilities, they are plagued by high computational
costs during inference, which is impractical for the healthcare sector. To
mitigate this, we have developed a feature-level knowledge distillation
technique, which transfers the LLM's proficiency to a more compact model.
Extensive experiments conducted on two real-world datasets, MIMIC-III and
MIMIC-IV, demonstrate that our proposed model is not only effective but also
efficient. To ease the reproducibility of our experiments,
we release the implementation code online.
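As a concrete illustration of the abstract's two core ideas, the sketch below shows (1) a closed-vocabulary, multi-label drug head in place of free-text generation, which keeps every recommendation inside the known drug set and so sidesteps the out-of-corpus issue, and (2) a feature-level distillation loss that pulls a compact student model's representation toward the LLM teacher's. This is a minimal PyTorch sketch under assumed names, dimensions, and prompt wording; it is not the paper's released implementation.

```python
# Minimal sketch only -- all names, dimensions, and the prompt template
# here are illustrative assumptions, not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical prompt template in the spirit of the abstract's
# "prompt templates that enable LLMs to suggest medications".
PROMPT_TEMPLATE = (
    "Patient diagnoses: {diagnoses}. Procedures: {procedures}. "
    "Which medications should be prescribed?"
)

class DrugHead(nn.Module):
    """Multi-label classification head over a fixed drug vocabulary.

    Scoring a closed drug set instead of generating free text keeps
    every recommendation in-vocabulary (the out-of-corpus fix).
    """
    def __init__(self, hidden_dim: int, num_drugs: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_drugs)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hidden_dim) pooled representation of a patient visit
        return self.classifier(h)  # one logit per candidate drug

def distill_loss(student_feat, teacher_feat, logits, labels,
                 proj: nn.Module, alpha: float = 0.5):
    """Multi-label task loss plus feature-level alignment to the teacher."""
    # Project the compact student's feature into the LLM teacher's hidden
    # space (their widths generally differ), then penalize the distance.
    align = F.mse_loss(proj(student_feat), teacher_feat.detach())
    task = F.binary_cross_entropy_with_logits(logits, labels.float())
    return task + alpha * align

# Toy usage with random tensors; shapes are illustrative only.
batch, d_student, d_teacher, num_drugs = 4, 256, 4096, 131
student_feat = torch.randn(batch, d_student)   # from the student encoder
teacher_feat = torch.randn(batch, d_teacher)   # from the tuned LLM teacher
head = DrugHead(d_student, num_drugs)
proj = nn.Linear(d_student, d_teacher)
labels = torch.randint(0, 2, (batch, num_drugs))
loss = distill_loss(student_feat, teacher_feat, head(student_feat),
                    labels, proj)
```

Detaching the teacher features keeps gradients from flowing into the LLM, so only the student and the projection layer are updated during distillation, matching the goal of a cheap-to-run student for inference.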
Related papers
- Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
This study proposes using large language models (LLMs) to elicit expert prior distributions for predictive models.
We compare LLM-elicited and uninformative priors, evaluate whether LLMs truthfully generate parameter distributions, and propose a model selection strategy for in-context learning and prior elicitation.
Our findings show that LLM-elicited prior parameter distributions significantly reduce predictive error compared to uninformative priors in low-data settings.
arXiv Detail & Related papers (2024-11-26T10:13:39Z) - Leveraging Large Language Models for Medical Information Extraction and Query Generation [2.1793134762413433]
This paper introduces a system that integrates large language models (LLMs) into the clinical trial retrieval process.
We evaluate six LLMs for query generation, focusing on open-source and relatively small models that require minimal computational resources.
arXiv Detail & Related papers (2024-10-31T12:01:51Z) - Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z) - Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding [92.32881381717594]
We introduce ALternate Contrastive Decoding (ALCD) to solve hallucination issues in medical information extraction tasks.
ALCD demonstrates significant improvements in resolving hallucination issues compared to conventional decoding methods.
arXiv Detail & Related papers (2024-10-21T07:19:19Z) - Aligning (Medical) LLMs for (Counterfactual) Fairness [2.089191490381739]
Large Language Models (LLMs) have emerged as promising solutions for medical and clinical decision support applications.
LLMs are subject to different types of biases, which can lead to unfair treatment of individuals, worsened health disparities, and reduced trust in AI-augmented medical tools.
We present a new approach for aligning LLMs that applies a preference optimization method within a knowledge distillation framework.
arXiv Detail & Related papers (2024-08-22T01:11:27Z) - XAI4LLM. Let Machine Learning Models and LLMs Collaborate for Enhanced In-Context Learning in Healthcare [16.79952669254101]
We develop a novel method for zero-shot/few-shot in-context learning (ICL) using a multi-layered structured prompt.
We also explore the efficacy of two communication styles between the user and Large Language Models (LLMs).
Our study systematically evaluates the diagnostic accuracy and risk factors, including gender bias and false negative rates.
arXiv Detail & Related papers (2024-05-10T06:52:44Z) - Can LLMs' Tuning Methods Work in Medical Multimodal Domain? [14.659849302397433]
While Large Language Models (LLMs) excel at understanding world knowledge, adapting them to specific subfields requires precise adjustments.
New Parameter-Efficient Fine-Tuning (PEFT) methods have emerged and achieved remarkable success in both LLMs and Large Vision-Language Models (LVLMs).
Can the fine-tuning methods for large models be transferred to the medical field to enhance transfer learning efficiency?
arXiv Detail & Related papers (2024-03-11T03:38:48Z) - Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance [56.04768229686853]
Large Vision-Language Models (LVLMs) tend to hallucinate non-existent objects in images.
We introduce a framework called Mitigating hallucinAtion via classifieR-Free guIdaNcE (MARINE).
MARINE is both training-free and API-free, and can effectively and efficiently reduce object hallucinations during the generation process.
arXiv Detail & Related papers (2024-02-13T18:59:05Z) - Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z) - Improving Small Language Models on PubMedQA via Generative Data Augmentation [4.96649519549027]
Large Language Models (LLMs) have made remarkable advancements in the field of natural language processing.
Small Language Models (SLMs) are known for their efficiency, but they often struggle with limited capacity and training data.
We introduce a novel method aimed at improving SLMs in the medical domain using LLM-based generative data augmentation.
arXiv Detail & Related papers (2023-05-12T23:49:23Z)