Related papers: KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model

KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model

URL: http://arxiv.org/abs/2502.20350v1
Date: Thu, 27 Feb 2025 18:22:33 GMT
Title: KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model
Authors: Kai Zhang, Rui Zhu, Shutian Ma, Jingwei Xiong, Yejin Kim, Fabricio Murai, Xiaozhong Liu,
Abstract summary: We utilize open-source drug knowledge graphs, clinical trial data, and PubMed publications to construct a comprehensive dataset for the explainable drug discovery task.<n>We introduce textbfKEDRec-LM, an instruction-tuned LLM which distills knowledge from rich medical knowledge corpus for drug recommendation and rationale generation.
Score: 16.712453010522673
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Drug discovery is a critical task in biomedical natural language processing (NLP), yet explainable drug discovery remains underexplored. Meanwhile, large language models (LLMs) have shown remarkable abilities in natural language understanding and generation. Leveraging LLMs for explainable drug discovery has the potential to improve downstream tasks and real-world applications. In this study, we utilize open-source drug knowledge graphs, clinical trial data, and PubMed publications to construct a comprehensive dataset for the explainable drug discovery task, named \textbf{expRxRec}. Furthermore, we introduce \textbf{KEDRec-LM}, an instruction-tuned LLM which distills knowledge from rich medical knowledge corpus for drug recommendation and rationale generation. To encourage further research in this area, we will publicly release\footnote{A copy is attached with this submission} both the dataset and KEDRec-LM.

Related papers

Do "New Snow Tablets" Contain Snow? Large Language Models Over-Rely on Names to Identify Ingredients of Chinese Drugs [79.00288739947406]
Traditional Chinese Medicine (TCM) has seen increasing adoption in healthcare, with specialized Large Language Models (LLMs) emerging to support clinical applications.<n>A fundamental requirement for these models is accurate identification of TCM drug ingredients.<n>Our systematic analysis reveals consistent failure patterns: models often interpret drug names literally, overuse common herbs regardless of relevance, and exhibit erratic behaviors when faced with unfamiliar formulations.
arXiv Detail & Related papers (2025-04-03T17:43:45Z)
Automated, LLM enabled extraction of synthesis details for reticular materials from scientific literature [29.097783516208892]
We introduce a Knowledge Extraction Pipeline (KEP) that automatizes LLM-assisted paragraph classification and information extraction. We demonstrate that LLMs can retrieve chemical information from PDF documents, without the need for fine-tuning or training. The results show the potential of the KEP approach for reducing human annotations and data curation efforts.
arXiv Detail & Related papers (2024-11-05T20:08:23Z)
Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development [24.5979645373074]
Y-Mol is a knowledge-guided LLM designed to accomplish tasks across lead compound discovery, pre-clinic, and clinic prediction. It learns from a corpus of publications, knowledge graphs, and expert-designed synthetic data. Y-Mol significantly outperforms general-purpose LLMs in discovering lead compounds, predicting molecular properties, and identifying drug interaction events.
arXiv Detail & Related papers (2024-10-15T12:39:20Z)
MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance [17.008132675107355]
This paper focuses on the problem of Pharmacovigilance (PhV), where the significance and challenges lie in identifying Adverse Drug Events (ADEs) from diverse text sources. We present MALADE, the first effective collaborative multi-agent system powered by Large Language Models with Retrieval Augmented Generation for ADE extraction from drug label data.
arXiv Detail & Related papers (2024-08-03T22:14:13Z)
Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models [46.05020842978823]
Large Language Models (LLMs) have emerged as powerful tools to navigate this complex data landscape. RAGGED is a comprehensive workflow designed to support investigators with knowledge integration and hypothesis generation.
arXiv Detail & Related papers (2024-07-17T07:44:18Z)
Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications [6.832024637226738]
We propose a new task, which is the translation between drug molecules and corresponding indications. The creation of molecules from indications, or vice versa, will allow for more efficient targeting of diseases.
arXiv Detail & Related papers (2024-02-14T21:33:13Z)
Large Language Model Distilling Medication Recommendation Model [58.94186280631342]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs)<n>Our research aims to transform existing medication recommendation methodologies using LLMs.<n>To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain [19.46334739319516]
We study how the dual logic ability of LLMs is affected during the privatization process in the medical domain. Our results indicate that incorporating general domain dual logic data into LLMs not only enhances LLMs' dual logic ability but also improves their accuracy.
arXiv Detail & Related papers (2023-09-08T08:20:46Z)
Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning. They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health. Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
Active Retrieval Augmented Generation [123.68874416084499]
Augmenting large language models (LMs) by retrieving information from external knowledge resources is one promising solution. Most existing retrieval augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input. We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic method which iteratively uses a prediction of the upcoming sentence to anticipate future content.
arXiv Detail & Related papers (2023-05-11T17:13:40Z)
Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine [68.7814360102644]
We propose the Re$3$Writer method with retrieval-augmented generation and knowledge-grounded reasoning. We demonstrate the effectiveness of our method in generating patient discharge instructions.
arXiv Detail & Related papers (2022-10-23T16:34:39Z)
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.