Related papers: Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications

URL: http://arxiv.org/abs/2402.09588v2
Date: Fri, 16 Feb 2024 20:55:08 GMT
Title: Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications
Authors: David Oniani, Jordan Hilsman, Chengxi Zang, Junmei Wang, Lianjin Cai, Jan Zawala, Yanshan Wang
Abstract summary: We propose a new task, which is the translation between drug molecules and corresponding indications. The creation of molecules from indications, or vice versa, will allow for more efficient targeting of diseases.
Score: 6.832024637226738
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A drug molecule is a substance that changes the organism's mental or physical state. Every approved drug has an indication, which refers to the therapeutic use of that drug for treating a particular medical condition. While the Large Language Model (LLM), a generative Artificial Intelligence (AI) technique, has recently demonstrated effectiveness in translating between molecules and their textual descriptions, there remains a gap in research regarding their application in facilitating the translation between drug molecules and indications, or vice versa, which could greatly benefit the drug discovery process. The capability of generating a drug from a given indication would allow for the discovery of drugs targeting specific diseases or targets and ultimately provide patients with better treatments. In this paper, we first propose a new task, which is the translation between drug molecules and corresponding indications, and then test existing LLMs on this new task. Specifically, we consider nine variations of the T5 LLM and evaluate them on two public datasets obtained from ChEMBL and DrugBank. Our experiments show the early results of using LLMs for this task and provide a perspective on the state-of-the-art. We also emphasize the current limitations and discuss future work that has the potential to improve the performance on this task. The creation of molecules from indications, or vice versa, will allow for more efficient targeting of diseases and significantly reduce the cost of drug discovery, with the potential to revolutionize the field of drug discovery in the era of generative AI.

Related papers

Do "New Snow Tablets" Contain Snow? Large Language Models Over-Rely on Names to Identify Ingredients of Chinese Drugs [79.00288739947406]
Traditional Chinese Medicine (TCM) has seen increasing adoption in healthcare, with specialized Large Language Models (LLMs) emerging to support clinical applications. A fundamental requirement for these models is accurate identification of TCM drug ingredients. Our systematic analysis reveals consistent failure patterns: models often interpret drug names literally, overuse common herbs regardless of relevance, and exhibit erratic behaviors when faced with unfamiliar formulations.
arXiv Detail & Related papers (2025-04-03T17:43:45Z)
PharmAgents: Building a Virtual Pharma with Large Language Model Agents [19.589707628042422]
We introduce PharmAgents, a virtual pharmaceutical ecosystem driven by multi-agent collaboration. The system integrates explainable, LLM-driven agents equipped with specialized machine learning models and computational tools. It identifies potential therapeutic targets, discovers promising lead compounds, enhances binding affinity and key molecular properties, and performs in silico analyses of toxicity and synthetic feasibility.
arXiv Detail & Related papers (2025-03-28T06:02:53Z)
KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model [16.712453010522673]
We utilize open-source drug knowledge graphs, clinical trial data, and PubMed publications to construct a comprehensive dataset for the explainable drug discovery task. We introduce KEDRec-LM, an instruction-tuned LLM which distills knowledge from a rich medical knowledge corpus for drug recommendation and rationale generation.
arXiv Detail & Related papers (2025-02-27T18:22:33Z)
Small Molecule Drug Discovery Through Deep Learning: Progress, Challenges, and Opportunities [34.72068278499029]
With the rapid development of deep learning (DL) techniques, DL-based small molecule drug discovery methods have achieved excellent performance. This paper systematically summarizes and generalizes the recent key tasks and representative techniques in DL-based small molecule drug discovery.
arXiv Detail & Related papers (2025-02-13T05:24:52Z)
DrugAgent: Automating AI-aided Drug Discovery Programming through LLM Multi-Agent Collaboration [24.65716292347949]
DrugAgent is a multi-agent framework that automates machine learning (ML) programming for drug discovery tasks. Our results show that DrugAgent consistently outperforms leading baselines.
arXiv Detail & Related papers (2024-11-24T03:06:59Z)
Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development [24.5979645373074]
Y-Mol is a knowledge-guided LLM designed to accomplish tasks across lead compound discovery, preclinical prediction, and clinical prediction. It learns from a corpus of publications, knowledge graphs, and expert-designed synthetic data. Y-Mol significantly outperforms general-purpose LLMs in discovering lead compounds, predicting molecular properties, and identifying drug interaction events.
arXiv Detail & Related papers (2024-10-15T12:39:20Z)
Large Language Models in Drug Discovery and Development: From Disease Mechanisms to Clinical Trials [49.19897427783105]
The integration of Large Language Models (LLMs) into the drug discovery and development field marks a significant paradigm shift. We investigate how these advanced computational models can uncover target-disease linkage, interpret complex biomedical data, enhance drug molecule design, predict drug efficacy and safety profiles, and facilitate clinical trial processes.
arXiv Detail & Related papers (2024-09-06T02:03:38Z)
MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance [17.008132675107355]
This paper focuses on the problem of Pharmacovigilance (PhV), where the significance and challenges lie in identifying Adverse Drug Events (ADEs) from diverse text sources. We present MALADE, the first effective collaborative multi-agent system powered by Large Language Models with Retrieval Augmented Generation for ADE extraction from drug label data.
arXiv Detail & Related papers (2024-08-03T22:14:13Z)
DrugCLIP: Contrastive Drug-Disease Interaction For Drug Repurposing [4.969453745531116]
DrugCLIP is a contrastive learning method that learns drug-disease interactions without negative labels. We have curated a drug repurposing dataset based on real-world clinical trial records.
arXiv Detail & Related papers (2024-07-02T13:41:59Z)
Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs). Our research aims to transform existing medication recommendation methodologies using LLMs. To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
NeuroCADR: Drug Repurposing to Reveal Novel Anti-Epileptic Drug Candidates Through an Integrated Computational Approach [0.0]
Drug repurposing is an emerging approach for drug discovery involving the reassignment of existing drugs for novel purposes. The proposed algorithm, NeuroCADR, is a novel system for drug repurposing that combines k-nearest neighbor (KNN) algorithms, random forest classification, and decision trees. Data was sourced from several databases covering interactions between diseases, symptoms, genes, and affiliated drug molecules, and then compiled into datasets expressed in binary. NeuroCADR identified novel drug candidates for epilepsy that can be further approved through clinical trials.
arXiv Detail & Related papers (2023-09-04T03:21:43Z)
SynerGPT: In-Context Learning for Personalized Drug Synergy Prediction and Drug Design [64.69434941796904]
We propose a novel setting and models for in-context drug synergy learning. We are given a small "personalized dataset" of 10-20 drug synergy relationships in the context of specific cancer cell targets. Our goal is to predict additional drug synergy relationships in that context.
arXiv Detail & Related papers (2023-06-19T17:03:46Z)
Knowledge-Driven New Drug Recommendation [88.35607943144261]
We develop a drug-dependent multi-phenotype few-shot learner to bridge the gap between existing and new drugs. EDGE eliminates the false-negative supervision signal using an external drug-disease knowledge base. Results show that EDGE achieves 7.3% improvement on the ROC-AUC score over the best baseline.
arXiv Detail & Related papers (2022-10-11T16:07:52Z)
SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery. Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning [61.74958429818077]
MolDesigner is a human-in-the-loop web user-interface (UI) for drug developers. A developer can draw a drug molecule in the interface. In the backend, more than 17 state-of-the-art DL models generate predictions on important indices that are crucial for a drug's efficacy.
arXiv Detail & Related papers (2020-10-05T21:25:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.