Towards Unified AI Drug Discovery with Multiple Knowledge Modalities
- URL: http://arxiv.org/abs/2305.01523v2
- Date: Sat, 14 Oct 2023 05:49:33 GMT
- Title: Towards Unified AI Drug Discovery with Multiple Knowledge Modalities
- Authors: Yizhen Luo, Xing Yi Liu, Kai Yang, Kui Huang, Massimo Hong, Jiahuan Zhang, Yushuai Wu, Zaiqing Nie
- Abstract summary: We propose KEDD, a unified, end-to-end, and multimodal deep learning framework.
It optimally incorporates both structured and unstructured knowledge for vast AI drug discovery tasks.
Our framework achieves a deeper understanding of molecule entities and brings significant improvements over state-of-the-art methods.
- Score: 5.232382666884214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, AI models that mine intrinsic patterns from molecular
structures and protein sequences have shown promise in accelerating drug
discovery. However, these methods partly lag behind real-world pharmaceutical
approaches of human experts that additionally grasp structured knowledge from
knowledge bases and unstructured knowledge from biomedical literature. To
bridge this gap, we propose KEDD, a unified, end-to-end, and multimodal deep
learning framework that optimally incorporates both structured and unstructured
knowledge for vast AI drug discovery tasks. The framework first extracts
underlying characteristics from heterogeneous inputs, and then applies
multimodal fusion for accurate prediction. To mitigate the problem of missing
modalities, we leverage multi-head sparse attention and a modality masking
mechanism to extract relevant information robustly. Benefiting from integrated
knowledge, our framework achieves a deeper understanding of molecule entities,
brings significant improvements over state-of-the-art methods on a wide range
of tasks and benchmarks, and reveals its promising potential in assisting
real-world drug discovery.
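The abstract's idea of fusing several modalities while tolerating missing ones can be illustrated with a minimal sketch. This is not the KEDD implementation: the function names (`sparsemax`, `fuse`, `random_modality_mask`), the single attention head, the scaled dot-product scoring, and the choice of sparsemax as the sparse attention are all assumptions made for illustration.

```python
import numpy as np

def sparsemax(z):
    """Sparse alternative to softmax: Euclidean projection of z onto the simplex."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, z.size + 1)
    support = 1 + ks * z_sorted > cumsum     # entries that keep non-zero weight
    k = ks[support][-1]                      # size of the support
    tau = (cumsum[k - 1] - 1) / k            # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

def fuse(query, modality_feats, available):
    """Fuse modality embeddings with one attention head (hypothetical helper).

    query: (d,) vector; modality_feats: (m, d) array; available: (m,) booleans.
    Missing modalities always receive zero attention weight.
    """
    available = np.asarray(available, dtype=bool)
    if not available.any():
        raise ValueError("at least one modality must be available")
    scores = modality_feats @ query / np.sqrt(query.size)
    weights = np.zeros(available.size)
    weights[available] = sparsemax(scores[available])
    return weights @ modality_feats, weights

def random_modality_mask(available, drop_prob, rng):
    """Training-time masking (assumed scheme): hide modalities at random, keep at least one."""
    available = np.asarray(available, dtype=bool)
    keep = available & (rng.random(available.size) >= drop_prob)
    if not keep.any():
        keep[rng.choice(np.flatnonzero(available))] = True
    return keep

# Example: three modalities (e.g. structure, knowledge graph, text), the second one missing.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4))
query = rng.normal(size=4)
fused, weights = fuse(query, feats, [True, False, True])
```

Randomly masking modalities during training exposes the model to the incomplete inputs it will see at inference time, and restricting attention to the available modalities keeps absent ones from contributing to the fused vector.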
Related papers
- Enhancing Biomedical Knowledge Discovery for Diseases: An End-To-End Open-Source Framework [28.68816381566995]
We introduce an open-source framework designed to construct knowledge around specific diseases directly from raw text.
To facilitate research in disease-related knowledge discovery, we create two annotated datasets focused on Rett syndrome and Alzheimer's disease.
arXiv Detail & Related papers (2024-07-18T13:20:53Z)
- Mixture of Modality Knowledge Experts for Robust Multi-modal Knowledge Graph Completion [51.80447197290866]
Multi-modal knowledge graph completion (MMKGC) aims to automatically discover new knowledge triples in the given multi-modal knowledge graphs (MMKGs).
Existing methods tend to focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts.
We introduce a novel MMKGC framework with Mixture of Modality Knowledge experts (MoMoK) to learn adaptive multi-modal embedding under intricate relational contexts.
arXiv Detail & Related papers (2024-05-27T06:36:17Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery [19.870192393785043]
Large Language Models (LLMs) offer promise in reshaping interactions with complex molecular data.
Our novel contribution, InstructMol, effectively aligns molecular structures with natural language via an instruction-tuning approach.
InstructMol showcases substantial performance improvements in drug discovery-related molecular tasks.
arXiv Detail & Related papers (2023-11-27T16:47:51Z)
- ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z)
- HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting [33.1455954220194]
HiPrompt is a supervision-efficient knowledge fusion framework.
It elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts.
Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.
arXiv Detail & Related papers (2023-04-12T16:54:26Z)
- Knowledge-augmented Graph Machine Learning for Drug Discovery: A Survey [6.288056740658763]
Graph Machine Learning (GML) has gained considerable attention for its exceptional ability to model graph-structured biomedical data.
Recent studies have proposed integrating external biomedical knowledge into the GML pipeline to realise more precise and interpretable drug discovery.
arXiv Detail & Related papers (2023-02-16T12:38:01Z)
- Structure-based drug discovery with deep learning [0.0]
Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology.
This review summarizes the most prominent algorithmic concepts in structure-based deep learning for drug discovery.
arXiv Detail & Related papers (2022-12-26T20:52:26Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
- Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
- Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts [80.69440684790925]
DeepRelations is a physics-inspired deep relational network with intrinsically explainable architecture.
It shows superior interpretability to the state-of-the-art.
It boosts the AUPRC of contact prediction by 9.5-, 16.9-, 19.3-, and 5.7-fold for the test, compound-unique, protein-unique, and both-unique sets, respectively.
arXiv Detail & Related papers (2019-12-29T00:14:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.