Otter-Knowledge: benchmarks of multimodal knowledge graph representation
learning from different sources for drug discovery
- URL: http://arxiv.org/abs/2306.12802v3
- Date: Thu, 19 Oct 2023 18:15:57 GMT
- Title: Otter-Knowledge: benchmarks of multimodal knowledge graph representation
learning from different sources for drug discovery
- Authors: Hoang Thanh Lam, Marco Luca Sbodio, Marcos Martínez Galindo,
Mykhaylo Zayats, Raúl Fernández-Díaz, Víctor Valls, Gabriele Picco,
Cesar Berrospi Ramis, Vanessa López
- Abstract summary: We release a set of multimodal knowledge graphs integrating data from seven public data sources and containing over 30 million triples.
Our intention is to foster additional research into how multimodal knowledge-enhanced protein/molecule embeddings can improve prediction tasks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research on predicting the binding affinity between drug molecules and
proteins uses representations learned, through unsupervised learning techniques,
from large databases of molecule SMILES and protein sequences. While these
representations have significantly enhanced the predictions, they are usually
based on a limited set of modalities, and they do not exploit available
knowledge about existing relations among molecules and proteins. In this study,
we demonstrate that by incorporating knowledge graphs from diverse sources and
modalities into the sequence or SMILES representation, we can further enrich
the representation and achieve state-of-the-art results for drug-target binding
affinity prediction on the established Therapeutic Data Commons (TDC)
benchmarks. We release a set of multimodal knowledge graphs, integrating data
from seven public data sources, and containing over 30 million triples. Our
intention is to foster additional research into how multimodal knowledge-enhanced
protein/molecule embeddings can improve prediction tasks, including
prediction of binding affinity. We also release pretrained models learned
from our multimodal knowledge graphs, along with source code for running
standard benchmark tasks for prediction of binding affinity.
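The core idea of the abstract — enriching sequence/SMILES-derived embeddings with knowledge-graph embeddings before a downstream affinity regressor — can be sketched as follows. This is a minimal illustration, not the Otter-Knowledge API: the encoder functions, dimensions, SMILES/sequence strings, and affinity values are all made-up placeholders standing in for real pretrained models and TDC data.

```python
import numpy as np

# Illustrative stand-ins for pretrained encoders. In the paper's setting, the
# molecule/protein embeddings come from unsupervised models over SMILES and
# sequences, and the KG embeddings from multimodal knowledge graphs; here we
# just derive deterministic random vectors from the input string.
def _seeded_vector(key: str, dim: int) -> np.ndarray:
    local = np.random.default_rng(abs(hash(key)) % (2**32))
    return local.standard_normal(dim)

def embed_smiles(smiles: str, dim: int = 8) -> np.ndarray:
    """Placeholder molecule encoder (sequence-based modality)."""
    return _seeded_vector("mol:" + smiles, dim)

def embed_protein(seq: str, dim: int = 8) -> np.ndarray:
    """Placeholder protein encoder (sequence-based modality)."""
    return _seeded_vector("prot:" + seq, dim)

def kg_embedding(entity: str, dim: int = 4) -> np.ndarray:
    """Placeholder knowledge-graph embedding for a molecule/protein entity."""
    return _seeded_vector("kg:" + entity, dim)

def drug_target_features(smiles: str, seq: str) -> np.ndarray:
    """Fuse the sequence-based and KG-based views of both entities."""
    return np.concatenate([
        embed_smiles(smiles), kg_embedding(smiles),
        embed_protein(seq), kg_embedding(seq),
    ])

# A ridge regressor over the fused features as a minimal affinity predictor.
pairs = [("CCO", "MKTAYIAK"), ("c1ccccc1", "MKTAYIAK"), ("CCN", "GAVLIP")]
X = np.stack([drug_target_features(s, p) for s, p in pairs])
y = np.array([5.2, 6.1, 4.8])  # made-up binding affinities (e.g. pKd)
lam = 1e-2  # ridge regularization strength
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
preds = X @ w
```

The design point is only that the KG embedding enters as an extra concatenated view of each entity, so any downstream predictor can consume knowledge-enhanced features without changing its architecture.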
Related papers
- Representation-Enhanced Neural Knowledge Integration with Application to Large-Scale Medical Ontology Learning [3.010503480024405]
We propose a theoretically guaranteed statistical framework, called RENKI, to enable simultaneous learning of relation types.
The proposed framework incorporates representation learning output into initial entity embedding of a neural network that approximates the score function for the knowledge graph.
We demonstrate the effect of weighting in the presence of heterogeneous relations and the benefit of incorporating representation learning in nonparametric models.
arXiv Detail & Related papers (2024-10-09T21:38:48Z)
- MKDTI: Predicting drug-target interactions via multiple kernel fusion on graph attention network [37.40418564922425]
We formulate a model called MKDTI by extracting kernel information from various layer embeddings of a graph attention network.
We use a Dual Laplacian Regularized Least Squares framework to forecast novel drug-target entity connections.
arXiv Detail & Related papers (2024-07-14T02:53:25Z)
- Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity).
We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks.
Finally, we assess techniques for explaining model predictions, which is especially important for critical decision-making in drug discovery.
arXiv Detail & Related papers (2023-08-02T22:18:41Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
- Multi-modal Graph Learning for Disease Prediction [35.4310911850558]
We propose an end-to-end Multimodal Graph Learning framework (MMGL) for disease prediction.
Instead of defining the adjacency matrix manually as existing methods do, the latent graph structure is captured through a novel form of adaptive graph learning.
arXiv Detail & Related papers (2021-07-01T03:59:22Z)
- Exploring the Limits of Few-Shot Link Prediction in Knowledge Graphs [49.6661602019124]
We study a spectrum of models derived by generalizing the current state of the art for few-shot link prediction.
We find that a simple zero-shot baseline - which ignores any relation-specific information - achieves surprisingly strong performance.
Experiments on carefully crafted synthetic datasets show that having only a few examples of a relation fundamentally limits models from using fine-grained structural information.
arXiv Detail & Related papers (2021-02-05T21:04:31Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
- Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts [80.69440684790925]
DeepRelations is a physics-inspired deep relational network with intrinsically explainable architecture.
It shows superior interpretability to the state-of-the-art.
It boosts the AUPRC of contact prediction by 9.5-, 16.9-, 19.3-, and 5.7-fold on the test, compound-unique, protein-unique, and both-unique sets, respectively.
arXiv Detail & Related papers (2019-12-29T00:14:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.