MolKD: Distilling Cross-Modal Knowledge in Chemical Reactions for
Molecular Property Prediction
- URL: http://arxiv.org/abs/2305.01912v1
- Date: Wed, 3 May 2023 06:01:03 GMT
- Authors: Liang Zeng, Lanqing Li, Jian Li
- Abstract summary: How to effectively represent molecules is a long-standing challenge for molecular property prediction and drug discovery.
This paper proposes to incorporate chemical domain knowledge, specifically related to chemical reactions, for learning effective molecular representations.
We introduce a novel method, namely MolKD, which Distills cross-modal Knowledge in chemical reactions to assist Molecular property prediction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to effectively represent molecules is a long-standing challenge for
molecular property prediction and drug discovery. This paper studies this
problem and proposes to incorporate chemical domain knowledge, specifically
related to chemical reactions, for learning effective molecular
representations. However, the inherent cross-modality property between chemical
reactions and molecules presents a significant challenge to address. To this
end, we introduce a novel method, namely MolKD, which Distills cross-modal
Knowledge in chemical reactions to assist Molecular property prediction.
Specifically, the reaction-to-molecule distillation model within MolKD
transfers cross-modal knowledge from a pre-trained teacher network learning
with one modality (i.e., reactions) into a student network learning with
another modality (i.e., molecules). Moreover, MolKD learns effective molecular
representations by incorporating reaction yields to measure transformation
efficiency of the reactant-product pair when pre-training on reactions.
Extensive experiments demonstrate that MolKD significantly outperforms various
competitive baseline models, e.g., 2.1% absolute AUC-ROC gain on Tox21. Further
investigations demonstrate that pre-trained molecular representations in MolKD
can distinguish chemically reasonable molecular similarities, which enables
molecular property prediction with high robustness and interpretability.
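The reaction-to-molecule distillation described in the abstract follows the standard teacher-student pattern: a teacher pre-trained on one modality (reactions) produces soft targets that supervise a student operating on another modality (molecules). A minimal sketch of a generic temperature-scaled soft-target distillation loss (this is the textbook formulation, not MolKD's exact objective; all names are illustrative):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 as is conventional so that gradient
    magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # teacher (reaction modality) soft targets
    q = softmax(student_logits, temperature)  # student (molecule modality) predictions
    return float(np.sum(p * (np.log(p) - np.log(q)))) * temperature ** 2
```

A higher temperature flattens the teacher distribution, exposing the relative similarities among non-target classes, which is where most of the transferred "dark knowledge" lives.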
Related papers
- DrugLLM: Open Large Language Model for Few-shot Molecule Generation [20.680942401843772]
DrugLLM learns how to modify molecules in drug discovery by predicting the next molecule based on past modifications.
In computational experiments, DrugLLM can generate new molecules with expected properties based on limited examples.
arXiv Detail & Related papers (2024-05-07T09:18:13Z)
- Atom-Motif Contrastive Transformer for Molecular Property Prediction [68.85399466928976]
Graph Transformer (GT) models have been widely used in the task of Molecular Property Prediction (MPP).
We propose a novel Atom-Motif Contrastive Transformer (AMCT) which explores atom-level interactions and considers motif-level interactions.
Our proposed AMCT is extensively evaluated on seven popular benchmark datasets, and both quantitative and qualitative results firmly demonstrate its effectiveness.
arXiv Detail & Related papers (2023-10-11T10:03:10Z)
- MolCAP: Molecular Chemical reActivity pretraining and prompted-finetuning enhanced molecular representation learning [3.179128580341411]
MolCAP is a graph pretraining Transformer based on chemical reactivity (IMR) knowledge with prompted finetuning.
Prompted by MolCAP, even basic graph neural networks are capable of achieving surprising performance that outperforms previous models.
arXiv Detail & Related papers (2023-06-13T13:48:06Z)
- Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performance on various molecule design tasks.
arXiv Detail & Related papers (2023-06-09T03:04:21Z)
- Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z)
- Materials Discovery with Extreme Properties via Reinforcement Learning-Guided Combinatorial Chemistry [0.23301643766310373]
A rule-based molecular designer is driven by a trained policy that selects subsequent molecular fragments to build a target molecule.
In an experiment aimed at discovering molecules that hit seven extreme target properties, our model discovered 1,315 of all target-hitting molecules.
It has been confirmed that every molecule generated under the binding rules of molecular fragments is 100% chemically valid.
arXiv Detail & Related papers (2023-03-21T13:21:43Z)
- Improving Molecular Pretraining with Complementary Featurizations [20.86159731100242]
Molecular pretraining is a paradigm to solve a variety of tasks in computational chemistry and drug discovery.
We show that different featurization techniques convey chemical information differently.
We propose a simple and effective MOlecular pretraining framework with COmplementary featurizations (MOCO).
arXiv Detail & Related papers (2022-09-29T21:11:09Z)
- A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
- Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose a score-based diffusion scheme that incorporates out-of-distribution control into the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z)
- Chemical-Reaction-Aware Molecule Representation Learning [88.79052749877334]
We propose using chemical reactions to assist learning molecule representation.
Our approach is proven effective in 1) keeping the embedding space well-organized and 2) improving the generalization ability of molecule embeddings.
Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks.
arXiv Detail & Related papers (2021-09-21T00:08:43Z)
- Property-aware Adaptive Relation Networks for Molecular Property Prediction [34.13439007658925]
We propose a property-aware adaptive relation network (PAR) for the few-shot molecular property prediction problem.
Our PAR is compatible with existing graph-based molecular encoders and is further equipped with the ability to obtain property-aware molecular embeddings and model a molecular relation graph.
arXiv Detail & Related papers (2021-07-16T16:22:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.