Reinforced Molecular Optimization with Neighborhood-Controlled Grammars
- URL: http://arxiv.org/abs/2011.07225v1
- Date: Sat, 14 Nov 2020 05:42:15 GMT
- Title: Reinforced Molecular Optimization with Neighborhood-Controlled Grammars
- Authors: Chencheng Xu, Qiao Liu, Minlie Huang, Tao Jiang
- Abstract summary: We propose MNCE-RL, a graph convolutional policy network for molecular optimization.
We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation.
We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
- Score: 63.84003497770347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major challenge in the pharmaceutical industry is to design novel molecules
with specific desired properties, especially when the property evaluation is
costly. Here, we propose MNCE-RL, a graph convolutional policy network for
molecular optimization with molecular neighborhood-controlled embedding
grammars through reinforcement learning. We extend the original
neighborhood-controlled embedding grammars to make them applicable to molecular
graph generation and design an efficient algorithm to infer grammatical
production rules from given molecules. The use of grammars guarantees the
validity of the generated molecular structures. By transforming molecular
graphs to parse trees with the inferred grammars, the molecular structure
generation task is modeled as a Markov decision process where a policy gradient
strategy is utilized. In a series of experiments, we demonstrate that our
approach achieves state-of-the-art performance in a diverse range of molecular
optimization tasks and exhibits significant superiority in optimizing molecular
properties with a limited number of property evaluations.
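
The abstract describes structure generation as a Markov decision process over grammar production rules, trained with a policy gradient. The following is a minimal sketch of that idea under loud assumptions: it uses a toy string grammar and a tabular softmax policy rather than the paper's graph grammars and graph convolutional policy network, and the names `RULES`, `rollout`, `reward`, `train`, `STOP_RULE`, and `MAX_SAMPLED` are hypothetical, not taken from the paper. The reward (counting "O" symbols) is a stand-in for a real property oracle.

```python
import math
import random

# Toy string grammar (illustration only; the paper's grammars rewrite graphs).
# "S" expands to an atom followed by "S", or to a single final atom.
RULES = {"S": [["C", "S"], ["O", "S"], ["C"]]}
STOP_RULE = 2     # index of the rule that ends the derivation
MAX_SAMPLED = 8   # hypothetical cap on sampled rule applications


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(theta, rng):
    """Derive one string; return its atoms and the sampled (state, action) pairs."""
    stack, atoms, trajectory = ["S"], [], []
    while stack:
        symbol = stack.pop()
        if symbol not in RULES:            # terminal symbol: emit it
            atoms.append(symbol)
            continue
        if len(trajectory) < MAX_SAMPLED:  # sample a production rule from the policy
            probs = softmax(theta[symbol])
            action = rng.choices(range(len(probs)), weights=probs)[0]
            trajectory.append((symbol, action))
        else:                              # cap reached: force the terminating rule
            action = STOP_RULE
        stack.extend(reversed(RULES[symbol][action]))
    return atoms, trajectory


def reward(atoms):
    """Toy stand-in for a property evaluation: count 'O' symbols."""
    return float(atoms.count("O"))


def train(episodes=2000, lr=0.02, seed=0):
    """REINFORCE with a running-mean baseline over rule-selection logits."""
    rng = random.Random(seed)
    theta = {nt: [0.0] * len(rhs) for nt, rhs in RULES.items()}
    baseline = 0.0
    for _ in range(episodes):
        atoms, trajectory = rollout(theta, rng)
        ret = reward(atoms)
        advantage = ret - baseline
        # Accumulate grad of log pi(action | state) under the pre-update policy.
        grads = {nt: [0.0] * len(logits) for nt, logits in theta.items()}
        for state, action in trajectory:
            probs = softmax(theta[state])
            for k, p in enumerate(probs):
                grads[state][k] += (1.0 if k == action else 0.0) - p
        for nt in theta:
            for k in range(len(theta[nt])):
                theta[nt][k] += lr * advantage * grads[nt][k]
        baseline += 0.05 * (ret - baseline)
    return theta
```

Because every derivation only applies rules from the grammar and always ends with the terminating rule, each sampled string is well-formed by construction, which mirrors the validity guarantee the abstract attributes to grammar-based generation. Calling `train()` and then `softmax(theta["S"])` shows how the policy shifts probability toward the rule that increases the toy reward.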
Related papers
- Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method.
HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution.
Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z)
- MultiModal-Learning for Predicting Molecular Properties: A Framework Based on Image and Graph Structures [2.5563339057415218]
MolIG is a novel MultiModaL molecular pre-training framework for predicting molecular properties based on Image and Graph structures.
It amalgamates the strengths of both molecular representation forms.
It exhibits enhanced performance in downstream tasks pertaining to molecular property prediction within benchmark groups.
arXiv Detail & Related papers (2023-11-28T10:28:35Z)
- Extracting Molecular Properties from Natural Language with Multimodal Contrastive Learning [1.3717673827807508]
We study how molecular property information can be transferred from natural language to graph representations.
We implement neural relevance scoring strategies to improve text retrieval and introduce a novel chemically valid molecular graph augmentation strategy.
We achieve a +4.26% AUROC gain versus models pre-trained on the graph modality alone, and a +1.54% gain compared to the recently proposed MoMu model, which is trained contrastively on molecular graphs and text.
arXiv Detail & Related papers (2023-07-22T10:32:58Z)
- Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performance on various molecule design tasks.
arXiv Detail & Related papers (2023-06-09T03:04:21Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel "pre-train, prompt, fine-tune" paradigm for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
- Fragment-based Sequential Translation for Molecular Optimization [23.152338167332374]
We propose a flexible editing paradigm that generates molecules using learned molecular fragments.
We use a variational autoencoder to encode molecular fragments in a coherent latent space.
We then utilize this latent space as a vocabulary for editing molecules to explore the complex chemical property space.
arXiv Detail & Related papers (2021-10-26T21:20:54Z)
- Property-aware Adaptive Relation Networks for Molecular Property Prediction [34.13439007658925]
We propose property-aware adaptive relation networks (PAR) for the few-shot molecular property prediction problem.
PAR is compatible with existing graph-based molecular encoders and is further equipped with the ability to obtain property-aware molecular embeddings and to model the molecular relation graph.
arXiv Detail & Related papers (2021-07-16T16:22:30Z)
- Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z)
- Graph Polish: A Novel Graph Generation Paradigm for Molecular Optimization [7.1696593196695035]
We present a novel molecular optimization paradigm, Graph Polish, which changes molecular optimization from the traditional "two-language translating" task into a "single-language" task.
We propose an effective and efficient learning framework T&S polish to capture the long-term dependencies in the optimization steps.
arXiv Detail & Related papers (2020-08-14T08:36:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.