Realistic molecule optimization on a learned graph manifold
- URL: http://arxiv.org/abs/2106.13318v1
- Date: Thu, 3 Jun 2021 07:39:35 GMT
- Title: Realistic molecule optimization on a learned graph manifold
- Authors: Rémy Brossard, Oriel Frigo, David Dehaene
- Abstract summary: We show that learned realism sampling produces empirically more realistic molecules and outperforms all recent baselines in the task of molecule optimization with similarity constraints.
In this work we use a hybrid approach, where the dataset distribution is learned using an autoregressive model while the score optimization is done using the Metropolis algorithm.
- Score: 4.640835690336652
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning based molecular graph generation and optimization has recently
been attracting attention due to its great potential for de novo drug design.
On the one hand, recent models are able to efficiently learn a given graph
distribution, and many approaches have proven very effective to produce a
molecule that maximizes a given score. On the other hand, it was shown by
previous studies that generated optimized molecules are often unrealistic, even
with the inclusion of mechanisms to enforce similarity to a dataset of real drug
molecules. In this work we use a hybrid approach, where the dataset
distribution is learned using an autoregressive model while the score
optimization is done using the Metropolis algorithm, biased toward the learned
distribution. We show that the resulting method, which we call learned realism
sampling (LRS), produces empirically more realistic molecules and outperforms
all recent baselines in the task of molecule optimization with similarity
constraints.
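The hybrid idea in the abstract (a learned distribution combined with Metropolis score optimization) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the target density `exp(log_prior(x) + score(x) / temperature)`, the symmetric ring proposal, the ten integer states standing in for molecules, and all function names. The abstract does not specify the authors' actual target, proposal, or hyperparameters.

```python
import math
import random


def metropolis_biased(log_prior, score, neighbors, x0, n_steps,
                      temperature=1.0, seed=0):
    """Sample from pi(x) proportional to exp(log_prior(x) + score(x) / temperature).

    `neighbors(x)` must define a symmetric proposal: every state has the same
    number of neighbors, and y is a neighbor of x exactly when x is one of y.
    """
    rng = random.Random(seed)

    def log_target(x):
        return log_prior(x) + score(x) / temperature

    x, log_px = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        y = rng.choice(neighbors(x))  # propose a local move
        log_py = log_target(y)
        # Metropolis rule: accept with probability min(1, pi(y) / pi(x)).
        if log_py >= log_px or rng.random() < math.exp(log_py - log_px):
            x, log_px = y, log_py
        samples.append(x)
    return samples


# Toy stand-ins (all hypothetical): ten states on a ring, 0-4 are "realistic".
N_STATES = 10


def ring_neighbors(x):
    return [(x - 1) % N_STATES, (x + 1) % N_STATES]


def log_prior_toy(x):
    # Plays the role of the learned dataset likelihood.
    return math.log(0.18 if x < 5 else 0.02)


def log_uniform(x):
    # No realism bias: every state is equally likely a priori.
    return math.log(1.0 / N_STATES)


def score_toy(x):
    # Score that is highest on the "unrealistic" states.
    return 0.3 * x


def fraction_realistic(samples):
    return sum(1 for s in samples if s < 5) / len(samples)


biased = metropolis_biased(log_prior_toy, score_toy, ring_neighbors, 0, 50_000)
unbiased = metropolis_biased(log_uniform, score_toy, ring_neighbors, 0, 50_000)
print(fraction_realistic(biased), fraction_realistic(unbiased))
```

With these toy numbers the exact stationary mass on states 0-4 is roughly 0.67 when the prior is included and roughly 0.18 with the uniform prior, so the sampled fractions should land near those values. The score alone pulls the chain toward the unrealistic states, and the prior term pulls it back, which is the qualitative effect the abstract describes.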
Related papers
- Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model [77.50732023411811]
We propose a text-guided multi-property molecular optimization method utilizing a transformer-based diffusion language model (TransDLM).
TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions.
Our approach surpasses state-of-the-art methods in optimizing molecular structural similarity and enhancing chemical properties on the benchmark dataset.
arXiv Detail & Related papers (2024-10-17T14:30:27Z)
- MING: A Functional Approach to Learning Molecular Generative Models [46.189683355768736]
This paper introduces a novel paradigm for learning molecule generative models based on functional representations.
We propose Molecular Implicit Neural Generation (MING), a diffusion-based model that learns molecular distributions in function space.
arXiv Detail & Related papers (2024-10-16T13:02:02Z)
- Diversity-Aware Reinforcement Learning for de novo Drug Design [2.356290293311623]
Fine-tuning a pre-trained generative model has demonstrated good performance in generating promising drug molecules.
No study has examined how different adaptive update mechanisms for the reward function influence the diversity of generated molecules.
Our experiments reveal that combining structure- and prediction-based methods generally yields better results in terms of molecular diversity.
arXiv Detail & Related papers (2024-10-14T12:25:23Z)
- Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method.
HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution.
Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
- Differentiable Scaffolding Tree for Molecular Optimization [47.447362691543304]
We propose differentiable scaffolding tree (DST) that utilizes a learned knowledge network to convert discrete chemical structures to locally differentiable ones.
Our empirical studies show the gradient-based molecular optimizations are both effective and sample efficient.
arXiv Detail & Related papers (2021-09-22T01:16:22Z)
- Molecule Optimization via Fragment-based Generative Models [21.888942129750124]
In drug discovery, molecule optimization is an important step in order to modify drug candidates into better ones in terms of desired drug properties.
We present an innovative in silico approach to computationally optimizing molecules and formulate the problem as generating optimized molecular graphs.
Our generative models follow the key idea of fragment-based drug design, and optimize molecules by modifying their small fragments.
arXiv Detail & Related papers (2020-12-08T05:52:16Z)
- Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z)
- MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have had initial success but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z)