GFlowNet Pretraining with Inexpensive Rewards
- URL: http://arxiv.org/abs/2409.09702v1
- Date: Sun, 15 Sep 2024 11:42:17 GMT
- Title: GFlowNet Pretraining with Inexpensive Rewards
- Authors: Mohit Pandey, Gopeshh Subbaraj, Emmanuel Bengio,
- Abstract summary: We introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively.
We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors.
We extend our method with a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties.
- Score: 2.924067540644439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Flow Networks (GFlowNets), a class of generative models, have recently emerged as a suitable framework for generating diverse and high-quality molecular structures by learning from unnormalized reward distributions. Previous works in this direction often restrict exploration by using predefined molecular fragments as building blocks, limiting the chemical space that can be accessed. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors such as drug-likeness, topological polar surface area, and synthetic accessibility scores. These properties serve as proxy rewards, guiding A-GFNs towards regions of chemical space that exhibit desirable pharmacological properties. We extend our method with a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties. We pretrain A-GFN on the ZINC15 offline dataset and use robust evaluation metrics to show the effectiveness of our approach compared with relevant baseline methods in drug design.
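The abstract describes conditioning on inexpensive descriptors that act as proxy rewards. The following is a minimal sketch of one way such a goal-conditioned proxy reward could be assembled, not the paper's actual formulation. The `Goal` class, the `range_reward` and `proxy_reward` functions, the decay shape, and every numeric range are assumptions introduced for illustration; descriptor values are passed in as plain numbers rather than computed from molecules.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Goal:
    """Desired range [low, high] for one descriptor (hypothetical structure)."""
    low: float
    high: float
    scale: float  # controls how quickly the reward decays outside the range


def range_reward(value: float, goal: Goal) -> float:
    """Return 1.0 inside the goal range, decaying smoothly towards 0 outside it."""
    if goal.low <= value <= goal.high:
        return 1.0
    distance = goal.low - value if value < goal.low else value - goal.high
    return 1.0 / (1.0 + (distance / goal.scale) ** 2)


def proxy_reward(descriptors: dict[str, float], goals: dict[str, Goal]) -> float:
    """Multiply per-descriptor rewards so that every goal must be met for a high score."""
    return prod(range_reward(descriptors[name], goal) for name, goal in goals.items())


# Illustrative goal ranges only; these numbers are NOT taken from the paper.
goals = {
    "qed": Goal(low=0.6, high=1.0, scale=0.1),      # drug-likeness
    "tpsa": Goal(low=60.0, high=100.0, scale=20.0),  # topological polar surface area
    "sa": Goal(low=1.0, high=3.0, scale=1.0),        # synthetic accessibility
}

in_range = {"qed": 0.8, "tpsa": 75.0, "sa": 2.5}
off_target = {"qed": 0.8, "tpsa": 140.0, "sa": 2.5}

print(proxy_reward(in_range, goals))    # all goals met
print(proxy_reward(off_target, goals))  # TPSA is outside its goal range
```

A product is used here so that missing any single goal lowers the overall reward; a weighted sum would be an equally plausible choice under different assumptions.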
Related papers
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties.
It can generate molecules with state-of-the-art binding energies, reaching an average Vina score of up to -7.07.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
- TAGMol: Target-Aware Gradient-guided Molecule Generation [19.977071499171903]
3D generative models have shown significant promise in structure-based drug design (SBDD).
We decouple the problem into molecular generation and property prediction.
The latter synergistically guides the diffusion sampling process, facilitating guided diffusion and resulting in the creation of meaningful molecules with the desired properties.
We call this guided molecular generation process TAGMol.
arXiv Detail & Related papers (2024-06-03T14:43:54Z)
- DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization [49.85944390503957]
DecompOpt is a structure-based molecular optimization method based on a controllable and decomposed diffusion model.
We show that DecompOpt can efficiently generate molecules with better properties than strong de novo baselines.
arXiv Detail & Related papers (2024-03-07T02:53:40Z)
- Molecular De Novo Design through Transformer-based Reinforcement Learning [38.803770968809225]
We introduce a method to fine-tune a Transformer-based generative model for molecular de novo design.
Our proposed method exhibits superior performance in generating compounds predicted to be active against various biological targets.
Our approach can be used for scaffold hopping, library expansion starting from a single molecule, and generating compounds with high predicted activity against biological targets.
arXiv Detail & Related papers (2023-10-09T02:51:01Z)
- Leveraging Side Information for Ligand Conformation Generation using Diffusion-Based Approaches [12.71967232020327]
Ligand molecule conformation generation is a critical challenge in drug discovery.
Deep learning models have been developed to tackle this problem.
These models often generate conformations that lack meaningful structure and randomness due to the absence of essential side information.
arXiv Detail & Related papers (2023-08-02T09:56:47Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose MOOD, a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z)
- Target-aware Molecular Graph Generation [37.937378787812264]
We propose SiamFlow, which forces the flow to fit the distribution of target sequence embeddings in latent space.
Specifically, we employ an alignment loss and a uniform loss to bring target sequence embeddings and drug graph embeddings into agreement.
Experiments quantitatively show that our proposed method learns meaningful representations in the latent space for target-aware molecular graph generation.
arXiv Detail & Related papers (2022-02-10T04:31:14Z)
- Reinforced Molecular Optimization with Neighborhood-Controlled Grammars [63.84003497770347]
We propose MNCE-RL, a graph convolutional policy network for molecular optimization.
We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation.
We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
arXiv Detail & Related papers (2020-11-14T05:42:15Z)
- MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have had initial success, but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z)