Related papers: GFlowNet Pretraining with Inexpensive Rewards

URL: http://arxiv.org/abs/2409.09702v1
Date: Sun, 15 Sep 2024 11:42:17 GMT
Title: GFlowNet Pretraining with Inexpensive Rewards
Authors: Mohit Pandey, Gopeshh Subbaraj, Emmanuel Bengio
Abstract summary: We introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors. We further our method by implementing a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties.
Score: 2.924067540644439
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative Flow Networks (GFlowNets), a class of generative models, have recently emerged as a suitable framework for generating diverse and high-quality molecular structures by learning from unnormalized reward distributions. Previous works in this direction often restrict exploration by using predefined molecular fragments as building blocks, limiting the chemical space that can be accessed. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors such as drug-likeliness, topological polar surface area, and synthetic accessibility scores. These properties serve as proxy rewards, guiding A-GFNs towards regions of chemical space that exhibit desirable pharmacological properties. We further our method by implementing a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties. In this work, we pretrain A-GFN on the ZINC15 offline dataset and employ robust evaluation metrics to show the effectiveness of our approach when compared to other relevant baseline methods in drug design.
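As a rough illustration of the proxy-reward idea in the abstract, the sketch below combines three descriptor values into a scalar reward and normalizes it into the distribution a trained GFlowNet would ideally sample from, p(x) proportional to R(x). The descriptor values, the desired ranges, and the function names are assumptions made for illustration only; they are not the paper's reward definition.

```python
import math

# Hypothetical precomputed descriptors for three toy candidates:
# name -> (drug-likeness in [0, 1], polar surface area, synthetic accessibility).
# These numbers are made up for illustration, not measured values.
CANDIDATES = {
    "mol_a": (0.80, 70.0, 2.5),
    "mol_b": (0.40, 150.0, 5.0),
    "mol_c": (0.65, 95.0, 3.0),
}


def range_score(value, low, high, width):
    """Return 1.0 inside [low, high] and decay smoothly outside it."""
    if low <= value <= high:
        return 1.0
    dist = low - value if value < low else value - high
    return math.exp(-((dist / width) ** 2))


def proxy_reward(qed, tpsa, sa):
    """Multiply per-descriptor scores; the ranges below are assumed, not from the paper."""
    return qed * range_score(tpsa, 60.0, 100.0, 20.0) * range_score(sa, 1.0, 3.0, 1.0)


def target_distribution(rewards):
    """Normalize unnormalized rewards so that p(x) is proportional to R(x)."""
    total = sum(rewards.values())
    return {name: r / total for name, r in rewards.items()}


rewards = {name: proxy_reward(*desc) for name, desc in CANDIDATES.items()}
probs = target_distribution(rewards)

for name in sorted(probs, key=probs.get, reverse=True):
    print(f"{name}: reward={rewards[name]:.4f} p={probs[name]:.4f}")
```

A multiplicative reward is only one possible design: a candidate that falls far outside any single desired range gets a reward near zero, so it is sampled rarely but not excluded outright.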

Related papers

Combining Graph Neural Networks and Mixed Integer Linear Programming for Molecular Inference under the Two-Layered Model [6.107266553770076]
We develop a molecular inference framework based on mol-infer, namely mol-infer-GNN, that utilizes GNN as the learning method. Our proposed GNN model can obtain satisfying learning performances for some properties despite its simple structure.
arXiv Detail & Related papers (2025-07-05T06:57:37Z)
Learning Hierarchical Interaction for Accurate Molecular Property Prediction [8.488251667425887]
We propose a Hierarchical Interaction Message Passing Mechanism, which serves as the foundation of our novel model, HimNet. Our method enables interaction-aware representation learning across atomic, motif, and molecular levels via hierarchical attention-guided message passing. Our method exhibits promising hierarchical interpretability, aligning well with chemical intuition on representative molecules.
arXiv Detail & Related papers (2025-04-28T15:19:28Z)
Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation [6.495442425890008]
Generative Flow Networks (GFlowNets) have recently emerged as a suitable framework for generating diverse and high-quality molecular structures. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks. We propose an unsupervised pre-training approach using drug-like molecule datasets, which teaches A-GFNs about inexpensive yet informative molecular descriptors.
arXiv Detail & Related papers (2025-03-08T20:41:07Z)
FragFM: Hierarchical Framework for Efficient Molecule Generation via Fragment-Level Discrete Flow Matching [3.0684068038799728]
We introduce FragFM, a novel hierarchical framework via fragment-level discrete flow matching for efficient molecular graph generation. FragFM generates molecules at the fragment level, leveraging a coarse-to-fine autoencoder to reconstruct details at the atom level. We also propose a Natural Product Generation benchmark to evaluate modern molecular graph generative models' ability to generate natural product-like molecules.
arXiv Detail & Related papers (2025-02-19T07:01:00Z)
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization [53.27954325490941]
Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model.
arXiv Detail & Related papers (2025-02-11T04:00:21Z)
GenMol: A Drug Discovery Generalist with Discrete Diffusion [43.29814519270451]
Generalist Molecular generative model (GenMol) is a versatile framework that addresses various aspects of the drug discovery pipeline. Under the discrete diffusion framework, we introduce fragment remasking, a strategy that optimizes molecules by replacing fragments with masked tokens. GenMol significantly outperforms the previous GPT-based model trained on SAFE representations in de novo generation and fragment-constrained generation.
arXiv Detail & Related papers (2025-01-10T18:30:05Z)
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties. It can generate molecules with state-of-the-art binding energies with up to -7.07 Avg. Vina Score.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
TAGMol: Target-Aware Gradient-guided Molecule Generation [19.977071499171903]
3D generative models have shown significant promise in structure-based drug design (SBDD). We decouple the problem into molecular generation and property prediction. The latter synergistically guides the diffusion sampling process, facilitating guided diffusion and resulting in the creation of meaningful molecules with the desired properties. We call this guided molecular generation process TAGMol.
arXiv Detail & Related papers (2024-06-03T14:43:54Z)
DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization [49.85944390503957]
DecompOpt is a structure-based molecular optimization method based on a controllable and decomposed diffusion model. We show that DecompOpt can efficiently generate molecules with improved properties compared to strong de novo baselines.
arXiv Detail & Related papers (2024-03-07T02:53:40Z)
Molecular De Novo Design through Transformer-based Reinforcement Learning [38.803770968809225]
We introduce a method to fine-tune a Transformer-based generative model for molecular de novo design. Our proposed method exhibits superior performance in generating compounds predicted to be active against various biological targets. Our approach can be used for scaffold hopping, library expansion starting from a single molecule, and generating compounds with high predicted activity against biological targets.
arXiv Detail & Related papers (2023-10-09T02:51:01Z)
Leveraging Side Information for Ligand Conformation Generation using Diffusion-Based Approaches [12.71967232020327]
Ligand molecule conformation generation is a critical challenge in drug discovery. Deep learning models have been developed to tackle this problem. These models often generate conformations that lack meaningful structure and randomness due to the absence of essential side information.
arXiv Detail & Related papers (2023-08-02T09:56:47Z)
Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation. We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria. Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE). Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor. We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z)
Target-aware Molecular Graph Generation [37.937378787812264]
We propose SiamFlow, which forces the flow to fit the distribution of target sequence embeddings in latent space. Specifically, we employ an alignment loss and a uniform loss to bring target sequence embeddings and drug graph embeddings into agreement. Experiments quantitatively show that our proposed method learns meaningful representations in the latent space for target-aware molecular graph generation.
arXiv Detail & Related papers (2022-02-10T04:31:14Z)
Reinforced Molecular Optimization with Neighborhood-Controlled Grammars [63.84003497770347]
We propose MNCE-RL, a graph convolutional policy network for molecular optimization. We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation. We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
arXiv Detail & Related papers (2020-11-14T05:42:15Z)
MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have achieved initial success, but still face difficulties in simultaneously optimizing multiple drug properties. We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.