A Reinforcement Learning-Driven Transformer GAN for Molecular Generation
- URL: http://arxiv.org/abs/2503.12796v1
- Date: Mon, 17 Mar 2025 04:06:10 GMT
- Title: A Reinforcement Learning-Driven Transformer GAN for Molecular Generation
- Authors: Chen Li, Huidong Tang, Ye Zhu, Yoshihiro Yamanishi
- Abstract summary: This study introduces RL-MolGAN, a novel Transformer-based discrete GAN framework designed to address the sensitivity of SMILES representations and the difficulty of applying GANs to discrete data. Unlike traditional Transformer architectures, RL-MolGAN utilizes a first-decoder-then-encoder structure, facilitating the generation of drug-like molecules from both de novo and scaffold-based designs. In addition, RL-MolGAN integrates reinforcement learning (RL) and Monte Carlo tree search (MCTS) techniques to enhance the stability of GAN training and optimize the chemical properties of the generated molecules.
- Score: 6.397243531623856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating molecules with desired chemical properties presents a critical challenge in fields such as chemical synthesis and drug discovery. Recent advancements in artificial intelligence (AI) and deep learning have significantly contributed to data-driven molecular generation. However, challenges persist due to the inherent sensitivity of simplified molecular input line entry system (SMILES) representations and the difficulties in applying generative adversarial networks (GANs) to discrete data. This study introduces RL-MolGAN, a novel Transformer-based discrete GAN framework designed to address these challenges. Unlike traditional Transformer architectures, RL-MolGAN utilizes a first-decoder-then-encoder structure, facilitating the generation of drug-like molecules from both de novo and scaffold-based designs. In addition, RL-MolGAN integrates reinforcement learning (RL) and Monte Carlo tree search (MCTS) techniques to enhance the stability of GAN training and optimize the chemical properties of the generated molecules. To further improve the model's performance, RL-MolWGAN, an extension of RL-MolGAN, incorporates Wasserstein distance and mini-batch discrimination, which together enhance the stability of the GAN. Experimental results on two widely used molecular datasets, QM9 and ZINC, validate the effectiveness of our models in generating high-quality molecular structures with diverse and desirable chemical properties.
Related papers
- Improved Molecular Generation through Attribute-Driven Integrative Embeddings and GAN Selectivity [0.0]
This paper introduces a transformer-based vector embedding generator combined with a modified Generative Adversarial Network (GAN) to generate molecules with desired properties.
The embedding generator utilizes a novel molecular descriptor, integrating Morgan fingerprints with global molecular attributes.
The approach is validated by generating novel odorant molecules using a labeled dataset of odorant and non-odorant compounds.
arXiv Detail & Related papers (2025-04-26T22:15:25Z) - Auxiliary Discriminator Sequence Generative Adversarial Networks (ADSeqGAN) for Few Sample Molecule Generation [0.6339750087526286]
Auxiliary Discriminator Sequence Generative Adversarial Networks (ADSeqGAN) is a novel approach for molecular generation in small-sample datasets. Our method incorporates a pretrained generator and Wasserstein distance to enhance training stability and diversity. We have demonstrated the successful application of ADSeqGAN in generating synthetic nucleic acid-targeting and CNS drugs.
arXiv Detail & Related papers (2025-02-23T05:22:53Z) - DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
DiffMS is a formula-restricted encoder-decoder generative network. We develop a robust decoder that bridges latent embeddings and molecular structures. Experiments show DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z) - Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - When Molecular GAN Meets Byte-Pair Encoding [2.5398391570038736]
This study introduces a molecular GAN that integrates a byte-level byte-pair encoding tokenizer and employs reinforcement learning to enhance de novo molecular generation.
Specifically, the generator functions as an actor, producing SMILES strings, while the discriminator acts as a critic, evaluating their quality.
arXiv Detail & Related papers (2024-09-29T15:39:26Z) - Molecular Generative Adversarial Network with Multi-Property Optimization [3.0001188337985236]
Deep generative models, such as generative adversarial networks (GANs), have been employed for de novo molecular generation in drug discovery.
This study introduces a novel GAN based on actor-critic RL with instant and global rewards, called InstGAN, to generate molecules at the token-level with multi-property optimization.
arXiv Detail & Related papers (2024-03-29T08:55:39Z) - Molecular De Novo Design through Transformer-based Reinforcement Learning [38.803770968809225]
We introduce a method to fine-tune a Transformer-based generative model for molecular de novo design.
Our proposed method exhibits superior performance in generating compounds predicted to be active against various biological targets.
Our approach can be used for scaffold hopping, library expansion starting from a single molecule, and generating compounds with high predicted activity against biological targets.
arXiv Detail & Related papers (2023-10-09T02:51:01Z) - Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z) - Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose MOOD, a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z) - Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z) - Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations [21.237758981760784]
We present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules.
Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties.
arXiv Detail & Related papers (2021-06-08T15:49:21Z) - Optimizing Molecules using Efficient Queries from Property Evaluations [66.66290256377376]
We propose QMO, a generic query-based molecule optimization framework.
QMO improves the desired properties of an input molecule based on efficient queries.
We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules.
arXiv Detail & Related papers (2020-11-03T18:51:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.