Improved Molecular Generation through Attribute-Driven Integrative Embeddings and GAN Selectivity
- URL: http://arxiv.org/abs/2504.19040v1
- Date: Sat, 26 Apr 2025 22:15:25 GMT
- Title: Improved Molecular Generation through Attribute-Driven Integrative Embeddings and GAN Selectivity
- Authors: Nandan Joshi, Erhan Guven
- Abstract summary: This paper introduces a transformer-based vector embedding generator combined with a modified Generative Adversarial Network (GAN) to generate molecules with desired properties. The embedding generator utilizes a novel molecular descriptor, integrating Morgan fingerprints with global molecular attributes. The approach is validated by generating novel odorant molecules using a labeled dataset of odorant and non-odorant compounds.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growing demand for molecules with tailored properties in fields such as drug discovery and chemical engineering has driven advancements in computational methods for molecular design. Machine learning-based approaches for de-novo molecular generation have recently garnered significant attention. This paper introduces a transformer-based vector embedding generator combined with a modified Generative Adversarial Network (GAN) to generate molecules with desired properties. The embedding generator utilizes a novel molecular descriptor, integrating Morgan fingerprints with global molecular attributes, enabling the transformer to capture local functional groups and broader molecular characteristics. Modifying the GAN generator loss function ensures the generation of molecules with specific desired properties. The transformer achieves a reconversion accuracy of 94% while translating molecular descriptors back to SMILES strings, validating the utility of the proposed embeddings for generative tasks. The approach is validated by generating novel odorant molecules using a labeled dataset of odorant and non-odorant compounds. With the modified range-loss function, the GAN exclusively generates odorant molecules. This work underscores the potential of combining novel vector embeddings with transformers and modified GAN architectures to accelerate the discovery of tailored molecules, offering a robust tool for diverse molecular design applications.
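The abstract does not state the form of the modified range-loss function. As a rough illustration only, the sketch below assumes a hinge-style penalty that is zero while a predicted property score lies inside a target range and grows linearly outside it, added to a standard non-saturating generator loss. The names `range_penalty`, `generator_loss`, and `weight`, and the example numbers, are hypothetical and are not taken from the paper.

```python
import math


def range_penalty(score, low, high):
    """Assumed penalty: 0 inside [low, high], linear distance outside."""
    if score < low:
        return low - score
    if score > high:
        return score - high
    return 0.0


def generator_loss(disc_probs, property_scores, low, high, weight=1.0):
    """Non-saturating GAN generator loss plus a mean range penalty.

    disc_probs: discriminator probabilities that each generated sample is real.
    property_scores: predicted property value for each generated sample.
    """
    eps = 1e-12  # guards against log(0)
    adversarial = -sum(math.log(p + eps) for p in disc_probs) / len(disc_probs)
    penalty = sum(range_penalty(s, low, high) for s in property_scores)
    penalty /= len(property_scores)
    return adversarial + weight * penalty


# Samples whose scores fall inside the target range incur no extra penalty.
in_range = generator_loss([0.9, 0.8], [0.45, 0.55], low=0.4, high=0.6)
out_of_range = generator_loss([0.9, 0.8], [0.10, 0.95], low=0.4, high=0.6)
print(in_range < out_of_range)
```

Under this assumption, the generator is pushed toward samples whose predicted property stays within the chosen range, which is one plausible reading of how a range term could steer a GAN toward a single class such as odorants.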
Related papers
- A Reinforcement Learning-Driven Transformer GAN for Molecular Generation [6.397243531623856]
This study introduces RL-MolGAN, a novel Transformer-based discrete GAN framework designed to address these challenges. Unlike a traditional Transformer, RL-MolGAN utilizes a first-decoder-then-encoder structure, facilitating the generation of drug-like molecules from both de novo and scaffold-based designs. In addition, RL-MolGAN integrates reinforcement learning (RL) and Monte Carlo tree search (MCTS) techniques to enhance the stability of GAN training and optimize the chemical properties of the generated molecules.
arXiv Detail & Related papers (2025-03-17T04:06:10Z) - De Novo Generation of Hit-like Molecules from Gene Expression Profiles via Deep Learning [3.9518122220368905]
We propose a hybrid neural network, HNN2Mol, to generate new molecules with potential bioactivities and drug-like properties. Experimental results and case studies demonstrate that the proposed HNN2Mol model can produce new molecules with potential bioactivities and drug-like properties.
arXiv Detail & Related papers (2024-12-27T03:16:56Z) - Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model [77.50732023411811]
We propose a text-guided multi-property molecular optimization method utilizing a transformer-based diffusion language model (TransDLM).
TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions.
Our approach surpasses state-of-the-art methods in optimizing molecular structural similarity and enhancing chemical properties on the benchmark dataset.
arXiv Detail & Related papers (2024-10-17T14:30:27Z) - When Molecular GAN Meets Byte-Pair Encoding [2.5398391570038736]
This study introduces a molecular GAN that integrates a byte-level byte-pair encoding tokenizer and employs reinforcement learning to enhance de novo molecular generation.
Specifically, the generator functions as an actor, producing SMILES strings, while the discriminator acts as a critic, evaluating their quality.
arXiv Detail & Related papers (2024-09-29T15:39:26Z) - Using GNN property predictors as molecule generators [16.34646723046073]
Graph neural networks (GNNs) have emerged as powerful tools to accurately predict materials and molecular properties.
In this article, we exploit the invertible nature of these neural networks to directly generate molecular structures with desired electronic properties.
arXiv Detail & Related papers (2024-06-05T13:53:47Z) - Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z) - Molecule Design by Latent Space Energy-Based Modeling and Gradual
Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performances on various molecule design tasks.
arXiv Detail & Related papers (2023-06-09T03:04:21Z) - MUDiff: Unified Diffusion for Complete Molecule Generation [104.7021929437504]
We present a new model for generating a comprehensive representation of molecules, including atom features, 2D discrete molecule structures, and 3D continuous molecule coordinates.
We propose a novel graph transformer architecture to denoise the diffusion process.
Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
arXiv Detail & Related papers (2023-04-28T04:25:57Z) - Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z) - Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z) - Geometric Transformer for End-to-End Molecule Properties Prediction [92.28929858529679]
We introduce a Transformer-based architecture for molecule property prediction, which is able to capture the geometry of the molecule.
We modify the classical positional encoder by an initial encoding of the molecule geometry, as well as a learned gated self-attention mechanism.
arXiv Detail & Related papers (2021-10-26T14:14:40Z) - Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations [21.237758981760784]
We present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules.
Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties.
arXiv Detail & Related papers (2021-06-08T15:49:21Z)