Active Learning-Guided Seq2Seq Variational Autoencoder for Multi-target Inhibitor Generation
- URL: http://arxiv.org/abs/2506.15309v1
- Date: Wed, 18 Jun 2025 09:39:51 GMT
- Title: Active Learning-Guided Seq2Seq Variational Autoencoder for Multi-target Inhibitor Generation
- Authors: Júlia Vilalta-Mor, Alexis Molina, Laura Ortega Varga, Isaac Filella-Merce, Victor Guallar
- Abstract summary: We propose a structured active learning paradigm to balance chemical diversity, molecular quality, and multi-target affinity. Our method alternates between expanding chemically feasible regions of latent space and progressively constraining molecules. We demonstrate that careful timing and strategic placement of chemical filters within this active learning pipeline markedly enhance exploration of beneficial chemical space.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Simultaneously optimizing molecules against multiple therapeutic targets remains a profound challenge in drug discovery, particularly due to sparse rewards and conflicting design constraints. We propose a structured active learning (AL) paradigm integrating a sequence-to-sequence (Seq2Seq) variational autoencoder (VAE) into iterative loops designed to balance chemical diversity, molecular quality, and multi-target affinity. Our method alternates between expanding chemically feasible regions of latent space and progressively constraining molecules based on increasingly stringent multi-target docking thresholds. In a proof-of-concept study targeting three related coronavirus main proteases (SARS-CoV-2, SARS-CoV, MERS-CoV), our approach efficiently generated a structurally diverse set of pan-inhibitor candidates. We demonstrate that careful timing and strategic placement of chemical filters within this active learning pipeline markedly enhance exploration of beneficial chemical space, transforming the sparse-reward, multi-objective drug design problem into an accessible computational task. Our framework thus provides a generalizable roadmap for efficiently navigating complex polypharmacological landscapes.
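The "increasingly stringent multi-target docking thresholds" described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the toy scores, the threshold schedule, and the convention that a more negative docking score is better. The retraining and resampling of the generative model between rounds is omitted.

```python
from typing import Callable, Dict, Iterable, List, Sequence

# Target names come from the abstract; all other names and values are hypothetical.
TARGETS = ("SARS-CoV-2", "SARS-CoV", "MERS-CoV")


def passes_all_targets(scores: Dict[str, float], threshold: float) -> bool:
    """Keep a molecule only if every target's score is at or below the threshold.

    Assumes lower (more negative) docking scores indicate stronger predicted binding.
    """
    return all(scores[target] <= threshold for target in TARGETS)


def tightening_filter(
    pool: Iterable[str],
    score_fn: Callable[[str], Dict[str, float]],
    thresholds: Sequence[float],
) -> List[List[str]]:
    """Apply successively stricter thresholds and record the survivors of each round."""
    survivors_per_round: List[List[str]] = []
    current = list(pool)
    for threshold in thresholds:
        current = [m for m in current if passes_all_targets(score_fn(m), threshold)]
        survivors_per_round.append(list(current))
        # A full active learning loop would fine-tune the generator on `current`
        # and sample new candidates here; that step is left out of this sketch.
    return survivors_per_round


# Toy docking scores (invented values, for illustration only).
toy_scores = {
    "mol_a": {"SARS-CoV-2": -8.5, "SARS-CoV": -8.1, "MERS-CoV": -7.9},
    "mol_b": {"SARS-CoV-2": -9.0, "SARS-CoV": -6.0, "MERS-CoV": -8.8},
    "mol_c": {"SARS-CoV-2": -7.2, "SARS-CoV": -7.1, "MERS-CoV": -7.0},
}

rounds = tightening_filter(toy_scores.keys(), toy_scores.__getitem__, [-6.5, -7.5])
print(rounds)
```

In this toy run, `mol_b` scores best on two targets but is dropped in the first round because it misses the threshold on one, which is the conflicting-constraint situation the abstract refers to.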
Related papers
- MODA: A Unified 3D Diffusion Framework for Multi-Task Target-Aware Molecular Generation [16.07694748790297]
We introduce MODA, a diffusion framework that unifies fragment growing, linker design, scaffold hopping, and side-chain decoration with a Bayesian mask scheduler. During training, a contiguous spatial fragment is masked and then denoised in one pass, enabling the model to learn shared geometric and chemical priors across tasks.
arXiv Detail & Related papers (2025-07-09T18:19:50Z) - Active Learning on Synthons for Molecular Design [0.0]
We introduce Scalable Active Learning via Synthon Acquisition (SALSA), a simple algorithm applicable to multi-vector expansion. SALSA extends pool-based active learning to non-enumerable spaces by factoring modeling and acquisition over synthon or fragment choices. We show that SALSA-generated molecules have comparable chemical property profiles to known bioactives, and exhibit greater diversity and higher scores over an industry-leading generative approach.
arXiv Detail & Related papers (2025-05-19T09:48:02Z) - Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization [51.104444856052204]
We present MultiMol, a collaborative large language model (LLM) system designed to guide multi-objective molecular optimization. In evaluations across six multi-objective optimization tasks, MultiMol significantly outperforms existing methods, achieving an 82.30% success rate.
arXiv Detail & Related papers (2025-03-05T13:47:55Z) - InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization [77.79862482208326]
InversionGNN is an effective yet sample-efficient dual-path graph neural network (GNN) for multi-objective drug discovery. We train the model for multi-property prediction to acquire knowledge of the optimal combination of functional groups. Then the learned chemical knowledge helps the inversion generation path to generate molecules with required properties.
arXiv Detail & Related papers (2025-03-03T12:53:36Z) - Optimized Drug Design using Multi-Objective Evolutionary Algorithms with SELFIES [1.124958340749622]
We deploy multi-objective evolutionary algorithms, namely NSGA-II, NSGA-III, and MOEA/D, for this purpose.
In addition to the QED and SA score, we optimize compounds using the GuacaMol benchmark multi-objective task sets.
arXiv Detail & Related papers (2024-05-01T09:06:30Z) - Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning [54.247560894146105]
Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria.
We propose to use an unsupervised machine learning model known as the Potts model to discover new, useful sequences with controllable sequence diversity.
arXiv Detail & Related papers (2022-08-10T13:30:58Z) - Accelerating Inhibitor Discovery for Multiple SARS-CoV-2 Targets with a Single, Sequence-Guided Deep Generative Framework [47.14853881703749]
We demonstrate the broad utility of a single deep generative framework toward discovering novel drug-like inhibitor molecules.
To perform target-aware design, the framework employs a target sequence-conditioned sampling of novel molecules from a generative model.
The most potent spike RBD inhibitor also emerged as a rare non-covalent antiviral with broad-spectrum activity against several SARS-CoV-2 variants.
arXiv Detail & Related papers (2022-04-19T17:59:46Z) - Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery [4.905176604265767]
Distilled Graph Attention Policy Networks (DGAPNs) generate novel graph-structured chemical representations.
We present a spatial Graph Attention Network (sGAT) that leverages self-attention over both node and edge attributes as well as encoding spatial structure.
In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2021-06-04T00:36:47Z) - Scaffold-constrained molecular generation [0.0]
We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation.
We showcase the method's ability to perform scaffold-constrained generation on various tasks.
arXiv Detail & Related papers (2020-09-15T15:41:18Z) - Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics [109.70543391923344]
CLaSS (Controlled Latent attribute Space Sampling) is an efficient computational method for attribute-controlled generation of molecules.
We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations.
The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency.
arXiv Detail & Related papers (2020-05-22T15:57:58Z) - Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.