Related papers: SOLD: SELFIES-based Objective-driven Latent Diffusion

URL: http://arxiv.org/abs/2509.25198v1
Date: Wed, 03 Sep 2025 18:10:23 GMT
Title: SOLD: SELFIES-based Objective-driven Latent Diffusion
Authors: Elbert Ho
Abstract summary: We propose a novel latent diffusion model that generates molecules in a latent space derived from 1D SELFIES strings and conditioned on a target protein. Our model generates high-affinity molecules for the target protein in a simple and efficient way, while also leaving room for future improvements through the addition of more data.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, machine learning has made a significant impact on de novo drug design. However, current approaches to creating novel molecules conditioned on a target protein typically rely on generating molecules directly in the 3D conformational space, which is often slow and overly complex. In this work, we propose SOLD (SELFIES-based Objective-driven Latent Diffusion), a novel latent diffusion model that generates molecules in a latent space derived from 1D SELFIES strings and conditioned on a target protein. In the process, we also train an innovative SELFIES transformer and propose a new way to balance losses when training multi-task machine learning models. Our model generates high-affinity molecules for the target protein in a simple and efficient way, while also leaving room for future improvements through the addition of more data.
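
For readers unfamiliar with latent diffusion, the following is a minimal sketch of the generic forward (noising) step applied to a latent vector, as in standard DDPM-style models. It is not taken from the SOLD paper: the linear noise schedule, the step count, the function names, and the 4-dimensional latent are all assumptions chosen for illustration, and the protein conditioning and SELFIES encoder described in the abstract are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t from N(sqrt(alpha_bar_t) * z0, (1 - alpha_bar_t) * I).

    Returns the noised latent and the Gaussian noise that was added,
    which is what a denoising network would be trained to predict.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [signal * z + sigma * e for z, e in zip(z0, eps)]
    return zt, eps


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Hypothetical latent; in a real model this would come from a molecule encoder.
z0 = [0.5, -1.2, 0.3, 0.9]

z_early, eps_early = noise_latent(z0, 0, alpha_bars, rng)    # almost unchanged
z_late, eps_late = noise_latent(z0, 999, alpha_bars, rng)    # almost pure noise
```

At the first step the latent is nearly intact, while at the last step the signal coefficient is close to zero, so the sample is approximately standard Gaussian noise. A generative model of this family learns to reverse that process, starting from noise and denoising toward a latent that can be decoded into a molecule string.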

Related papers

Guiding Diffusion Models with Reinforcement Learning for Stable Molecule Generation [16.01877423456416]
Reinforcement Learning with Physical Feedback (RLPF) is a novel framework that extends Denoising Diffusion Policy Optimization to 3D molecular generation. RLPF introduces reward functions derived from force-field evaluations to guide the generation toward energetically stable and physically meaningful structures. Experiments on the QM9 and GEOM-drug datasets demonstrate that RLPF significantly improves molecular stability compared to existing methods.
arXiv Detail & Related papers (2025-08-22T16:44:55Z)
Molecule Generation for Target Protein Binding with Hierarchical Consistency Diffusion Model [17.885767456439215]
Atom-Motif Consistency Diffusion Model (AMDiff) is a hierarchical diffusion architecture that integrates both atom- and motif-level views of molecules. Compared to existing approaches, AMDiff exhibits superior validity and novelty in generating molecules tailored to fit various protein pockets.
arXiv Detail & Related papers (2025-03-02T17:54:30Z)
Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models. It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties. It generates molecules with state-of-the-art binding energies, reaching an Avg. Vina Score of up to -7.07.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models [55.5427001668863]
We present a novel latent diffusion model dubbed LDMol for text-conditioned molecule generation. Experiments show that LDMol outperforms the existing autoregressive baselines on the text-to-molecule generation benchmark. We show that LDMol can be applied to downstream tasks such as molecule-to-text retrieval and text-guided molecule editing.
arXiv Detail & Related papers (2024-05-28T04:59:13Z)
AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design [16.946648071157618]
We propose a diffusion-based fragment-wise autoregressive generation model for structure-based drug design (SBDD). We first design a novel molecule assembly strategy named conformal motif that preserves the conformation of local structures of molecules. We then encode the interaction of the protein-ligand complex with an SE(3)-equivariant convolutional network and generate molecules motif-by-motif with diffusion modeling.
arXiv Detail & Related papers (2024-04-02T14:44:02Z)
Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task. We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z)
DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structure-based drug design methods treat all ligand atoms equally. We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold. Our approach achieves state-of-the-art performance in generating high-affinity molecules.
arXiv Detail & Related papers (2024-02-26T05:21:21Z)
Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation [1.3124513975412255]
Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery. We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas. We present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets.
arXiv Detail & Related papers (2023-09-29T14:53:05Z)
SILVR: Guided Diffusion for Molecule Generation [0.0]
We introduce a machine-learning method for conditioning an existing generative model without retraining. The model allows the generation of new molecules that fit into a binding site of a protein based on fragment hits. We show that moderate SILVR rates make it possible to generate new molecules of similar shape to the original fragments.
arXiv Detail & Related papers (2023-04-21T11:47:38Z)
Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation. We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria. Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.