Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
- URL: http://arxiv.org/abs/2410.08134v1
- Date: Thu, 10 Oct 2024 17:18:30 GMT
- Title: Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
- Authors: Jarrid Rector-Brooks, Mohsin Hasan, Zhangzhi Peng, Zachary Quinn, Chenghao Liu, Sarthak Mittal, Nouha Dziri, Michael Bronstein, Yoshua Bengio, Pranam Chatterjee, Alexander Tong, Avishek Joey Bose
- Abstract summary: We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
- Score: 88.65168366064061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative modeling of discrete data underlies important applications spanning from text-based agents like ChatGPT to the design of the very building blocks of life in protein sequences. However, application domains need to exert control over the generated data by steering the generative process - typically via RLHF - to satisfy a specified property, reward, or affinity metric. In this paper, we study the problem of steering Masked Diffusion Models (MDMs), a recent class of discrete diffusion models that offer a compelling alternative to traditional autoregressive models. We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference by learning to sample from a target Bayesian posterior. Our DDPP framework leads to a family of three novel objectives that are all simulation-free, and thus scalable while applying to general non-differentiable reward functions. Empirically, we instantiate DDPP by steering MDMs to perform class-conditional pixel-level image modeling, RLHF-based alignment of MDMs using text-based rewards, and finetuning protein language models to generate more diverse secondary structures and shorter proteins. We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
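Concretely, "learning to sample from a target Bayesian posterior" here means tilting the pre-trained model by the reward. One standard form of such a reward-tilted target is sketched below; the paper's exact temperature and normalization conventions may differ.

```latex
% Reward-tilted posterior over clean sequences x, for a pre-trained MDM
% p_theta, reward r, and temperature beta (a generic sketch; the paper's
% exact conventions may differ).
\[
  \pi^{*}(x) \;=\; \frac{p_{\theta}(x)\,\exp\!\bigl(r(x)/\beta\bigr)}{Z},
  \qquad
  Z \;=\; \sum_{x'} p_{\theta}(x')\,\exp\!\bigl(r(x')/\beta\bigr).
\]
```

Since Z is intractable over sequence spaces, objectives of the simulation-free kind the abstract describes must avoid estimating it directly.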
Related papers
- Plug-and-Play Controllable Generation for Discrete Masked Models [27.416952690340903]
This article introduces controllability into discrete masked models for the generative modeling of discrete data.
We propose a novel plug-and-play framework based on importance sampling that bypasses the need for training a conditional score.
Our framework is agnostic to the choice of control criteria, requires no gradient information, and is well-suited for tasks such as posterior sampling, Bayesian inverse problems, and constrained generation.
arXiv Detail & Related papers (2024-10-03T02:00:40Z)
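The abstract above does not spell out the sampler, but a minimal self-normalized importance-sampling sketch of the plug-and-play idea could look as follows; `sample_unconditional` (a draw from the pre-trained masked model) and `control_score` (the control criterion) are hypothetical stand-ins, not the paper's API.

```python
import numpy as np

def steer_by_importance_sampling(sample_unconditional, control_score,
                                 n_candidates=256, temperature=1.0, seed=0):
    """Gradient-free steering of a pre-trained masked model.

    Draw candidates from the unconditional model, weight each one by how
    well it satisfies the control criterion, then resample in proportion
    to the self-normalized importance weights.
    """
    rng = np.random.default_rng(seed)
    candidates = [sample_unconditional() for _ in range(n_candidates)]
    scores = np.array([control_score(x) for x in candidates])
    # Exponentiated scores act as unnormalized importance weights;
    # no gradients of control_score are ever required.
    weights = np.exp((scores - scores.max()) / temperature)  # stabilized
    weights /= weights.sum()
    return candidates[rng.choice(n_candidates, p=weights)]
```

Self-normalization makes the procedure indifferent to unknown normalizing constants, which is what lets it remain gradient-free and plug-and-play.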
- Aggregation of Multi Diffusion Models for Enhancing Learned Representations [4.126721111013567]
This paper introduces a novel algorithm, Aggregation of Multi Diffusion Models (AMDM).
AMDM synthesizes features from multiple diffusion models into a specified model, enhancing its learned representations to activate specific features for fine-grained control.
Experimental results demonstrate that AMDM significantly improves fine-grained control without additional training or inference time.
arXiv Detail & Related papers (2024-10-02T06:16:06Z)
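The summary leaves AMDM's aggregation operator unspecified. Purely as an illustration of fusing several diffusion models at inference time, one could combine their per-step noise predictions with fixed weights, as below; the weighted sum is an assumption, not AMDM's actual mechanism.

```python
import torch

def aggregated_eps(models, weights, x_t, t):
    """Fuse the noise predictions of several pre-trained diffusion models.

    An illustrative stand-in for AMDM's aggregation: each model predicts
    the noise for the current step, and predictions are mixed by fixed
    weights. No additional training is needed, matching the paper's claim
    of no extra training or inference cost.
    """
    assert len(models) == len(weights)
    eps = torch.zeros_like(x_t)
    with torch.no_grad():
        for model, w in zip(models, weights):
            eps = eps + w * model(x_t, t)  # each model predicts eps at step t
    return eps
```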
- Is Tokenization Needed for Masked Particle Modelling? [8.79008927474707]
Masked particle modeling (MPM) is a self-supervised learning scheme for constructing expressive representations of unordered sets.
We improve MPM by addressing inefficiencies in the implementation and incorporating a more powerful decoder.
We show that these new methods outperform the tokenized learning objective from the original MPM on a new test bed for foundation models for jets.
arXiv Detail & Related papers (2024-09-19T09:12:29Z)
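As a rough illustration of the tokenization-free objective the title alludes to, one can mask elements of a set and regress their continuous features directly, rather than predicting discrete token ids. The architecture and loss below are an assumed minimal setup, not the paper's model.

```python
import torch
import torch.nn as nn

class MaskedSetRegressor(nn.Module):
    """Minimal tokenization-free masked modeling over an unordered set."""

    def __init__(self, feat_dim=4, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.decode = nn.Linear(d_model, feat_dim)  # regress features directly

    def forward(self, x, mask):
        # x: (B, N, feat_dim) particles; mask: (B, N) bool, True = hidden.
        h = self.embed(x)
        h = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(h), h)
        # No positional encoding, so the encoder is permutation-equivariant,
        # as appropriate for unordered sets.
        return self.decode(self.encoder(h))

def mpm_loss(model, x, mask):
    # Regression on the masked elements replaces a tokenized
    # cross-entropy target.
    pred = model(x, mask)
    return ((pred[mask] - x[mask]) ** 2).mean()
```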
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to the inherent flaws of GANs and biased optimization within the latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
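At a high level, a model inversion attack searches generator inputs for samples that the target classifier confidently assigns to a private class. The gradient-based search below is a generic sketch of that objective only; Diff-MI's actual diffusion-based pipeline differs.

```python
import torch

def invert_class(generator, target_classifier, target_class,
                 latent_dim=128, steps=200, lr=0.05):
    """Generic model-inversion objective: maximize the target class logit
    of generated images (a sketch, not Diff-MI's attack)."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        logits = target_classifier(generator(z))
        loss = -logits[:, target_class].mean()  # push toward target class
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(z).detach()
```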
- Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models [17.124075103464392]
Diffusion models (DPMs) have become the state-of-the-art in high-quality image generation.
However, DPMs have an arbitrary noisy latent space with no interpretable or controllable semantics.
We propose CausalDiffAE, a diffusion-based causal representation learning framework to enable counterfactual generation.
arXiv Detail & Related papers (2024-04-27T00:09:26Z)
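Counterfactual generation with an autoencoder-style model typically follows an encode, intervene, decode pattern. The sketch below shows only that generic pattern; the `encode`/`decode` interfaces are hypothetical, and CausalDiffAE additionally propagates interventions through a learned causal model, which is omitted here.

```python
import torch

def counterfactual(encode, decode, x, factor_idx, new_value):
    """Generic encode -> intervene -> decode counterfactual sketch.

    encode: image -> latent factors z of shape (B, K); decode: z -> image.
    Setting one factor mimics a do()-style intervention.
    """
    with torch.no_grad():
        z = encode(x).clone()
        z[:, factor_idx] = new_value  # intervene on a single latent factor
        return decode(z)
```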
- AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts.
This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents.
We propose a novel method, AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs.
It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z)
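This mechanism maps naturally onto off-the-shelf adjoint ODE solvers. The sketch below drives torchdiffeq's adjoint integrator with a probability-flow-style drift; `velocity_model` and `reward` are placeholder callables, and AdjointDPM's actual ODE parameterization and noise schedule are more involved.

```python
import torch
from torchdiffeq import odeint_adjoint as odeint  # pip install torchdiffeq

class FlowODE(torch.nn.Module):
    """Probability-flow-style ODE whose drift is a learned network."""

    def __init__(self, velocity_model):
        super().__init__()
        self.velocity_model = velocity_model  # maps (t, x) -> dx/dt

    def forward(self, t, x):
        return self.velocity_model(t, x)

def reward_gradient_step(ode, x_T, reward, optimizer):
    """Integrate the ODE from noise to data, score the sample with a
    differentiable metric, and backpropagate through the adjoint method
    (constant memory in the number of solver steps)."""
    t = torch.tensor([1.0, 0.0])      # integrate from t=1 (noise) to t=0
    x_0 = odeint(ode, x_T, t)[-1]     # adjoint-enabled solve
    loss = -reward(x_0).mean()        # any differentiable metric works
    optimizer.zero_grad()
    loss.backward()                   # gradients reach ode.parameters()
    optimizer.step()
```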
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
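Guidance in sequence space generally means biasing each denoising step toward sequences that a value model scores highly. The sketch below nudges per-position token logits with the gradient of a differentiable value function taken through a softmax relaxation; this is a generic guidance pattern, not NOS's exact hidden-state procedure.

```python
import torch
import torch.nn.functional as F

def guided_logits(logits, value_fn, guidance_scale=1.0):
    """Bias per-position token logits toward high-value sequences.

    logits:   (L, V) denoiser outputs for one sequence.
    value_fn: differentiable map from soft one-hot probs (L, V) to a scalar.
    """
    logits = logits.detach().requires_grad_(True)
    probs = F.softmax(logits, dim=-1)               # continuous relaxation
    (grad,) = torch.autograd.grad(value_fn(probs), logits)
    return logits.detach() + guidance_scale * grad  # sample tokens from these
```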
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced into image deblurring and have exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff) for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Towards Controllable Diffusion Models via Reward-Guided Exploration [15.857464051475294]
We propose a novel framework that guides the training phase of diffusion models via reinforcement learning (RL).
RL enables calculating policy gradients via samples from a pay-off distribution proportional to exponentially scaled rewards, rather than from the policies themselves.
Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
arXiv Detail & Related papers (2023-04-14T13:51:26Z)
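A pay-off distribution proportional to exponentially scaled rewards suggests a reward-weighted objective in which each sample's log-likelihood gradient is weighted by a softmax over batch rewards. A minimal sketch of that idea follows; the function and its exact weighting are assumptions, not the paper's loss.

```python
import torch
import torch.nn.functional as F

def payoff_weighted_loss(log_probs, rewards, temperature=1.0):
    """Policy-gradient-style loss weighted by a pay-off distribution.

    log_probs: (B,) model log-likelihoods of sampled generations.
    rewards:   (B,) scalar rewards for those samples.
    Weights proportional to exp(reward / temperature), normalized over
    the batch, let high-reward samples dominate the gradient.
    """
    weights = F.softmax(rewards / temperature, dim=0).detach()
    return -(weights * log_probs).sum()
```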
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods usually require access to the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)