Related papers: Categorical Reparameterization with Denoising Diffusion models

URL: http://arxiv.org/abs/2601.00781v1
Date: Fri, 02 Jan 2026 18:30:05 GMT
Title: Categorical Reparameterization with Denoising Diffusion models
Authors: Samson Gourevitch, Alain Durmus, Eric Moulines, Jimmy Olsson, Yazid Janati
Abstract summary: We introduce a diffusion-based soft reparameterization for categorical distributions. Our experiments show that the proposed reparameterization trick yields competitive or improved optimization performance on various benchmarks.
Score: 33.643089978457155
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Gradient-based optimization with categorical variables typically relies on score-function estimators, which are unbiased but noisy, or on continuous relaxations that replace the discrete distribution with a smooth surrogate admitting a pathwise (reparameterized) gradient, at the cost of optimizing a biased, temperature-dependent objective. In this paper, we extend this family of relaxations by introducing a diffusion-based soft reparameterization for categorical distributions. For these distributions, the denoiser under a Gaussian noising process admits a closed form and can be computed efficiently, yielding a training-free diffusion sampler through which we can backpropagate. Our experiments show that the proposed reparameterization trick yields competitive or improved optimization performance on various benchmarks.
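
To illustrate why a closed-form denoiser is plausible here, the following is a minimal sketch, not the paper's sampler. It assumes the categorical variable is encoded as a one-hot vector x_0 and noised as x_t = alpha * x_0 + sigma * eps with standard Gaussian eps; the names `alpha`, `sigma`, `categorical_denoiser`, and `brute_force_denoiser` are illustrative and do not come from the paper. Under these assumptions, Bayes' rule gives the posterior mean E[x_0 | x_t] as a softmax of log-probabilities shifted by alpha * x_t / sigma^2, because every one-hot vector has the same norm and the quadratic term cancels.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_denoiser(x_t, probs, alpha, sigma):
    """Posterior mean E[x_0 | x_t] for one-hot x_0 ~ Categorical(probs),
    assuming x_t = alpha * x_0 + sigma * eps with eps ~ N(0, I).
    Requires every entry of probs to be strictly positive."""
    logits = [math.log(p) + alpha * x / sigma ** 2 for p, x in zip(probs, x_t)]
    return softmax(logits)


def brute_force_denoiser(x_t, probs, alpha, sigma):
    """Same quantity computed directly from Bayes' rule, used as a check."""
    weights = []
    for j, p in enumerate(probs):
        # Squared distance between x_t and the scaled j-th one-hot vector.
        sq_dist = sum(
            (x - alpha * (1.0 if i == j else 0.0)) ** 2
            for i, x in enumerate(x_t)
        )
        weights.append(p * math.exp(-sq_dist / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


probs = [0.2, 0.5, 0.3]   # hypothetical class probabilities
x_t = [0.1, 0.9, -0.4]    # hypothetical noisy observation
closed = categorical_denoiser(x_t, probs, alpha=0.8, sigma=0.6)
brute = brute_force_denoiser(x_t, probs, alpha=0.8, sigma=0.6)
```

Because the softmax is differentiable in both `probs` and `x_t`, a sampler built from repeated applications of such a denoiser can in principle be differentiated end to end, which is the property the abstract relies on. The actual noise schedule and sampler used by the authors may differ from this sketch.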

Related papers

On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization [57.179679246370114]
A potential limitation of existing methods is the bias inherent in most perturbation estimators unless a stepsize is proposed. We propose a novel family of unbiased gradient scaling estimators that eliminate bias while maintaining favorable construction.
arXiv Detail & Related papers (2025-10-22T18:25:43Z)
Optimization-Free Diffusion Model -- A Perturbation Theory Approach [12.756355928431455]
Diffusion models have emerged as a powerful framework in generative modeling. We propose an alternative method that is both optimization-free and forward SDE-free. We demonstrate the effectiveness of our method on high-dimensional Boltzmann distributions and real-world datasets.
arXiv Detail & Related papers (2025-05-29T17:02:26Z)
Implicit Diffusion: Efficient Optimization through Stochastic Sampling [46.049117719591635]
We present a new algorithm to optimize distributions defined implicitly by parameterized diffusions. We introduce a general framework for first-order optimization of these processes, that performs jointly. We apply it to training energy-based models and finetuning denoising diffusions.
arXiv Detail & Related papers (2024-02-08T08:00:11Z)
Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers. We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z)
Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain. We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions. We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence [17.665255113864795]
We present a novel divergence-like metric which corresponds to the upper bound of the Kullback-Leibler divergence (KLD) of a relaxed categorical distribution. We also propose a relaxed categorical analytic bound variational autoencoder (ReCAB-VAE) that successfully models both continuous and relaxed latent representations.
arXiv Detail & Related papers (2022-05-09T08:11:46Z)
Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference. Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference. Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
Reliable Categorical Variational Inference with Mixture of Discrete Normalizing Flows [10.406659081400354]
Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling. Continuous relaxations, such as the Gumbel-Softmax for categorical distribution, enable gradient-based optimization, but do not define a valid probability mass for discrete observations. In practice, selecting the amount of relaxation is difficult and one needs to optimize an objective that does not align with the desired one.
arXiv Detail & Related papers (2020-06-28T10:39:39Z)
Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables [16.643346012854156]
Estimating the gradients of nodes is one of the crucial research questions in the deep generative modeling community. This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation.
arXiv Detail & Related papers (2020-03-04T01:13:15Z)
Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck. We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian. We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.