Related papers: Discrete Adjoint Matching

URL: http://arxiv.org/abs/2602.07132v1
Date: Fri, 06 Feb 2026 19:12:40 GMT
Title: Discrete Adjoint Matching
Authors: Oswin So, Brian Karrer, Chuchu Fan, Ricky T. Q. Chen, Guan-Horng Liu
Abstract summary: We propose a discrete variant of Adjoint Matching (AM) for fine-tuning discrete generative models. The core of DAM is the introduction of the discrete adjoint, an estimator of the optimal solution to the original problem. We showcase DAM's effectiveness on synthetic and mathematical reasoning tasks.
Score: 43.00097192213681
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Computational methods for solving entropy-regularized reward optimization -- a class of problems widely used for fine-tuning generative models -- have advanced rapidly. Among those, Adjoint Matching (AM, Domingo-Enrich et al., 2025) has proven highly effective in continuous state spaces with differentiable rewards. Transferring these practical successes to discrete generative modeling, however, remains particularly challenging and largely unexplored, mainly due to the drastic shift in generative model classes to discrete state spaces, which are nowhere differentiable. In this work, we propose Discrete Adjoint Matching (DAM) -- a discrete variant of AM for fine-tuning discrete generative models characterized by Continuous-Time Markov Chains, such as diffusion-based large language models. The core of DAM is the introduction of the discrete adjoint -- an estimator of the optimal solution to the original problem but formulated on discrete domains -- from which standard matching frameworks can be applied. This is derived from a purely statistical standpoint, in contrast to the control-theoretic viewpoint in AM, thereby opening up new algorithmic opportunities for general adjoint-based estimators. We showcase DAM's effectiveness on synthetic and mathematical reasoning tasks.
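
Illustration (not from the paper): the abstract does not spell out the objective, so the sketch below assumes the standard KL-regularized form, maximize E_p[r(x)] - beta * KL(p || p_base), on a tiny finite state space. Under that assumption the maximizer is the exponentially tilted distribution p*(x) proportional to p_base(x) * exp(r(x) / beta). The names tilted_distribution, objective, base, rewards, and beta are hypothetical and chosen only for this example; this is not the DAM algorithm, only the target that such fine-tuning methods are commonly described as approximating.

```python
import math


def tilted_distribution(base_probs, rewards, beta):
    """Return p*(x) proportional to base_probs[x] * exp(rewards[x] / beta)."""
    # Subtract the largest exponent before exponentiating for numerical stability.
    shift = max(r / beta for r in rewards)
    weights = [p * math.exp(r / beta - shift) for p, r in zip(base_probs, rewards)]
    total = sum(weights)
    return [w / total for w in weights]


def objective(probs, base_probs, rewards, beta):
    """Expected reward minus beta times KL(probs || base_probs)."""
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    kl = sum(p * math.log(p / q) for p, q in zip(probs, base_probs) if p > 0)
    return expected_reward - beta * kl


# Hypothetical three-state example (assumed values, not data from the paper).
base = [0.5, 0.3, 0.2]
rewards = [0.0, 1.0, 2.0]
beta = 1.0

tuned = tilted_distribution(base, rewards, beta)
```

In this toy setting the tilted distribution shifts mass toward high-reward states while staying close to the base model; a larger beta keeps it closer to base. On realistic sequence spaces the normalizer cannot be enumerated, which is why estimators such as the one the abstract describes are needed.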

Related papers

Non-Asymptotic Convergence of Discrete Diffusion Models: Masked and Random Walk dynamics [13.202844408027412]
We develop new and sharp convergence guarantees for three popular discrete diffusion models. We show that the computational complexity of each method scales linearly in the dimension, up to logarithmic factors. This study provides the first non-asymptotic convergence guarantees for these noising processes.
arXiv Detail & Related papers (2025-11-29T18:24:43Z)
Composition and Alignment of Diffusion Models using Constrained Learning [79.36736636241564]
Diffusion models have become prevalent in generative modeling due to their ability to sample from complex distributions. Two commonly used methods are: (i) alignment, which involves fine-tuning a diffusion model to align it with a reward; and (ii) composition, which combines several pre-trained diffusion models, each emphasizing a desirable attribute in the generated outputs. We propose a constrained optimization framework that unifies alignment and composition of diffusion models by enforcing that the aligned model satisfies reward constraints and/or remains close to (potentially multiple) pre-trained models.
arXiv Detail & Related papers (2025-08-26T15:06:30Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various preconditions. It converges under the sole assumption of Lipschitz continuity of the gradient on a bounded region, removing the need for other conditions commonly imposed by methods. It demonstrates its superior numerical performance compared to various state-of-the-art iterations.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
Multi-Agent Path Finding in Continuous Spaces with Projected Diffusion Models [57.45019514036948]
Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics. This work proposes a novel approach that integrates constrained optimization with diffusion models for MAPF in continuous spaces.
arXiv Detail & Related papers (2024-12-23T21:27:19Z)
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework. We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points. Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Generative diffusion models are popular in various cross-domain applications. These models hold promise in tackling complex network optimization problems. We propose a new framework for generative diffusion models called Diffusion Model-based Solution Generation.
arXiv Detail & Related papers (2024-08-13T07:56:21Z)
Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z)
ClusterDDPM: An EM clustering framework with Denoising Diffusion Probabilistic Models [9.91610928326645]
Denoising diffusion probabilistic models (DDPMs) represent a new and promising class of generative models. In this study, we introduce an innovative expectation-maximization (EM) framework for clustering using DDPMs. In the M-step, our focus lies in learning clustering-friendly latent representations for the data by employing the conditional DDPM and matching the distribution of latent representations to the mixture of Gaussian priors.
arXiv Detail & Related papers (2023-12-13T10:04:06Z)
Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models. Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
Stochastic Methods in Variational Inequalities: Ergodicity, Bias and Refinements [19.524063429548278]
Stochastic Extragradient (SEG) and Stochastic Gradient Descent Ascent (SGDA) are preeminent algorithms for min-max optimization and variational inequality problems. Our work endeavors to quantify the structures intrinsic to these algorithms. By recasting the constant step-size SEG/SGDA as time-homogeneous Markov Chains, we establish a first-of-its-kind Law of Large Numbers and a Central Limit Theorem.
arXiv Detail & Related papers (2023-06-28T18:50:07Z)
A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE. We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.