Related papers: A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

URL: http://arxiv.org/abs/2406.01661v2
Date: Thu, 8 Aug 2024 12:17:56 GMT
Title: A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
Authors: Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner,
Abstract summary: Current deep learning approaches rely on generative models that yield exact sample likelihoods. This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models. We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.
Score: 7.378582040635655
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning to sample from intractable distributions over discrete sets without relying on corresponding training data is a central problem in a wide range of fields, including Combinatorial Optimization. Currently, popular deep learning-based approaches rely primarily on generative models that yield exact sample likelihoods. This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models like diffusion models. Our approach is conceptually based on a loss that upper bounds the reverse Kullback-Leibler divergence and evades the requirement of exact sample likelihoods. We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.

Related papers

Bridge Matching Sampler: Scalable Sampling via Generalized Fixed-Point Diffusion Matching [38.70740405520393]
Bridge Matching Sampler (BMS) enables learning a transport map between arbitrary prior and target distributions with a single, scalable, and stable objective.<n>We demonstrate that our method enables sampling at unprecedented scales while preserving mode diversity, achieving state-of-the-art results on complex synthetic densities and high-dimensional molecular benchmarks.
arXiv Detail & Related papers (2026-02-28T08:00:38Z)
Learnable Chernoff Baselines for Inference-Time Alignment [64.81256817158851]
We introduce Learnable Chernoff Baselines as a method for efficiently and approximately sampling from exponentially tilted kernels.<n>We establish total-variation guarantees to the ideal aligned model, and demonstrate in both continuous and discrete diffusion settings that LCB sampling closely matches ideal rejection sampling.
arXiv Detail & Related papers (2026-02-08T00:09:40Z)
Reverse-BSDE Monte Carlo [1.8749305679160366]
We reformulate equations governing diffusion-based generative models as a Forward-Backward Differential Equation (FBSDE)<n>We propose a numerical solution to this problem, leveraging on Deep Learning techniques.
arXiv Detail & Related papers (2025-05-11T00:42:07Z)
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics [7.873510219469276]
We introduce two novel training methods for discrete diffusion samplers. These methods yield memory-efficient training and achieve state-of-the-art results in unsupervised optimization. We introduce adaptations of SN-NIS and Neural Chain Monte Carlo that enable for the first time the application of discrete diffusion models to this problem.
arXiv Detail & Related papers (2025-02-12T18:59:55Z)
Training-free Diffusion Model Alignment with Sampling Demons [15.400553977713914]
We propose an optimization approach, dubbed Demon, to guide the denoising process at inference time without backpropagation through reward functions or model retraining. Our approach works by controlling noise distribution in denoising steps to concentrate density on regions corresponding to high rewards through optimization. To the best of our knowledge, the proposed approach is the first inference-time, backpropagation-free preference alignment method for diffusion models.
arXiv Detail & Related papers (2024-10-08T07:33:49Z)
Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues. We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution. Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints [55.39203337683045]
We propose to perform optimization within the data manifold using diffusion models.<n>Depending on the differentiability of the objective function, we propose two different sampling methods.<n>Our method achieves better or comparable performance with previous state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-28T03:09:12Z)
Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods. Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z)
Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors. Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions. We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM) Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain. We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z)
Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data. Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this approach outperforms generic samplers in a number of difficult settings. We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task. We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution. We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.