SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups
- URL: http://arxiv.org/abs/2410.02942v1
- Date: Thu, 3 Oct 2024 19:37:40 GMT
- Title: SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups
- Authors: Yongxing Zhang, Donglin Yang, Renjie Liao,
- Abstract summary: We introduce a novel discrete diffusion model that simplifies the task of learning a complicated distribution over $S_n$.
Our model achieves state-of-the-art or comparable performances on solving tasks including sorting 4-digit MNIST images.
- Score: 14.925722398371498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finite symmetric groups $S_n$ are essential in fields such as combinatorics, physics, and chemistry. However, learning a probability distribution over $S_n$ poses significant challenges due to its intractable size and discrete nature. In this paper, we introduce SymmetricDiffusers, a novel discrete diffusion model that simplifies the task of learning a complicated distribution over $S_n$ by decomposing it into learning simpler transitions of the reverse diffusion using deep neural networks. We identify the riffle shuffle as an effective forward transition and provide empirical guidelines for selecting the diffusion length based on the theory of random walks on finite groups. Additionally, we propose a generalized Plackett-Luce (PL) distribution for the reverse transition, which is provably more expressive than the PL distribution. We further introduce a theoretically grounded "denoising schedule" to improve sampling and learning efficiency. Extensive experiments show that our model achieves state-of-the-art or comparable performances on solving tasks including sorting 4-digit MNIST images, jigsaw puzzles, and traveling salesman problems. Our code is released at https://github.com/NickZhang53/SymmetricDiffusers.
Related papers
- Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV)
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z) - Discrete generative diffusion models without stochastic differential equations: a tensor network approach [1.5839621757142595]
Diffusion models (DMs) are a class of generative machine learning methods.
We show how to use networks (TNs) to efficiently define and sample such discrete models''
arXiv Detail & Related papers (2024-07-15T18:00:11Z) - Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $mathbfxsim prm post(mathbfx)propto p(mathbfx)r(mathbfx)$, in a model that consists of a diffusion generative model prior $p(mathbfx)$ and a black-box constraint or function $r(mathbfx)$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions.
We make no separation assumptions on the underlying mixture components.
We give an algorithm that draws $dmathrmpoly(k/varepsilon)$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler.
arXiv Detail & Related papers (2024-04-29T17:30:36Z) - Learning Mixtures of Gaussians Using Diffusion Models [9.118706387430883]
We give a new algorithm for learning mixtures of $k$ Gaussians to TV error.
Our approach is analytic and relies on the framework of diffusion models.
arXiv Detail & Related papers (2024-04-29T17:00:20Z) - Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Longtailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - SwinGNN: Rethinking Permutation Invariance in Diffusion Models for Graph Generation [15.977241867213516]
Diffusion models based on permutation-equivariant networks can learn permutation-invariant distributions for graph data.
We propose a non-invariant diffusion model, called $textitSwinGNN$, which employs an efficient edge-to-edge 2-WL message passing network.
arXiv Detail & Related papers (2023-07-04T10:58:42Z) - Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion
Models [54.1843419649895]
We propose a solution based on denoising diffusion probabilistic models (DDPMs)
Our motivation for choosing diffusion models over other generative models comes from the flexible internal structure of diffusion models.
Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task.
arXiv Detail & Related papers (2022-12-01T18:59:55Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $mathbfx$ in a model that consists of a prior $p(mathbfx)$ and an auxiliary constraint $c(mathbfx,mathbfy)$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z) - Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16% points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z) - Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural network (BNN)
In addition, such transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) test on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.