Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
- URL: http://arxiv.org/abs/2402.17699v2
- Date: Thu, 24 Oct 2024 18:01:24 GMT
- Title: Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
- Authors: Patrick Pynadath, Riddhiman Bhattacharya, Arun Hariharan, Ruqi Zhang
- Abstract summary: We propose an automatic cyclical scheduling for efficient and accurate sampling in multimodal discrete distributions.
We prove the non-asymptotic convergence and inference guarantee for our method in general discrete distributions.
- Score: 8.017203108408973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrete distributions, particularly in high-dimensional deep models, are often highly multimodal due to inherent discontinuities. While gradient-based discrete sampling has proven effective, it is susceptible to becoming trapped in local modes due to the gradient information. To tackle this challenge, we propose an automatic cyclical scheduling, designed for efficient and accurate sampling in multimodal discrete distributions. Our method contains three key components: (1) a cyclical step size schedule where large steps discover new modes and small steps exploit each mode; (2) a cyclical balancing schedule, ensuring "balanced" proposals for given step sizes and high efficiency of the Markov chain; and (3) an automatic tuning scheme for adjusting the hyperparameters in the cyclical schedules, allowing adaptability across diverse datasets with minimal tuning. We prove the non-asymptotic convergence and inference guarantee for our method in general discrete distributions. Extensive experiments demonstrate the superiority of our method in sampling complex multimodal discrete distributions.
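As a rough, hypothetical illustration of component (1), the sketch below implements a cosine-shaped cyclical step size of the kind used in cyclical MCMC methods; the function and parameter names are ours, and the paper's exact schedule, balancing function, and automatic tuning rule may differ.

```python
# Illustrative sketch only: a cosine-shaped cyclical step-size schedule.
# Large steps at the start of each cycle explore new modes; small steps
# near the end of each cycle exploit the current mode.
import numpy as np

def cyclical_step_sizes(n_steps, n_cycles, alpha_max, alpha_min):
    cycle_len = int(np.ceil(n_steps / n_cycles))
    k = np.arange(n_steps)
    phase = (k % cycle_len) / cycle_len  # position within the cycle, in [0, 1)
    return alpha_min + 0.5 * (alpha_max - alpha_min) * (1.0 + np.cos(np.pi * phase))

steps = cyclical_step_sizes(n_steps=1000, n_cycles=5, alpha_max=2.0, alpha_min=0.1)
print(steps[0], steps[199], steps[200])  # large at cycle starts, small at cycle ends
```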
Related papers
- Enhancing Gradient-based Discrete Sampling via Parallel Tempering [8.195708231156546]
A gradient-based discrete sampler is susceptible to getting trapped in local minima in high-dimensional, multimodal discrete distributions.
We develop a sampler that combines parallel tempering, also known as replica exchange, with the discrete Langevin proposal.
We show that our algorithm converges non-asymptotically to the target energy and exhibits faster mixing compared to a single chain.
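As context for how such a combination operates, here is a minimal sketch of the standard replica-exchange swap step for an energy-based model $\pi_\beta(x) \propto \exp(-\beta E(x))$; the names are illustrative and the paper's actual swap rule for discrete Langevin chains may differ.

```python
# Standard Metropolis swap between two chains at inverse temperatures
# beta_i and beta_j; correct for pi_beta(x) proportional to exp(-beta * E(x)).
import numpy as np

rng = np.random.default_rng(0)

def maybe_swap(x_i, x_j, beta_i, beta_j, energy):
    log_accept = (beta_i - beta_j) * (energy(x_i) - energy(x_j))
    if np.log(rng.random()) < log_accept:
        return x_j, x_i  # swap the two chain states
    return x_i, x_j

# Toy quadratic energy on binary vectors
energy = lambda x: float(np.sum((x - 0.5) ** 2))
x_cold, x_hot = maybe_swap(np.zeros(4), np.ones(4), beta_i=1.0, beta_j=0.5, energy=energy)
```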
arXiv Detail & Related papers (2025-02-26T15:51:15Z)
- SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems [14.2814208019426]
Diffusion models (DMs) are a class of generative models that allow sampling from a distribution learned over a training set.
DMs are typically modified to approximately sample from a measurement-conditioned distribution in the image space.
These modifications may be unsuitable for certain settings (such as in the presence of measurement noise) and non-linear tasks.
We state three conditions for achieving measurement-consistent diffusion trajectories.
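The three conditions are not reproduced in this summary. For background only, one common modification in earlier work (in the style of diffusion posterior sampling, not SITCOM's method) is a data-fidelity gradient step; the sketch below is hypothetical and assumes a linear forward operator A.

```python
# Background sketch (NOT SITCOM): nudge the current diffusion iterate toward
# measurement consistency by descending ||y - A @ x0_hat||^2. As a
# simplification, the gradient is taken w.r.t. x0_hat; methods like DPS
# differentiate through the denoiser w.r.t. x_t instead.
import numpy as np

def guided_update(x_t, x0_hat, y, A, zeta=1.0):
    grad = A.T @ (A @ x0_hat - y)  # gradient of the data-fidelity term
    return x_t - zeta * grad

A = np.eye(4)[:2]          # toy 2x4 linear measurement operator
y = np.array([1.0, 2.0])   # observed measurements
x_next = guided_update(np.zeros(4), np.zeros(4), y, A, zeta=0.5)
```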
arXiv Detail & Related papers (2024-10-06T13:39:36Z)
- Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density [70.14884528360199]
We introduce an approach to bias deep generative models, such as GANs and diffusion models, towards generating data with enhanced fidelity or increased diversity.
Our approach involves manipulating the distribution of training and generated data through a novel metric for individual samples, named pseudo density.
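The summary does not define pseudo density. A plausible minimal sketch, assuming a k-nearest-neighbor estimate over feature embeddings (the paper's actual metric may differ):

```python
# Hypothetical sketch: score each sample by an inverse k-NN distance in
# feature space, so samples in dense regions of the real data score higher.
import numpy as np

def knn_pseudo_density(features, query, k=5):
    dists = np.linalg.norm(features - query, axis=1)
    kth = np.partition(dists, k)[k]  # distance to the k-th nearest neighbor
    return 1.0 / (kth + 1e-12)       # inverse distance as a density proxy

real_feats = np.random.default_rng(0).normal(size=(1000, 16))
score = knn_pseudo_density(real_feats, real_feats[0], k=5)
```

Biasing generation toward high scores would favor fidelity; toward low scores, diversity.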
arXiv Detail & Related papers (2024-07-11T16:46:04Z)
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models [63.927438959502226]
Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond.
A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations through large neural networks.
We propose a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs.
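The optimized schedules themselves are model- and dataset-specific and are not given here. As background, the object being optimized is a discretization of noise levels; the EDM-style power schedule below, from prior work, is one common parameterization of that object (shown purely for illustration).

```python
# Background sketch: an EDM-style power schedule interpolating between
# sigma_max and sigma_min in sigma^(1/rho) space; rho > 1 concentrates
# steps at low noise levels.
import numpy as np

def power_schedule(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    t = np.linspace(0.0, 1.0, n)
    inv = sigma_max ** (1 / rho) + t * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))
    return inv ** rho

sigmas = power_schedule(n=10)  # 10 sampling steps, densest near low noise
```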
arXiv Detail & Related papers (2024-04-22T18:18:41Z)
- Space-Time Diffusion Bridge [0.4527270266697462]
We introduce a novel method for generating synthetic samples that are independent and identically distributed draws from real probability distributions.
We use space-time mixing strategies that extend across temporal and spatial dimensions.
We validate the efficacy of our space-time diffusion approach with numerical experiments.
arXiv Detail & Related papers (2024-02-13T23:26:11Z)
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
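As a hedged illustration of the robustness notion only (not the paper's GP-based algorithm), the worst-case expectation over a KL uncertainty ball admits a standard one-dimensional dual, which is cheap to evaluate numerically:

```python
# Illustrative sketch: worst-case expected value over {P : KL(P || P0) <= delta}
# via the scalar dual  max_{lam > 0}  -lam * log E_P0[exp(-V/lam)] - lam * delta.
import numpy as np
from scipy.optimize import minimize_scalar

def kl_robust_value(values, probs, delta):
    def neg_dual(lam):
        m = np.max(-values / lam)  # log-sum-exp shift for stability
        lse = m + np.log(np.sum(probs * np.exp(-values / lam - m)))
        return lam * lse + lam * delta  # negated dual objective
    res = minimize_scalar(neg_dual, bounds=(1e-6, 1e6), method="bounded")
    return -res.fun

v = np.array([1.0, 2.0, 3.0])
p = np.array([0.3, 0.4, 0.3])
print(kl_robust_value(v, p, delta=0.1))  # below the nominal mean of 2.0
```

In a robust Bellman backup, `values` would be next-state values and `probs` the nominal transition probabilities.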
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
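For context, vanilla AIS (the baseline a constant-rate variant adapts) anneals from a tractable $p_0$ to the target $p_T$ along geometric bridges $p_t \propto p_0^{1-\beta_t} p_T^{\beta_t}$. A minimal sketch with toy one-dimensional densities of our choosing:

```python
# Minimal vanilla AIS sketch (not the paper's adaptive algorithm).
import numpy as np

rng = np.random.default_rng(0)

def log_p0(x): return -0.5 * x**2                   # standard normal (unnormalized)
def log_pT(x): return -0.5 * ((x - 3.0) / 0.5)**2   # shifted, narrower target

def ais(n_particles=500, betas=np.linspace(0.0, 1.0, 50)):
    x = rng.normal(size=n_particles)  # exact samples from p0
    log_w = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # importance-weight update: ratio of successive bridge densities at x
        log_w += (b - b_prev) * (log_pT(x) - log_p0(x))
        # one Metropolis step targeting the current bridge p_b
        prop = x + rng.normal(scale=0.5, size=n_particles)
        log_ratio = ((1 - b) * (log_p0(prop) - log_p0(x))
                     + b * (log_pT(prop) - log_pT(x)))
        accept = np.log(rng.random(n_particles)) < log_ratio
        x = np.where(accept, prop, x)
    return x, log_w

samples, log_w = ais()  # weighted samples approximating the target
```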
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Aiming towards the minimizers: fast convergence of SGD for overparametrized problems [25.077446336619378]
We propose a regularity regime which endows the stochastic gradient method with the same worst-case complexity as the deterministic gradient method.
All existing guarantees require the stochastic gradient method to take small steps, thereby resulting in a much slower linear rate of convergence.
We demonstrate that our condition holds when training sufficiently wide feedforward neural networks with a linear output layer.
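A toy demonstration of this interpolation setting (an illustrative sketch, not the paper's experiment): on a consistent overparametrized least-squares problem, single-sample SGD with a large constant step, in the style of randomized Kaczmarz, drives the training loss to zero quickly.

```python
# Toy sketch: single-sample SGD with a constant step on an interpolating
# (consistent, underdetermined) least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # more parameters than data points
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d)          # consistent system: a zero-loss solution exists

x = np.zeros(d)
L = np.max(np.sum(A**2, axis=1))    # per-row smoothness bound
for _ in range(2000):
    i = rng.integers(n)
    x -= (1.0 / L) * (A[i] @ x - b[i]) * A[i]  # large constant step
print(np.linalg.norm(A @ x - b))    # near zero: the iterates interpolate the data
```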
arXiv Detail & Related papers (2023-06-05T05:21:01Z)
- AutoSampling: Search for Effective Data Sampling Schedules [118.20014773014671]
We propose an AutoSampling method to automatically learn sampling schedules for model training.
We apply our method to a variety of image classification tasks, illustrating the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-05-28T09:39:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.