Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion
- URL: http://arxiv.org/abs/2510.04525v1
- Date: Mon, 06 Oct 2025 06:30:22 GMT
- Title: Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion
- Authors: Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji
- Abstract summary: Masked diffusion models have shown promising performance in generating high-quality samples in a wide range of domains. This paper theoretically analyzes the MaskGIT sampler for image modeling, revealing its implicit temperature sampling mechanism. We introduce the "moment sampler," which employs a "choose-then-sample" approach by selecting unmasking positions before sampling tokens.
- Score: 41.409281069230325
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Masked diffusion models have shown promising performance in generating high-quality samples in a wide range of domains, but accelerating their sampling process remains relatively underexplored. To investigate efficient samplers for masked diffusion, this paper theoretically analyzes the MaskGIT sampler for image modeling, revealing its implicit temperature sampling mechanism. Through this analysis, we introduce the "moment sampler," an asymptotically equivalent but more tractable and interpretable alternative to MaskGIT, which employs a "choose-then-sample" approach by selecting unmasking positions before sampling tokens. In addition, we improve the efficiency of choose-then-sample algorithms through two key innovations: a partial caching technique for transformers that approximates longer sampling trajectories without proportional computational cost, and a hybrid approach formalizing the exploration-exploitation trade-off in adaptive unmasking. Experiments in image and text domains demonstrate our theory as well as the efficiency of our proposed methods, advancing both theoretical understanding and practical implementation of masked diffusion samplers.
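The "choose-then-sample" idea described in the abstract can be illustrated with a minimal sketch: at each step, score the still-masked positions, choose which ones to unmask, and only then sample tokens at those positions. The code below is an assumption-laden illustration (the `model` interface, the confidence-based selection rule, and the step schedule are placeholders), not the paper's moment sampler, whose selection rule comes from its theoretical analysis of MaskGIT.

```python
import torch

def choose_then_sample(model, seq_len, num_steps, mask_id, device="cpu"):
    """Generic choose-then-sample unmasking loop (illustrative sketch).

    `model(x)` is assumed to return per-position logits of shape
    (seq_len, vocab_size); the position-selection rule here uses
    per-position confidence as a stand-in for the paper's moment-based rule.
    """
    x = torch.full((seq_len,), mask_id, dtype=torch.long, device=device)
    masked = torch.ones(seq_len, dtype=torch.bool, device=device)

    for step in range(num_steps):
        logits = model(x)                       # (seq_len, vocab_size)
        probs = logits.softmax(dim=-1)

        # 1) Choose which positions to unmask at this step
        #    (here: highest max-probability among still-masked positions).
        conf = probs.max(dim=-1).values
        conf[~masked] = -float("inf")
        n_unmask = max(1, masked.sum().item() // (num_steps - step))
        chosen = conf.topk(n_unmask).indices

        # 2) Sample tokens only at the chosen positions.
        sampled = torch.multinomial(probs[chosen], num_samples=1).squeeze(-1)
        x[chosen] = sampled
        masked[chosen] = False

        if not masked.any():
            break
    return x
```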
Related papers
- Self-Speculative Masked Diffusions [46.04054227238148]
We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data. We reduce the computational burden by generating non-factorized predictions over masked positions. We apply our method to GPT2-scale text modelling and protein sequence generation, finding that we can achieve a 2x reduction in the required number of network forward passes.
arXiv Detail & Related papers (2025-10-04T20:16:38Z) - Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking [17.511240770486452]
Masked diffusion models (MDMs) have shown competitive performance compared to autoregressive models (ARMs) for language modeling. We introduce EB-Sampler, a drop-in replacement for existing samplers, utilizing an entropy-bounded unmasking procedure. EB-Sampler accelerates sampling from current state-of-the-art MDMs by roughly 2-3x on standard coding and math reasoning benchmarks without loss in performance.
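The summary above suggests unmasking several tokens per forward pass as long as an entropy bound is respected. The sketch below shows one plausible reading, greedily unmasking low-entropy positions until a cumulative entropy budget is exhausted; the function name, the greedy ordering, and the exact form of the bound are assumptions rather than the paper's precise algorithm.

```python
import torch

def entropy_bounded_unmask_step(logits, masked, entropy_budget):
    """One unmasking step with an entropy budget (illustrative sketch).

    `logits`: (seq_len, vocab_size) model outputs for the current sequence.
    `masked`: (seq_len,) boolean mask of still-masked positions (assumed non-empty).
    Greedily selects masked positions in order of increasing entropy until the
    cumulative entropy exceeds `entropy_budget`.
    """
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    entropy[~masked] = float("inf")   # never select already-unmasked positions

    order = entropy.argsort()
    cum = torch.cumsum(entropy[order], dim=0)
    k = int((cum <= entropy_budget).sum().clamp(min=1))  # unmask at least one token
    chosen = order[:k]

    # Sample tokens only at the chosen positions.
    tokens = torch.multinomial(probs[chosen], num_samples=1).squeeze(-1)
    return chosen, tokens
```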
arXiv Detail & Related papers (2025-05-30T17:52:55Z) - Diffusing Differentiable Representations [60.72992910766525]
We introduce a novel, training-free method for sampling differentiable representations (diffreps) using pretrained diffusion models. We identify an implicit constraint on the samples induced by the diffrep and demonstrate that addressing this constraint significantly improves the consistency and detail of the generated objects.
arXiv Detail & Related papers (2024-12-09T20:42:58Z) - Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance [25.41734642338575]
Masked generative models (MGMs) have shown impressive generative ability while requiring an order of magnitude fewer sampling steps.
We propose a self-guidance sampling method, which leads to better generation quality.
arXiv Detail & Related papers (2024-10-17T01:48:05Z) - Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods. Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z) - Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation [49.827306773992376]
Continual Test-Time Adaptation (CTTA) is proposed to migrate a source pre-trained model to continually changing target distributions.
Our proposed method attains state-of-the-art performance in both classification and segmentation CTTA tasks.
arXiv Detail & Related papers (2023-12-19T15:34:52Z) - Sampling From Autoencoders' Latent Space via Quantization And Probability Mass Function Concepts [1.534667887016089]
We introduce a novel post-training sampling algorithm rooted in the concept of probability mass functions, coupled with a quantization process.
Our proposed algorithm establishes a neighborhood around each latent vector from the input data and then draws samples from these neighborhoods.
This strategic approach ensures that the sampled latent vectors predominantly inhabit high-probability regions, which, in turn, can be effectively transformed into authentic real-world images.
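A minimal sketch of the described pipeline, quantizing latents, weighting them by an empirical probability mass function, and sampling from small neighborhoods around high-probability latents, is given below; the bin count, noise scale, and weighting scheme are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sample_latents_via_pmf(train_latents, n_samples, n_bins=32, noise_scale=0.05, rng=None):
    """Post-hoc latent sampling via quantization + empirical PMF (sketch).

    `train_latents`: (N, D) latent vectors from a trained autoencoder's encoder.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Quantize each latent dimension into `n_bins` bins.
    lo, hi = train_latents.min(axis=0), train_latents.max(axis=0)
    codes = np.floor((train_latents - lo) / (hi - lo + 1e-8) * n_bins).astype(int)

    # Empirical PMF over quantized codes: weight each training latent by how
    # often its code occurs, so samples concentrate in high-probability regions.
    _, inverse, counts = np.unique(codes, axis=0, return_inverse=True, return_counts=True)
    weights = counts[inverse].astype(float)
    pmf = weights / weights.sum()

    # Draw anchor latents according to the PMF, then perturb within a small neighborhood.
    idx = rng.choice(len(train_latents), size=n_samples, p=pmf)
    noise = noise_scale * rng.standard_normal((n_samples, train_latents.shape[1]))
    return train_latents[idx] + noise
```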
arXiv Detail & Related papers (2023-08-21T13:18:12Z) - CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z) - Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction [8.44194619347218]
Undersampling is a common method in Magnetic Resonance Imaging (MRI) to reduce the number of data points acquired in k-space.
We propose a method that directly learns the undersampling masks from data points.
We show that different anatomic regions reveal distinct optimal undersampling masks.
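One common way to make an undersampling mask learnable is to parameterize per-location Bernoulli probabilities and draw relaxed (differentiable) mask samples during training. The sketch below follows that generic recipe with an expected-rate penalty; the parameterization and constraint are assumptions rather than the paper's specific formulation.

```python
import torch

class ProbabilisticMask(torch.nn.Module):
    """Learnable per-location k-space sampling probabilities (illustrative sketch)."""

    def __init__(self, shape, target_rate=0.25):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(shape))
        self.target_rate = target_rate  # desired fraction of sampled k-space points

    def forward(self, temperature=0.5):
        probs = torch.sigmoid(self.logits)
        # Relaxed Bernoulli sample (differentiable surrogate for a hard 0/1 mask).
        u = torch.rand_like(probs).clamp(1e-6, 1 - 1e-6)
        logistic_noise = u.log() - (1 - u).log()
        mask = torch.sigmoid((probs.log() - (1 - probs).log() + logistic_noise) / temperature)
        # Soft constraint pushing the expected sampling rate toward the target.
        rate_penalty = (probs.mean() - self.target_rate) ** 2
        return mask, rate_penalty
```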
arXiv Detail & Related papers (2023-05-25T14:42:04Z)