Multi-scale Autoregressive Models are Laplacian, Discrete, and Latent Diffusion Models in Disguise
- URL: http://arxiv.org/abs/2510.02826v1
- Date: Fri, 03 Oct 2025 09:05:38 GMT
- Title: Multi-scale Autoregressive Models are Laplacian, Discrete, and Latent Diffusion Models in Disguise
- Authors: Steve Hong, Samuel Belkadi,
- Abstract summary: We revisit Visual Autoregressive models through the lens of an iterative-refinement framework.<n>We formalise it as a deterministic forward process that constructs a Laplacian-style latent pyramid, paired with a learned backward process that reconstructs it in a small number of coarse-to-fine steps.
- Score: 0.6875312133832079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We revisit Visual Autoregressive (VAR) models through the lens of an iterative-refinement framework. Rather than viewing VAR solely as next-scale autoregression, we formalise it as a deterministic forward process that constructs a Laplacian-style latent pyramid, paired with a learned backward process that reconstructs it in a small number of coarse-to-fine steps. This view connects VAR to denoising diffusion and isolates three design choices that help explain its efficiency and fidelity: refining in a learned latent space, casting prediction as discrete classification over code indices, and partitioning the task by spatial frequency. We run controlled experiments to quantify each factor's contribution to fidelity and speed, and we outline how the same framework extends to permutation-invariant graph generation and to probabilistic, ensemble-style medium-range weather forecasting. The framework also suggests practical interfaces for VAR to leverage tools from the diffusion ecosystem while retaining few-step, scale-parallel generation.
Related papers
- Beyond Parameter Arithmetic: Sparse Complementary Fusion for Distribution-Aware Model Merging [20.429700094073684]
We propose Sparse Complementary Fusion with reverse KL (SCF-RKL), a novel model merging framework that explicitly controls functional interference through sparse, distribution-aware updates.<n>We evaluate SCF-RKL across a wide range of model scales and architectures, covering both reasoning-focused and instruction-tuned models.
arXiv Detail & Related papers (2026-02-12T08:45:42Z) - Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory [30.327899232038863]
textbfHierarchical Koopman Diffusion is a novel framework that achieves both one-step sampling and interpretable generative trajectories.<n>Our framework bridges the gap between fast sampling and interpretability in diffusion models, paving the way for explainable image synthesis in generative modeling.
arXiv Detail & Related papers (2025-10-14T07:17:35Z) - Scale-Wise VAR is Secretly Discrete Diffusion [48.994983608261286]
Next scale prediction Visual Autoregressive Generation ( VAR) has recently demonstrated remarkable performance, even surpassing diffusion-based models.<n>In this work, we revisit VAR and uncover a theoretical insight: when equipped with a Markovian attention mask, VAR is mathematically equivalent to a discrete diffusion.<n>We show how one can directly import the advantages of diffusion such as iterative refinement and reduce architectural inefficiencies into VAR, yielding faster convergence, lower inference cost, and improved zero-shot reconstruction.
arXiv Detail & Related papers (2025-09-26T17:58:04Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Time Series Forecasting Using Manifold Learning [6.316185724124034]
We address a three-tier numerical framework based on manifold learning for the forecasting of high-dimensional time series.
At the first step, we embed the time series into a reduced low-dimensional space using a nonlinear manifold learning algorithm.
At the second step, we construct reduced-order regression models on the manifold to forecast the embedded dynamics.
At the final step, we lift the embedded time series back to the original high-dimensional space.
arXiv Detail & Related papers (2021-10-07T17:09:59Z) - A Variational Perspective on Diffusion-Based Generative Models and Score
Matching [8.93483643820767]
We derive a variational framework for likelihood estimation for continuous-time generative diffusion.
We show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE.
arXiv Detail & Related papers (2021-06-05T05:50:36Z) - Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z) - Self-Reflective Variational Autoencoder [21.054722609128525]
Variational Autoencoder (VAE) is a powerful framework for learning latent variable generative models.
We introduce a solution, which we call self-reflective inference.
We empirically demonstrate the clear advantages of matching the variational posterior to the exact posterior.
arXiv Detail & Related papers (2020-07-10T05:05:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.