Progressive Checkerboards for Autoregressive Multiscale Image Generation
- URL: http://arxiv.org/abs/2602.03811v1
- Date: Tue, 03 Feb 2026 18:15:27 GMT
- Title: Progressive Checkerboards for Autoregressive Multiscale Image Generation
- Authors: David Eigen
- Abstract summary: A key challenge in autoregressive image generation is to efficiently sample independent locations in parallel. In this work we examine a flexible, fixed ordering based on progressive checkerboards for multiscale autoregressive image generation. We find evidence that in our balanced setting, a wide range of scale-up factors lead to similar results, so long as the total number of serial steps is constant.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A key challenge in autoregressive image generation is to efficiently sample independent locations in parallel, while still modeling mutual dependencies with serial conditioning. Some recent works have addressed this by conditioning between scales in a multiscale pyramid. Others have looked at parallelizing samples in a single image using regular partitions or randomized orders. In this work we examine a flexible, fixed ordering based on progressive checkerboards for multiscale autoregressive image generation. Our ordering draws samples in parallel from evenly spaced regions at each scale, maintaining full balance in all levels of a quadtree subdivision at each step. This enables effective conditioning both between and within scales. Intriguingly, we find evidence that in our balanced setting, a wide range of scale-up factors lead to similar results, so long as the total number of serial steps is constant. On class-conditional ImageNet, our method achieves competitive performance compared to recent state-of-the-art autoregressive systems with like model capacity, using fewer sampling steps.
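The abstract does not spell out the ordering itself, so the following is only an illustrative sketch of what a quadtree-balanced parallel sampling order can look like on a single scale. It swaps in the well-known Bayer (ordered-dithering) index matrix as a stand-in, not the paper's actual progressive checkerboard construction. The names `bayer_rank` and `parallel_steps`, the power-of-two grid size, and the equal-sized steps are all assumptions made for illustration.

```python
def bayer_rank(size):
    """Return a size x size grid of sampling ranks (Bayer-style, hypothetical helper).

    size must be a power of two. Consecutive ranks land in different
    quadrants at every level of the quadtree subdivision.
    """
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    grid = [[0]]
    n = 1
    while n < size:
        new = [[0] * (2 * n) for _ in range(2 * n)]
        for r in range(n):
            for c in range(n):
                base = 4 * grid[r][c]
                new[r][c] = base              # top-left quadrant
                new[r + n][c + n] = base + 1  # bottom-right (diagonal partner)
                new[r][c + n] = base + 2      # top-right
                new[r + n][c] = base + 3      # bottom-left
        grid = new
        n *= 2
    return grid


def parallel_steps(size, num_steps):
    """Split the ranked positions into num_steps equal groups.

    Positions within one group would be sampled in parallel;
    the groups themselves are visited serially.
    """
    total = size * size
    if num_steps < 1 or total % num_steps:
        raise ValueError("num_steps must evenly divide size * size")
    per_step = total // num_steps
    rank = bayer_rank(size)
    steps = [[] for _ in range(num_steps)]
    for r in range(size):
        for c in range(size):
            steps[rank[r][c] // per_step].append((r, c))
    return steps


# Usage: a 4x4 grid sampled in 4 serial steps of 4 parallel positions each.
for index, positions in enumerate(parallel_steps(4, 4)):
    print(index, positions)
```

In this sketch, whenever the step size is a power of four, each step places the same number of samples in every quadtree cell at every level, which is one way to read the "full balance" property described above. Conditioning between scales, as in the multiscale setting, is not modeled here.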
Related papers
- Laplacian Multi-scale Flow Matching for Generative Modeling [23.408491192194926]
  We present Laplacian multiscale flow matching (LapFlow), a novel framework that enhances flow matching by leveraging multi-scale representations for image generative modeling. Our approach decomposes images into Laplacian pyramid residuals and processes different scales in parallel through a mixture-of-transformers (MoT) architecture with causal attention mechanisms.
  arXiv Detail & Related papers (2026-02-23T03:09:56Z)
- Stable Virtual Camera: Generative View Synthesis with Diffusion Models [51.71244310522393]
  We present Stable Virtual Camera (Seva), a generalist diffusion model that creates novel views of a scene. Our approach overcomes these limitations through simple model design, optimized training recipe, and flexible sampling strategy. Our method can generate high-quality videos lasting up to half a minute with seamless loop closure.
  arXiv Detail & Related papers (2025-03-18T17:57:22Z)
- One Diffusion to Generate Them All [54.82732533013014]
  OneDiffusion is a versatile, large-scale diffusion model that supports bidirectional image synthesis and understanding. It enables conditional generation from inputs such as text, depth, pose, layout, and semantic maps. OneDiffusion allows for multi-view generation, camera pose estimation, and instant personalization using sequential image inputs.
  arXiv Detail & Related papers (2024-11-25T12:11:05Z)
- Implicit Grid Convolution for Multi-Scale Image Super-Resolution [6.8410780175245165]
  We propose a multi-scale framework that employs a single encoder in conjunction with Implicit Grid Convolution (IGConv). Our framework achieves comparable performance to existing fixed-scale methods while reducing the training budget and stored parameters three-fold.
  arXiv Detail & Related papers (2024-08-19T03:30:15Z)
- Learning Images Across Scales Using Adversarial Training [64.59447233902735]
  We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images. We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches.
  arXiv Detail & Related papers (2024-06-13T08:44:12Z)
- Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
  This paper presents a versatile framework for conditional image generation. It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression. Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
  arXiv Detail & Related papers (2022-07-21T22:19:17Z)
- Arbitrary-Scale Image Synthesis [149.0290830305808]
  Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales. We propose the design of scale-consistent positional encodings invariant to our generator's transformation layers. We show competitive results for a continuum of scales on various commonly used datasets for image synthesis.
  arXiv Detail & Related papers (2022-04-05T15:10:43Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
  We propose a novel diverse semantic image synthesis framework. Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
  arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Nested Scale Editing for Conditional Image Synthesis [19.245119912119947]
  We propose an image synthesis approach that provides stratified navigation in the latent code space. Given only a small partial image or a very low-resolution image, our approach consistently outperforms state-of-the-art counterparts.
  arXiv Detail & Related papers (2020-06-03T04:29:21Z)