Normalizing Flows with Multi-Scale Autoregressive Priors
- URL: http://arxiv.org/abs/2004.03891v1
- Date: Wed, 8 Apr 2020 09:07:11 GMT
- Title: Normalizing Flows with Multi-Scale Autoregressive Priors
- Authors: Shweta Mahajan, Apratim Bhattacharyya, Mario Fritz, Bernt Schiele,
Stefan Roth
- Abstract summary: We introduce channel-wise dependencies in the latent space of flow-based models through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
- Score: 131.895570212956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flow-based generative models are an important class of exact inference models
that admit efficient inference and sampling for image synthesis. Owing to the
efficiency constraints on the design of the flow layers, e.g. split coupling
flow layers in which approximately half the pixels do not undergo further
transformations, they have limited expressiveness for modeling long-range data
dependencies compared to autoregressive models that rely on conditional
pixel-wise generation. In this work, we improve the representational power of
flow-based models by introducing channel-wise dependencies in their latent
space through multi-scale autoregressive priors (mAR). Our mAR prior for models
with split coupling flow layers (mAR-SCF) can better capture dependencies in
complex multimodal data. The resulting model achieves state-of-the-art density
estimation results on MNIST, CIFAR-10, and ImageNet. Furthermore, we show that
mAR-SCF allows for improved image generation quality, with gains in FID and
Inception scores compared to state-of-the-art flow-based models.
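For intuition, here is a minimal sketch (an illustration, not the authors' released code) of the two ingredients the abstract combines: an affine split-coupling step, in which half the channels pass through unchanged, and a channel-wise autoregressive prior over the resulting latents. All class names and hyperparameters below are assumptions for this example, and a masked 1x1 convolution stands in for the learned autoregressive network used in mAR-SCF.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineCoupling(nn.Module):
    """Split-coupling step: the first half of the channels is left
    unchanged and parameterizes an affine transform of the second half."""
    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),  # -> (log-scale, shift)
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)
        log_s, t = self.net(xa).chunk(2, dim=1)
        log_s = torch.tanh(log_s)               # keep scales well-conditioned
        yb = xb * log_s.exp() + t
        log_det = log_s.flatten(1).sum(-1)      # log |det J| of the affine map
        return torch.cat([xa, yb], dim=1), log_det

class ChannelARPrior(nn.Module):
    """Channel-wise autoregressive prior: latent channel i is Gaussian
    with mean and log-scale predicted from channels 0..i-1 (a masked
    1x1 convolution here; the paper uses a learned recurrent network)."""
    def __init__(self, channels: int):
        super().__init__()
        mask = torch.tril(torch.ones(channels, channels), diagonal=-1)
        self.register_buffer("mask", mask.view(channels, channels, 1, 1))
        self.w_mu = nn.Parameter(torch.zeros(channels, channels, 1, 1))
        self.w_ls = nn.Parameter(torch.zeros(channels, channels, 1, 1))

    def log_prob(self, z):
        mu = F.conv2d(z, self.w_mu * self.mask)
        log_sigma = F.conv2d(z, self.w_ls * self.mask)
        ll = -0.5 * (((z - mu) / log_sigma.exp()) ** 2
                     + 2 * log_sigma + math.log(2 * math.pi))
        return ll.flatten(1).sum(-1)

# Change of variables: log p(x) = log p_prior(z) + log |det J|
x = torch.randn(2, 8, 16, 16)
flow, prior = AffineCoupling(8), ChannelARPrior(8)
z, log_det = flow(x)
log_px = prior.log_prob(z) + log_det
```

The design point the abstract makes is visible here: replacing the usual factorized Gaussian prior with an autoregressive one lets the untransformed half of the split retain dependencies on the rest of the latents.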
Related papers
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis [62.06970466554273]
We present Meissonic, which elevates non-autoregressive masked image modeling (MIM) for text-to-image synthesis to a level comparable with state-of-the-art diffusion models like SDXL.
We leverage high-quality training data, integrate micro-conditions informed by human preference scores, and employ feature compression layers to further enhance image fidelity and resolution.
Our model not only matches but often exceeds the performance of existing models like SDXL in generating high-quality, high-resolution images.
arXiv Detail & Related papers (2024-10-10T17:59:17Z)
- Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z)
- Boosting Flow-based Generative Super-Resolution Models via Learned Prior [8.557017814978334]
Flow-based super-resolution (SR) models have demonstrated astonishing capabilities in generating high-quality images.
These methods encounter several challenges during image generation, such as grid artifacts, exploding inverses, and suboptimal results due to a fixed sampling temperature.
This work introduces a conditional learned prior to the inference phase of a flow-based SR model.
arXiv Detail & Related papers (2024-03-16T18:04:12Z)
- Poisson flow consistency models for low-dose CT image denoising [3.6218104434936658]
We introduce a novel image denoising technique which combines the flexibility afforded by Poisson flow generative models (PFGM++) with the high-quality, single-step sampling of consistency models.
Our results indicate that the added flexibility of tuning the hyperparameter $D$, the dimensionality of the augmentation variables in PFGM++, allows us to outperform consistency models.
arXiv Detail & Related papers (2024-02-13T01:39:56Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves the sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, achieving a speedup in computation compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and have exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff) for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- RG-Flow: A hierarchical and explainable flow model based on renormalization group and sparse prior [2.274915755738124]
Flow-based generative models have become an important class of unsupervised learning approaches.
In this work, we incorporate the key ideas of renormalization group (RG) and sparse prior distribution to design a hierarchical flow-based generative model, RG-Flow.
Our proposed method has $O(\log L)$ complexity for inpainting of an image with edge length $L$, compared to previous generative models with $O(L^2)$ complexity.
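As a back-of-the-envelope reading of this complexity claim (an illustration, not the paper's derivation): inpainting with a pixel-autoregressive model requires one sequential step per pixel, i.e. $O(L^2)$ steps for an $L \times L$ image, whereas a hierarchy with one level per factor-of-two rescaling has only $\log_2 L$ levels. For $L = 256$ that is $256^2 = 65{,}536$ sequential steps versus $\log_2 256 = 8$ levels.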
arXiv Detail & Related papers (2020-09-30T18:04:04Z)
- Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow [16.41460104376002]
We introduce subset flows, a class of flows that can transform finite volumes and allow exact computation of likelihoods for discrete data.
We identify ordinal discrete autoregressive models, including WaveNets, PixelCNNs, and Transformers, as single-layer flows (see the sketch after this entry).
We demonstrate state-of-the-art results on CIFAR-10 for flow models trained with dequantization.
arXiv Detail & Related papers (2020-02-06T22:58:51Z)
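To make the identification of ordinal autoregressive models with single-layer flows concrete, here is a minimal sketch of the general idea (the function name and tensor layout below are assumptions, not the paper's code): the conditional CDFs map each symbol $x_i$ to the subinterval $[F(x_i - 1), F(x_i))$ of $[0, 1]$, and the exact likelihood of a discrete sequence is the volume of the resulting box, which reduces to the usual product of conditionals.

```python
import torch

def subset_flow_log_prob(x: torch.Tensor, cond_probs: torch.Tensor) -> torch.Tensor:
    """Exact discrete log-likelihood, read as a flow over subsets.

    x:          (B, D) long tensor of symbols in {0, ..., K-1}
    cond_probs: (B, D, K) conditionals p(x_i = k | x_<i); rows sum to 1
    """
    cdf = cond_probs.cumsum(-1)                               # F(k) per position
    pmf = cond_probs.gather(-1, x.unsqueeze(-1)).squeeze(-1)  # p(x_i | x_<i)
    upper = cdf.gather(-1, x.unsqueeze(-1)).squeeze(-1)       # F(x_i)
    lower = upper - pmf                                       # F(x_i - 1)
    # Volume of the box = product of interval lengths = product of conditionals.
    return (upper - lower).clamp_min(1e-12).log().sum(-1)
```

The interval length $F(x_i) - F(x_i - 1)$ equals exactly $p(x_i \mid x_{<i})$, so the likelihood is exact and no dequantization noise is required.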