Normalizing Flows are Capable Generative Models
- URL: http://arxiv.org/abs/2412.06329v2
- Date: Tue, 10 Dec 2024 03:19:52 GMT
- Title: Normalizing Flows are Capable Generative Models
- Authors: Shuangfei Zhai, Ruixiang Zhang, Preetum Nakkiran, David Berthelot, Jiatao Gu, Huangjie Zheng, Tianrong Chen, Miguel Angel Bautista, Navdeep Jaitly, Josh Susskind
- Abstract summary: TarFlow is a simple and scalable architecture that enables highly performant NF models. It is straightforward to train end-to-end, and capable of directly modeling and generating pixels. TarFlow sets new state-of-the-art results on likelihood estimation for images, beating the previous best methods by a large margin.
- Score: 48.31226028595099
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Normalizing Flows (NFs) are likelihood-based models for continuous inputs. They have demonstrated promising results on both density estimation and generative modeling tasks, but have received relatively little attention in recent years. In this work, we demonstrate that NFs are more powerful than previously believed. We present TarFlow: a simple and scalable architecture that enables highly performant NF models. TarFlow can be thought of as a Transformer-based variant of Masked Autoregressive Flows (MAFs): it consists of a stack of autoregressive Transformer blocks on image patches, alternating the autoregression direction between layers. TarFlow is straightforward to train end-to-end, and capable of directly modeling and generating pixels. We also propose three key techniques to improve sample quality: Gaussian noise augmentation during training, a post training denoising procedure, and an effective guidance method for both class-conditional and unconditional settings. Putting these together, TarFlow sets new state-of-the-art results on likelihood estimation for images, beating the previous best methods by a large margin, and generates samples with quality and diversity comparable to diffusion models, for the first time with a stand-alone NF model. We make our code available at https://github.com/apple/ml-tarflow.
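The architecture described in the abstract (a stack of patch-level autoregressive Transformer blocks with alternating direction) maps onto a Masked Autoregressive Flow quite directly. The PyTorch sketch below is a minimal illustration under assumed names and shapes (ARFlowBlock, to_mu_alpha, a learned start token), not the authors' implementation: a causal Transformer predicts a per-patch shift mu and log-scale alpha from the preceding patches, giving an exactly invertible affine map with a triangular Jacobian and hence a cheap log-determinant.

```python
# Minimal sketch of a TarFlow-style autoregressive flow block.
# All names, sizes, and defaults here are illustrative assumptions.
import torch
import torch.nn as nn

class ARFlowBlock(nn.Module):
    def __init__(self, dim=64, heads=4, layers=2, flip=False):
        super().__init__()
        self.flip = flip  # alternate autoregression direction between blocks
        enc_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        self.start = nn.Parameter(torch.zeros(1, 1, dim))  # context for t=0
        self.to_mu_alpha = nn.Linear(dim, 2 * dim)

    def forward(self, x):  # x: (batch, patches, dim) of patch embeddings
        if self.flip:
            x = torch.flip(x, dims=[1])
        b, t, d = x.shape
        # Shift right so the parameters for patch t see only patches < t.
        ctx = torch.cat([self.start.expand(b, 1, d), x[:, :-1]], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(x.device)
        h = self.encoder(ctx, mask=mask)
        mu, alpha = self.to_mu_alpha(h).chunk(2, dim=-1)
        z = (x - mu) * torch.exp(-alpha)   # affine autoregressive update
        logdet = -alpha.sum(dim=(1, 2))    # change-of-variables term
        if self.flip:
            z = torch.flip(z, dims=[1])
        return z, logdet
```

Stacking several such blocks with `flip` alternating gives a MAF-style model: the density (forward) pass is parallel across patches, while naive sampling must decode one patch at a time, which is the cost the GS-Jacobi entry below targets.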
Related papers
- STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis [44.2114053357308]
We present a scalable generative model based on normalizing flows that achieves strong performance in high-resolution image synthesis. The core of STARFlow is Transformer Autoregressive Flow (TARFlow), which combines the expressive power of normalizing flows with the structured modeling capabilities of Autoregressive Transformers.
arXiv Detail & Related papers (2025-06-06T17:58:39Z) - LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation [1.1847464266302488]
Flow Matching (FM) is a powerful generative modeling paradigm based on a simulation-free training objective, instead of the score-based objective used in diffusion models (DMs). We present Learned Distribution-guided Flow Matching (LeDiFlow), a novel scalable method for training FM-based image generation models. Our method utilizes a State-Of-The-Art (SOTA) transformer architecture combined with latent space sampling and can be trained on a consumer workstation.
arXiv Detail & Related papers (2025-05-27T05:07:37Z) - Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
arXiv Detail & Related papers (2025-05-19T17:59:42Z) - Accelerate TarFlow Sampling with GS-Jacobi Iteration [10.411098875443043]
We show that, through a series of optimization strategies, TarFlow sampling can be greatly accelerated using the Gauss-Seidel-Jacobi (abbreviated as GS-Jacobi) iteration method; a toy version of this idea is sketched after this list. Experiments on four TarFlow models demonstrate that GS-Jacobi sampling can significantly enhance sampling efficiency while maintaining the quality of generated images.
arXiv Detail & Related papers (2025-05-19T08:35:44Z) - Gaussian Mixture Flow Matching Models [51.976452482535954]
Diffusion models approximate the denoising distribution as a Gaussian and predict its mean, whereas flow matching models reparameterize the Gaussian mean as flow velocity. Both underperform in few-step sampling due to discretization error and tend to produce over-saturated colors under classifier-free guidance (CFG).
We introduce a novel probabilistic guidance scheme that mitigates the over-saturation issues of CFG and improves image generation quality.
arXiv Detail & Related papers (2025-04-07T17:59:42Z) - Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling [49.215957313126324]
Recently developed generative methods, including invertible rescaling network (IRN) based and generative adversarial network (GAN) based methods, have demonstrated exceptional performance in image rescaling.
However, IRN-based methods tend to produce over-smoothed results, while GAN-based methods easily generate fake details.
We propose Boundary-aware Decoupled Flow Networks (BDFlow) to generate realistic and visually pleasing results.
arXiv Detail & Related papers (2024-05-05T14:05:33Z) - PaddingFlow: Improving Normalizing Flows with Padding-Dimensional Noise [4.762593660623934]
We propose PaddingFlow, a novel dequantization method, which improves normalizing flows with padding-dimensional noise.
We validate our method on the main benchmarks of unconditional density estimation.
The results show that PaddingFlow performs better across all of the experiments in the paper.
arXiv Detail & Related papers (2024-03-13T03:28:39Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, with a substantial speedup in computation compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Improving and generalizing flow-based generative models with minibatch optimal transport [90.01613198337833]
We introduce the generalized conditional flow matching (CFM) technique for continuous normalizing flows (CNFs).
CFM features a stable regression objective like that used to train the flow in diffusion models but enjoys the efficient inference of deterministic flow models.
A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference.
arXiv Detail & Related papers (2023-02-01T14:47:17Z) - Flow Matching for Generative Modeling [44.66897082688762]
Flow Matching is a simulation-free approach for training Continuous Normalizing Flows (CNFs); a toy version of the objective is sketched after this list.
We find that employing FM with diffusion paths results in a more robust and stable alternative for training diffusion models.
Training CNFs using Flow Matching on ImageNet leads to state-of-the-art performance in terms of both likelihood and sample quality.
arXiv Detail & Related papers (2022-10-06T08:32:20Z) - SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds [15.476426879806134]
Flow-based generative models are composed of invertible transformations between two random variables of the same dimension.
In this paper, we propose SoftFlow, a probabilistic framework for training normalizing flows on manifolds.
We experimentally show that SoftFlow can capture the innate structure of the manifold data and generate high-quality samples.
We apply the proposed framework to 3D point clouds to alleviate the difficulty of forming thin structures for flow-based models.
arXiv Detail & Related papers (2020-06-08T13:56:07Z) - Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow [16.41460104376002]
We introduce subset flows, a class of flows that can transform finite volumes and allow exact computation of likelihoods for discrete data.
We identify ordinal discrete autoregressive models, including WaveNets, PixelCNNs and Transformers, as single-layer flows.
We demonstrate state-of-the-art results on CIFAR-10 for flow models trained with dequantization.
arXiv Detail & Related papers (2020-02-06T22:58:51Z)
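To make the GS-Jacobi acceleration entry above concrete: sampling from the affine autoregressive block sketched earlier means inverting z = (x - mu(x)) * exp(-alpha(x)), which is normally done one patch at a time. Because mu and alpha are causal, a Jacobi-style fixed-point iteration that refreshes every patch in parallel is exact after at most one pass per patch and in practice converges much sooner. The sketch below reuses the hypothetical ARFlowBlock from above (direction flipping omitted for brevity) and shows plain Jacobi iteration, not the paper's tuned GS-Jacobi scheme.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def jacobi_invert(block, z, iters=50):
    # Parallel fixed-point inversion of z = (x - mu(x)) * exp(-alpha(x)).
    # Since mu_t and alpha_t depend only on patches < t, the first k patches
    # are exact after k iterations, so iters = number of patches always
    # suffices; far fewer are typically needed.
    x = z.clone()
    b, t, d = z.shape
    mask = nn.Transformer.generate_square_subsequent_mask(t).to(z.device)
    for _ in range(iters):
        ctx = torch.cat([block.start.expand(b, 1, d), x[:, :-1]], dim=1)
        h = block.encoder(ctx, mask=mask)
        mu, alpha = block.to_mu_alpha(h).chunk(2, dim=-1)
        x = mu + torch.exp(alpha) * z  # invert the affine update
    return x
```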
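Several entries above (LeDiFlow, MeanFlow, the CFM and Flow Matching papers) build on the flow matching objective. As a self-contained toy reference, the sketch below shows its simplest linear-path variant, where the velocity network regresses the constant target x1 - x0 along a straight path from noise to data; the papers above generalize this basic loss in different directions.

```python
import torch

def cfm_loss(v_theta, x1):
    # Toy linear-path flow matching loss (illustrative, not any listed
    # paper's exact objective): sample t ~ U[0, 1], form x_t on the straight
    # path between noise x0 and data x1, and regress the velocity network
    # v_theta(x_t, t) onto the path's constant velocity x1 - x0.
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)), device=x1.device)
    xt = (1 - t) * x0 + t * x1
    return ((v_theta(xt, t) - (x1 - x0)) ** 2).mean()
```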
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.