PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image
Generation
- URL: http://arxiv.org/abs/2204.00833v1
- Date: Sat, 2 Apr 2022 10:55:11 GMT
- Title: PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image
Generation
- Authors: Jing He, Yiyi Zhou, Qi Zhang, Yunhang Shen, Xiaoshuai Sun, Chao Chen,
Rongrong Ji
- Abstract summary: Pixel synthesis is a promising research paradigm for image generation, which can well exploit pixel-wise prior knowledge for generation.
In this paper, we propose a progressive pixel synthesis network towards efficient image generation, coined as PixelFolder.
With much less expenditure, PixelFolder obtains new state-of-the-art (SOTA) performance on two benchmark datasets.
- Score: 88.55256389703082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pixel synthesis is a promising research paradigm for image generation, which
can well exploit pixel-wise prior knowledge for generation. However, existing
methods still suffer from excessive memory footprint and computation overhead.
In this paper, we propose a progressive pixel synthesis network towards
efficient image generation, coined as PixelFolder. Specifically, PixelFolder
formulates image generation as a progressive pixel regression problem and
synthesizes images by a multi-stage paradigm, which can greatly reduce the
overhead caused by large tensor transformations. In addition, we introduce
novel pixel folding operations to further improve model efficiency while
maintaining pixel-wise prior knowledge for end-to-end regression. With these
innovative designs, we greatly reduce the expenditure of pixel synthesis, e.g.,
reducing 90% computation and 57% parameters compared to the latest pixel
synthesis method called CIPS. To validate our approach, we conduct extensive
experiments on two benchmark datasets, namely FFHQ and LSUN Church. The
experimental results show that with much less expenditure, PixelFolder obtains
new state-of-the-art (SOTA) performance on two benchmark datasets, i.e., 3.77
FID and 2.45 FID on FFHQ and LSUN Church, respectively. Meanwhile, PixelFolder
is also more efficient than the SOTA methods like StyleGAN2, reducing about 74%
computation and 36% parameters, respectively. These results greatly validate
the effectiveness of the proposed PixelFolder.
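The abstract does not spell out how a pixel folding operation works. As a rough illustration only, the sketch below assumes folding is a space-to-depth style rearrangement: each k x k block of pixels is packed into the channel axis, shrinking the spatial grid while keeping every pixel's values recoverable. The function names `fold` and `unfold`, the block size `k`, and the channel ordering are assumptions made for this example, not details taken from the paper.

```python
def fold(pixels, k):
    """Pack each k x k spatial block into the channel axis (space-to-depth).

    Assumed interpretation of "pixel folding"; not the paper's implementation.
    `pixels` is a nested list of shape H x W x C.
    """
    h, w = len(pixels), len(pixels[0])
    if h % k or w % k:
        raise ValueError("height and width must be divisible by k")
    folded = []
    for i in range(0, h, k):
        row = []
        for j in range(0, w, k):
            # Concatenate the channel vectors of the k*k pixels in this block.
            vec = []
            for di in range(k):
                for dj in range(k):
                    vec.extend(pixels[i + di][j + dj])
            row.append(vec)
        folded.append(row)
    return folded


def unfold(folded, k):
    """Inverse of fold: spread channel groups back over a k x k spatial block."""
    h, w = len(folded), len(folded[0])
    c = len(folded[0][0]) // (k * k)
    pixels = [[None] * (w * k) for _ in range(h * k)]
    for i in range(h):
        for j in range(w):
            vec = folded[i][j]
            for di in range(k):
                for dj in range(k):
                    start = (di * k + dj) * c
                    pixels[i * k + di][j * k + dj] = vec[start:start + c]
    return pixels


# A 4 x 4 grid of 2-channel "pixels", each holding its own (row, column).
image = [[[y, x] for x in range(4)] for y in range(4)]
folded = fold(image, 2)       # 2 x 2 grid, 8 channels per position
restored = unfold(folded, 2)  # back to the original 4 x 4 x 2 layout
```

Under this assumption, later layers operate on a grid with k*k times fewer positions, which is one plausible way the tensor-transformation overhead mentioned in the abstract could be reduced without discarding per-pixel information.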
Related papers
- Accelerating Image Super-Resolution Networks with Pixel-Level Classification [29.010136088811137]
Pixel-level Classifier for Single Image Super-Resolution (PCSR) is a novel method designed to distribute computational resources adaptively at the pixel level.
Our method allows for performance and computational cost balance during inference without re-training.
arXiv Detail & Related papers (2024-07-31T08:53:10Z)
- An Image is Worth 32 Tokens for Reconstruction and Generation [54.24414696392026]
Transformer-based 1-Dimensional Tokenizer (TiTok) is an innovative approach that tokenizes images into 1D latent sequences.
TiTok achieves competitive performance to state-of-the-art approaches.
Our best-performing variant significantly surpasses DiT-XL/2 (gFID 2.13 vs. 3.04) while still generating high-quality samples 74x faster.
arXiv Detail & Related papers (2024-06-11T17:59:56Z)
- Transformer based Pluralistic Image Completion with Reduced Information Loss [72.92754600354199]
Transformer based methods have achieved great success in image inpainting recently.
They regard each pixel as a token, thus suffering from an information loss issue.
We propose a new transformer based framework called "PUT".
arXiv Detail & Related papers (2024-03-31T01:20:16Z)
- Pixel Adapter: A Graph-Based Post-Processing Approach for Scene Text Image Super-Resolution [22.60056946339325]
We propose the Pixel Adapter Module (PAM) based on graph attention to address pixel distortion caused by upsampling.
The PAM effectively captures local structural information by allowing each pixel to interact with its neighbors and update features.
We demonstrate that our proposed method generates high-quality super-resolution images, surpassing existing methods in recognition accuracy.
arXiv Detail & Related papers (2023-09-16T08:12:12Z)
- CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying [52.91778151771145]
In this paper, we try to break the limitations for the first time thanks to the recent development of continuous implicit representation.
Experiments show that the proposed method achieves real-time performance on the 2048$\times$2048 images using a single GTX 2080 Ti GPU.
arXiv Detail & Related papers (2023-03-15T11:13:51Z)
- Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success on image super-resolution (SR).
Most deep CNN-based SR models take massive computations to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
arXiv Detail & Related papers (2022-03-16T20:10:41Z)
- Parallel Discrete Convolutions on Adaptive Particle Representations of Images [2.362412515574206]
We present data structures and algorithms for native implementations of discrete convolution operators over Adaptive Particle Representations.
The APR is a content-adaptive image representation that locally adapts the sampling resolution to the image signal.
We show that APR convolution naturally leads to scale-adaptive algorithms that efficiently parallelize on multi-core CPU and GPU architectures.
arXiv Detail & Related papers (2021-12-07T09:40:05Z)
- SIN: Superpixel Interpolation Network [9.046310874823002]
Traditional algorithms and deep learning-based algorithms are two main streams in superpixel segmentation.
In this paper, we propose a deep learning-based superpixel segmentation algorithm SIN which can be integrated with downstream tasks in an end-to-end way.
arXiv Detail & Related papers (2021-10-17T02:21:11Z)
- Generating Superpixels for High-resolution Images with Decoupled Patch Calibration [82.21559299694555]
Patch Calibration Networks (PCNet) is designed to efficiently and accurately implement high-resolution superpixel segmentation.
In particular, Decoupled Patch Calibration (DPC) takes a local patch from the high-resolution images and dynamically generates a binary mask that forces the network to focus on region boundaries.
arXiv Detail & Related papers (2021-08-19T10:33:05Z)
- Implicit Integration of Superpixel Segmentation into Fully Convolutional Networks [11.696069523681178]
We propose a way to implicitly integrate a superpixel scheme into CNNs.
Our proposed method hierarchically groups pixels at downsampling layers and generates superpixels.
We evaluate our method on several tasks such as semantic segmentation, superpixel segmentation, and monocular depth estimation.
arXiv Detail & Related papers (2021-03-05T02:20:26Z)