Hierarchical Residual Learning Based Vector Quantized Variational
Autoencoder for Image Reconstruction and Generation
- URL: http://arxiv.org/abs/2208.04554v1
- Date: Tue, 9 Aug 2022 06:04:25 GMT
- Title: Hierarchical Residual Learning Based Vector Quantized Variational
Autoencoder for Image Reconstruction and Generation
- Authors: Mohammad Adiban and Kalin Stefanov and Sabato Marco Siniscalchi and
Giampiero Salvi
- Abstract summary: We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data.
We evaluate our method on the tasks of image reconstruction and generation.
- Score: 19.92324010429006
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a multi-layer variational autoencoder method, which we call HR-VQVAE,
that learns hierarchical discrete representations of the data. By utilizing a
novel objective function, each layer in HR-VQVAE learns a discrete
representation of the residual from previous layers through a vector quantized
encoder. Furthermore, the representations at each layer are hierarchically
linked to those at previous layers. We evaluate our method on the tasks of
image reconstruction and generation. Experimental results demonstrate that the
discrete representations learned by HR-VQVAE enable the decoder to reconstruct
high-quality images with less distortion than the baseline methods, namely
VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images
that outperform state-of-the-art generative models, providing further
verification of the efficiency of the learned representations. The hierarchical
nature of HR-VQVAE i) reduces the decoding search time, making the method
particularly suitable for high-load tasks, and ii) allows increasing the
codebook size without incurring the codebook collapse problem.
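A minimal sketch may make the residual scheme described above concrete: each layer quantizes the residual left by the previous layers, and the codebook searched at a given layer is linked to the code chosen at the layer above, so every lookup scans only a small sub-codebook even though the effective number of codes grows with depth. The shapes, codebook sizes, and nearest-neighbour search below are illustrative assumptions in NumPy, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 64   # latent dimensionality (assumed)
K = 8    # codewords per (sub-)codebook (assumed)
L = 3    # number of hierarchical layers (assumed)

# Layer 0 holds a single codebook of K codewords; every codeword at layer l
# is linked to its own sub-codebook of K codewords at layer l+1.
codebooks = [rng.normal(size=(K**l, K, D)) for l in range(L)]

def hr_quantize(z):
    """Quantize z layer by layer; each layer encodes the residual left so far."""
    reconstruction = np.zeros_like(z)
    indices = []
    parent = 0                                  # which sub-codebook to search next
    for l in range(L):
        sub_codebook = codebooks[l][parent]     # only K candidates at this layer
        residual = z - reconstruction           # what the previous layers missed
        k = int(np.argmin(np.linalg.norm(sub_codebook - residual, axis=1)))
        reconstruction += sub_codebook[k]
        indices.append(k)
        parent = parent * K + k                 # descend into the chosen branch
    return indices, reconstruction

z = rng.normal(size=D)                          # stand-in for an encoder output
idx, z_hat = hr_quantize(z)
print(idx, np.linalg.norm(z - z_hat))           # error shrinks as layers are added
```

Because each layer searches only the sub-codebook linked to the previously selected code, the lookup cost stays proportional to K times the number of layers while the effective codebook size grows like K to the power of the depth, which is consistent with the reduced decoding search time and enlarged codebooks claimed in the abstract.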
Related papers
- HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes [18.57499609338579]
We propose a novel framework to learn hierarchical discrete representations on the basis of the variational Bayes framework, called the hierarchically quantized variational autoencoder (HQ-VAE).
HQ-VAE naturally generalizes the hierarchical variants of VQ-VAE, such as VQ-VAE-2 and the residual-quantized VAE (RQ-VAE).
Our comprehensive experiments on image datasets show that HQ-VAE enhances codebook usage and improves reconstruction performance.
arXiv Detail & Related papers (2023-12-31T01:39:38Z) - Not All Image Regions Matter: Masked Vector Quantization for
Autoregressive Image Generation [78.13793505707952]
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
We propose a novel two-stage framework, consisting of a Masked Quantization VAE (MQ-VAE) and a Stackformer, that relieves the model from modeling redundancy.
arXiv Detail & Related papers (2023-05-23T02:15:53Z) - Vector Quantized Wasserstein Auto-Encoder [57.29764749855623]
We study learning deep discrete representations from the generative viewpoint.
We place discrete distributions over sequences of codewords and learn a deterministic decoder that transports the distribution over codeword sequences to the data distribution.
We develop further theories to connect it with the clustering viewpoint of WS distance, allowing us to have a better and more controllable clustering solution.
arXiv Detail & Related papers (2023-02-12T13:51:36Z) - Optimizing Hierarchical Image VAEs for Sample Quality [0.0]
Hierarchical variational autoencoders (VAEs) have achieved strong density estimation on image modeling tasks, yet samples from their priors tend to look less convincing than their likelihoods suggest.
We attribute this to learned representations that over-emphasize compressing imperceptible details of the image.
We introduce a KL-reweighting strategy to control the amount of information in each latent group, and employ a Gaussian output layer to reduce sharpness in the learning objective.
arXiv Detail & Related papers (2022-10-18T23:10:58Z) - Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image
Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose a rank-enhanced low-dimensional convolution set (Re-ConvSet).
We then incorporate Re-ConvSet into the widely-used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise, high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - Autoregressive Image Generation using Residual Quantization [40.04085054791994]
We propose a two-stage framework to generate high-resolution images.
The framework consists of Residual-Quantized VAE (RQ-VAE) and RQ-Transformer.
Our approach samples significantly faster than previous AR models while generating high-quality images.
arXiv Detail & Related papers (2022-03-03T11:44:46Z) - Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed as Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z) - Hierarchical Quantized Autoencoders [3.9146761527401432]
We motivate the use of a hierarchy of Vector Quantized Variational Autoencoders (VQ-VAEs) to attain high factors of compression.
We show that a combination of quantization and hierarchical latent structure aids likelihood-based image compression.
Our resulting scheme produces a Markovian series of latent variables that reconstruct images of high-perceptual quality.
arXiv Detail & Related papers (2020-02-19T11:26:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.