Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs
- URL: http://arxiv.org/abs/2403.18535v1
- Date: Wed, 27 Mar 2024 13:11:34 GMT
- Title: Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs
- Authors: Yichi Zhang, Zhihao Duan, Yuning Huang, Fengqing Zhu
- Abstract summary: Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory.
VAEs can be used to estimate the theoretical upper bound of the information rate-distortion function of images, and these estimated bounds substantially exceed the performance of existing neural image codecs.
To narrow this gap, we propose a theoretical bound-guided hierarchical VAE (BG-VAE) for neural image codecs.
- Score: 11.729071258457138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory, notably in utilizing VAEs to estimate the theoretical upper bound of the information rate-distortion function of images. Such estimated theoretical bounds substantially exceed the performance of existing neural image codecs (NICs). To narrow this gap, we propose a theoretical bound-guided hierarchical VAE (BG-VAE) for NIC. The proposed BG-VAE leverages the theoretical bound to guide the NIC model towards enhanced performance. We implement the BG-VAE using Hierarchical VAEs and demonstrate its effectiveness through extensive experiments. Along with advanced neural network blocks, we provide a versatile, variable-rate NIC that outperforms existing methods when considering both rate-distortion performance and computational complexity. The code is available at BG-VAE.
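As a rough illustration of the bound-guided idea, the sketch below treats the bound-estimating hierarchical VAE as a frozen teacher whose latents guide a student NIC through an auxiliary loss term. The module interfaces, the guidance signal, and the weights `lam`/`beta` are assumptions made for illustration; the paper's exact formulation may differ.

```python
# Minimal sketch of bound-guided training (assumed distillation-style setup;
# model interfaces, the guidance signal, and loss weights are hypothetical).
import torch
import torch.nn.functional as F

def bg_vae_step(student, teacher, x, lam=0.01, beta=1.0):
    """One training step: rate-distortion loss plus a guidance term that
    pulls the student's latents toward the bound-estimating teacher's."""
    with torch.no_grad():                       # teacher (theoretical-bound VAE) is frozen
        t_latent, _ = teacher(x)
    s_latent, likelihoods, x_hat = student(x)   # assumed student outputs

    distortion = F.mse_loss(x_hat, x)
    rate = -torch.log2(likelihoods).sum() / x.numel()   # approx. bits per pixel
    guidance = F.mse_loss(s_latent, t_latent)            # assumed guidance signal

    loss = rate + lam * distortion + beta * guidance
    loss.backward()
    return loss.item()
```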
Related papers
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them.
This paper addresses the gap in its theoretical understanding by conceptualizing the decoding problem via a Markov chain abstraction and studying its key properties, output quality and inference acceleration, from a theoretical perspective.
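A minimal sketch of the draft-and-verify loop described above, using the standard rejection-sampling acceptance rule; the model interfaces are hypothetical and the bonus token normally sampled after full acceptance is omitted for brevity.

```python
# Draft-and-verify speculative decoding (illustrative sketch only).
import numpy as np

def speculative_step(draft_model, target_model, prefix, k=4):
    """Draft k tokens with the small model, then accept/reject them with the
    large model via the standard rejection-sampling criterion."""
    rng = np.random.default_rng(0)
    drafts, q_probs, ctx = [], [], list(prefix)
    for _ in range(k):
        q = draft_model(ctx)                     # small-model distribution over the vocab
        tok = rng.choice(len(q), p=q)
        drafts.append(tok); q_probs.append(q[tok]); ctx.append(tok)

    accepted = []
    for i, tok in enumerate(drafts):
        p = target_model(list(prefix) + accepted)          # large-model distribution
        if rng.random() < min(1.0, p[tok] / q_probs[i]):
            accepted.append(tok)                           # draft token accepted
        else:
            q = draft_model(list(prefix) + accepted)
            residual = np.maximum(p - q, 0.0)              # resample from the residual
            residual /= residual.sum()
            accepted.append(rng.choice(len(residual), p=residual))
            break
    return accepted
```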
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- On the Convergence Analysis of Over-Parameterized Variational Autoencoders: A Neural Tangent Kernel Perspective [7.580900499231056]
Variational Auto-Encoders (VAEs) have emerged as powerful probabilistic models for generative tasks.
This paper provides a mathematical proof of VAE convergence under mild assumptions.
We also establish a novel connection between the optimization problem faced by over-parameterized SNNs and the Kernel Ridge Regression (KRR) problem.
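For reference, the KRR problem mentioned above has a familiar closed-form solution, sketched below; the RBF kernel and the random data are placeholders rather than the paper's setting.

```python
# Kernel ridge regression in closed form (illustration of the KRR problem).
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(X, y, X_test, lam=1e-3):
    K = rbf_kernel(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # (K + lam*I)^(-1) y
    return rbf_kernel(X_test, X) @ alpha

X = np.random.randn(50, 3)
y = np.sin(X).sum(1)
print(krr_fit_predict(X, y, X[:5]))
```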
arXiv Detail & Related papers (2024-09-09T06:10:31Z)
- A Sampling Theory Perspective on Activations for Implicit Neural Representations [73.6637608397055]
Implicit Neural Representations (INRs) have gained popularity for encoding signals as compact, differentiable entities.
We conduct a comprehensive analysis of these activations from a sampling theory perspective.
Our investigation reveals that sinc activations, previously unused in conjunction with INRs, are theoretically optimal for signal encoding.
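A minimal sketch of an INR built around a sinc activation, as discussed above; the layer sizes and the frequency scale `w0` are illustrative choices, not the paper's configuration.

```python
# Sinc-activated implicit neural representation: maps coordinates to signal values.
import torch
import torch.nn as nn

class Sinc(nn.Module):
    def __init__(self, w0=30.0):
        super().__init__()
        self.w0 = w0
    def forward(self, x):
        # torch.sinc(u) = sin(pi*u) / (pi*u), so this gives sin(w0*x) / (w0*x)
        return torch.sinc(self.w0 * x / torch.pi)

class SincINR(nn.Module):
    def __init__(self, in_dim=2, hidden=256, out_dim=3, depth=4):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, hidden), Sinc()]
            d = hidden
        layers.append(nn.Linear(d, out_dim))
        self.net = nn.Sequential(*layers)
    def forward(self, coords):
        return self.net(coords)

model = SincINR()
coords = torch.rand(1024, 2) * 2 - 1   # 2-D coordinates in [-1, 1]^2
rgb = model(coords)                    # predicted pixel values
```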
arXiv Detail & Related papers (2024-02-08T05:52:45Z)
- On the Disconnect Between Theory and Practice of Neural Networks: Limits of the NTK Perspective [9.753461673117362]
The neural tangent kernel (NTK) has garnered significant attention as a theoretical framework for describing the behavior of large-scale neural networks.
Current results quantifying the rate of convergence to the kernel regime suggest that exploiting these benefits requires architectures that are orders of magnitude wider than they are deep.
This paper investigates whether the limiting regime predicts practically relevant behavior of large-width architectures.
arXiv Detail & Related papers (2023-09-29T20:51:24Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization gives neural image compression (NIC) its superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
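As a hedged illustration of R-D characteristic modeling, the sketch below fits a simple parametric form D(R) = a*exp(-b*R) + c to a few example (rate, distortion) points; both the functional form and the data are placeholders, whereas the paper builds its R-D model with deep networks and statistical modeling.

```python
# Illustrative parametric fit of an R-D characteristic (placeholder form/data).
import numpy as np
from scipy.optimize import curve_fit

def rd_curve(rate, a, b, c):
    return a * np.exp(-b * rate) + c

rates = np.array([0.1, 0.25, 0.5, 1.0, 2.0])            # bits per pixel (example)
dists = np.array([0.020, 0.011, 0.006, 0.003, 0.0015])  # MSE (example)

params, _ = curve_fit(rd_curve, rates, dists, p0=(0.02, 2.0, 0.0))
print("fitted (a, b, c):", params)
print("predicted distortion at 0.75 bpp:", rd_curve(0.75, *params))
```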
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- Training Sparse Neural Network by Constraining Synaptic Weight on Unit Lp Sphere [2.429910016019183]
Constraining the synaptic weights to the unit Lp-sphere enables flexible control of the sparsity via p.
Our approach is validated by experiments on benchmark datasets covering a wide range of domains.
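A minimal sketch of the constraint, assuming the simplest possible realization: renormalizing the weights onto the unit Lp-sphere after every gradient step. The toy training loop and hyperparameters are illustrative only, not the paper's procedure.

```python
# Keep a weight vector on the unit Lp-sphere by renormalizing after each step.
import numpy as np

def renormalize_lp(w, p=1.0, eps=1e-12):
    """Rescale w so that ||w||_p = 1."""
    norm = np.sum(np.abs(w) ** p) ** (1.0 / p)
    return w / (norm + eps)

rng = np.random.default_rng(0)
w = renormalize_lp(rng.standard_normal(8), p=1.0)
for _ in range(100):
    grad = rng.standard_normal(8) * 0.1        # stand-in for a task gradient
    w = renormalize_lp(w - 0.05 * grad, p=1.0)
print("L1 norm:", np.abs(w).sum())             # stays ~1; small p encourages sparsity
```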
arXiv Detail & Related papers (2021-03-30T01:02:31Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, in time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
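The generic reconstruction objective used to train TP's feedback (inverse) mappings can be sketched as follows; the layer shapes are arbitrary, and the plain form shown here is only the baseline objective, not the paper's improved loss.

```python
# Training a feedback mapping g to approximately invert a forward layer f,
# via a reconstruction loss g(f(h)) ~= h (generic TP feedback training).
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(64, 32), nn.Tanh())   # forward mapping of one layer
g = nn.Sequential(nn.Linear(32, 64))               # feedback (approximate inverse)
opt = torch.optim.SGD(g.parameters(), lr=1e-2)

for _ in range(200):
    h = torch.randn(128, 64)                       # activations entering the layer
    with torch.no_grad():
        out = f(h)                                  # forward pass, no grad to f
    loss = ((g(out) - h) ** 2).mean()               # reconstruction loss on feedback
    opt.zero_grad(); loss.backward(); opt.step()
```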
arXiv Detail & Related papers (2020-06-25T12:07:06Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.