Balancing reconstruction error and Kullback-Leibler divergence in
Variational Autoencoders
- URL: http://arxiv.org/abs/2002.07514v1
- Date: Tue, 18 Feb 2020 12:22:31 GMT
- Title: Balancing reconstruction error and Kullback-Leibler divergence in
Variational Autoencoders
- Authors: Andrea Asperti, Matteo Trentin
- Abstract summary: We show that learning can be replaced by a simple deterministic computation, helping to understand the underlying mechanism.
On typical datasets such as CIFAR and CelebA, our technique considerably outperforms all previous VAE architectures.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the loss function of Variational Autoencoders there is a well known
tension between two components: the reconstruction loss, improving the quality
of the resulting images, and the Kullback-Leibler divergence, acting as a
regularizer of the latent space. Correctly balancing these two components is a
delicate issue, easily resulting in poor generative behaviours. In a recent
work, Dai and Wipf obtained a sensible improvement by allowing the network to
learn the balancing factor during training, according to a suitable loss
function. In this article, we show that learning can be replaced by a simple
deterministic computation, helping to understand the underlying mechanism, and
resulting in faster and more accurate behaviour. On typical datasets such as
CIFAR and CelebA, our technique considerably outperforms all previous VAE
architectures.
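The abstract does not spell out the deterministic computation; purely as an illustrative sketch of the idea, the hypothetical snippet below replaces a learned balancing factor with a simple running estimate of the reconstruction error. The update rule, initialization, and all names are assumptions, not the authors' exact formula.

```python
# Illustrative sketch only, not the authors' released code: the balancing
# factor gamma is obtained by a simple deterministic rule (here, a running
# average of the current mean squared reconstruction error) instead of
# being learned during training as in Dai and Wipf.
import torch

class BalancedVAELoss:
    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.gamma = 1.0  # assumed initialization of the balancing factor

    def __call__(self, x, x_hat, mu, logvar):
        # Mean squared reconstruction error over the batch.
        mse = torch.mean((x - x_hat) ** 2)
        # Deterministic update: gamma tracks the recent reconstruction error.
        self.gamma = self.momentum * self.gamma + (1.0 - self.momentum) * mse.item()
        # Standard Gaussian KL term of the VAE.
        kl = -0.5 * torch.mean(1 + logvar - mu ** 2 - logvar.exp())
        # Dividing the reconstruction term by gamma balances it against the KL term.
        return mse / (2.0 * self.gamma) + kl
```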
Related papers
- Sobolev neural network with residual weighting as a surrogate in linear and non-linear mechanics [0.0]
This paper investigates the improvement of the training process by including sensitivity information.
In computational mechanics, such sensitivities can be incorporated into neural network training by expanding the loss function.
This improvement is shown in two examples of linear and non-linear material behavior.
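A minimal sketch of the general Sobolev-training idea (not this paper's specific formulation): the loss below penalizes both the value mismatch and the mismatch of input sensitivities obtained via automatic differentiation. The weight `lam` and all names are illustrative.

```python
# A minimal sketch of Sobolev-style training, assuming a network y = f(x)
# with known target sensitivities dy_target: the loss penalizes both the
# value mismatch and the mismatch of input derivatives obtained via autograd.
import torch

def sobolev_loss(model, x, y_target, dy_target, lam=1.0):
    x = x.clone().requires_grad_(True)
    y = model(x)
    # Derivatives of the (summed) outputs with respect to the inputs.
    dy = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    value_term = torch.mean((y - y_target) ** 2)
    sensitivity_term = torch.mean((dy - dy_target) ** 2)
    # lam weights the sensitivity information against the value loss.
    return value_term + lam * sensitivity_term
```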
arXiv Detail & Related papers (2024-07-23T13:28:07Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [62.888755394395716]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
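Of the four loss components listed, only the Charbonnier loss has a single standard form, sketched below; the full ensemble would then be a weighted sum of this term with the perceptual, style, and adversarial losses.

```python
# The Charbonnier loss in its standard form; eps is the usual small
# smoothing constant.
import torch

def charbonnier_loss(pred, target, eps=1e-6):
    # A smooth, differentiable variant of the L1 loss: sqrt(diff^2 + eps^2).
    return torch.mean(torch.sqrt((pred - target) ** 2 + eps ** 2))
```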
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Decoupled Kullback-Leibler Divergence Loss [75.31157286595517]
The Kullback-Leibler (KL) divergence loss is shown to be equivalent to the Decoupled Kullback-Leibler (DKL) divergence loss.
We introduce global information into DKL for intra-class consistency regularization.
The proposed approach achieves new state-of-the-art performance on both tasks, demonstrating its substantial practical merit.
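The paper's DKL decomposition is not reproduced here; for reference, the sketch below shows only the standard KL divergence loss between two sets of logits that DKL is stated to be equivalent to. The temperature `T` is a common convention from knowledge distillation, not necessarily the paper's setup.

```python
# Standard KL divergence loss between softened logits (not the paper's DKL
# decomposition); T is an illustrative distillation temperature.
import torch.nn.functional as F

def kl_divergence_loss(student_logits, teacher_logits, T=1.0):
    log_p = F.log_softmax(student_logits / T, dim=-1)  # student distribution
    q = F.softmax(teacher_logits / T, dim=-1)          # teacher distribution
    # KL(q || p), averaged over the batch; T*T restores the gradient scale.
    return F.kl_div(log_p, q, reduction="batchmean") * T * T
```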
arXiv Detail & Related papers (2023-05-23T11:17:45Z)
- Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [7.5620539044013535]
This paper shows that the training speed and final accuracy of neural networks can depend significantly on the loss function used to train them.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z)
- CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning [7.4010632660248765]
A novel architecture is proposed that uses a heterogeneous loss function, called CRC loss, to learn improved visual features.
The proposed architecture, called CRC-RL, is shown to outperform existing state-of-the-art methods on the challenging DeepMind Control Suite environments.
arXiv Detail & Related papers (2023-01-31T08:41:18Z)
- Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning [47.904127007515925]
We study a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction.
We prove that compressed TD algorithms, coupled with an error-feedback mechanism widely used in optimization, exhibit the same non-asymptotic approximation guarantees as their uncompressed counterparts.
Notably, these are the first finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling.
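As a rough sketch of this setting (linear function approximation, a compression operator, and error feedback), the snippet below implements one TD(0) step with a hypothetical top-k compressor; the operator choice, step size, and names are assumptions, not the paper's exact algorithm.

```python
# One TD(0) step with linear value approximation, top-k update compression,
# and error feedback: the residual of each compression is carried forward
# and re-injected into the next update.
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]  # keep the k largest-magnitude entries
    out[idx] = v[idx]
    return out

def compressed_td0_step(theta, memory, phi_s, phi_next, reward,
                        gamma=0.99, alpha=0.05, k=2):
    # Semi-gradient TD(0) direction under linear value approximation.
    td_error = reward + gamma * phi_next @ theta - phi_s @ theta
    g = alpha * td_error * phi_s
    corrected = g + memory            # error feedback: add the past residual
    update = top_k(corrected, k)      # compress the corrected update
    memory = corrected - update       # store the new compression residual
    return theta + update, memory
```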
arXiv Detail & Related papers (2023-01-03T04:09:38Z)
- Magic ELF: Image Deraining Meets Association Learning and Transformer [63.761812092934576]
This paper aims to unify CNNs and Transformers so that image deraining benefits from the learning strengths of both.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average.
arXiv Detail & Related papers (2022-07-21T12:50:54Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
- Characterizing the loss landscape of variational quantum circuits [77.34726150561087]
We introduce a way to compute the Hessian of the loss function of VQCs.
We show how this information can be interpreted and compared to classical neural networks.
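The VQC-specific machinery is in the paper; purely as a generic illustration, automatic differentiation can produce the Hessian of a scalar loss with respect to a parameter vector, as in the hypothetical sketch below.

```python
# Generic illustration only: the loss below is a stand-in, not an actual
# variational quantum circuit.
import torch
from torch.autograd.functional import hessian

def loss_fn(params):
    return torch.sin(params).sum() ** 2  # placeholder for a VQC loss

params = torch.randn(4)
H = hessian(loss_fn, params)  # 4x4 matrix of second derivatives
```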
arXiv Detail & Related papers (2020-08-06T17:48:12Z)
- Joint learning of variational representations and solvers for inverse problems with partially-observed data [13.984814587222811]
In this paper, we design an end-to-end framework for learning actual variational formulations for inverse problems in a supervised setting.
Both the variational cost and the gradient-based solver are stated as neural networks, with the latter relying on automatic differentiation.
This leads to a data-driven discovery of variational models.
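A loose sketch of this idea, under assumed names and a simple masked-observation fidelity term: the solver is an unrolled loop of gradient steps on a learned variational cost, with the gradient taken by automatic differentiation so the whole loop stays trainable.

```python
# The learned variational cost is data fidelity on observed entries plus a
# neural regularizer; the solver unrolls gradient steps on that cost and is
# differentiable end to end, so reg_net can be trained in a supervised way.
import torch

def unrolled_solver(reg_net, x0, y_obs, mask, n_steps=10, lr=0.1):
    x = x0.clone().requires_grad_(True)  # current state estimate
    for _ in range(n_steps):
        # Variational cost: fidelity to partial observations + learned prior.
        cost = torch.sum(mask * (x - y_obs) ** 2) + reg_net(x).sum()
        # create_graph keeps the loop differentiable w.r.t. reg_net's weights.
        grad = torch.autograd.grad(cost, x, create_graph=True)[0]
        x = x - lr * grad  # one step of the gradient-based solver
    return x
```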
arXiv Detail & Related papers (2020-06-05T19:53:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.