Dynamic Narrowing of VAE Bottlenecks Using GECO and L0 Regularization
- URL: http://arxiv.org/abs/2003.10901v3
- Date: Tue, 13 Apr 2021 09:48:24 GMT
- Title: Dynamic Narrowing of VAE Bottlenecks Using GECO and L0 Regularization
- Authors: Cedric De Boom, Samuel Wauthier, Tim Verbelen, Bart Dhoedt
- Abstract summary: We have developed a technique to shrink the latent space dimensionality of VAEs automatically and on-the-fly during training.
This paper presents the algorithmic details of our method along with experimental results on five different datasets.
- Score: 5.57310999362848
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When designing variational autoencoders (VAEs) or other types of latent space
models, the dimensionality of the latent space is typically defined upfront. In
this process, it is possible that the number of dimensions is under- or
overprovisioned for the application at hand. In case the dimensionality is not
predefined, this parameter is usually determined using time- and
resource-consuming cross-validation. For these reasons we have developed a
technique to shrink the latent space dimensionality of VAEs automatically and
on-the-fly during training using Generalized ELBO with Constrained Optimization
(GECO) and the $L_0$-Augment-REINFORCE-Merge ($L_0$-ARM) gradient estimator.
The GECO optimizer ensures that we are not violating a predefined upper bound
on the reconstruction error. This paper presents the algorithmic details of our
method along with experimental results on five different datasets. We find that
our training procedure is stable and that the latent space can be pruned
effectively without violating the GECO constraints.
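The abstract names two mechanisms: a GECO-style Lagrange multiplier that keeps the reconstruction error below a predefined bound, and per-dimension L0 gates that let the latent space be pruned during training. Below is a minimal PyTorch sketch of how these pieces can fit together. The names (`GatedVAE`, `kappa`, `geco_lr`, `l0_weight`), the multiplicative multiplier update, and the straight-through gate (used here in place of the full L0-ARM gradient estimator) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedVAE(nn.Module):
    """VAE whose latent dimensions can be switched off by Bernoulli gates."""

    def __init__(self, x_dim=784, z_dim=32, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))
        # One gate logit per latent dimension; sigmoid(phi_j) is the
        # probability that dimension j stays active.
        self.phi = nn.Parameter(torch.zeros(z_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        probs = torch.sigmoid(self.phi)
        hard = torch.bernoulli(probs.detach())
        # Straight-through surrogate stands in for the L0-ARM estimator,
        # purely to keep the sketch short; ARM would use antithetic samples.
        gate = hard + probs - probs.detach()
        return self.dec(z * gate), mu, logvar


def training_step(model, x, lamb, kappa, geco_lr=1e-2, l0_weight=1e-3):
    """One GECO-constrained step; `kappa` is the reconstruction bound."""
    x_hat, mu, logvar = model(x)
    rec = F.mse_loss(x_hat, x, reduction="none").sum(-1).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    # Expected number of active dimensions serves as the L0 penalty.
    l0 = torch.sigmoid(model.phi).sum()
    constraint = rec - kappa  # feasible when <= 0
    loss = kl + l0_weight * l0 + lamb * constraint
    # GECO multiplier update: grows while the reconstruction bound is
    # violated, shrinks once it is satisfied, freeing pressure to prune.
    new_lamb = (lamb * torch.exp(geco_lr * constraint.detach())).clamp(1e-6, 1e6)
    return loss, new_lamb


# Example usage on random data (illustrative values only).
model = GatedVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lamb = torch.tensor(1.0)
x = torch.rand(64, 784)
loss, lamb = training_step(model, x, lamb, kappa=30.0)
opt.zero_grad(); loss.backward(); opt.step()
```
In a setup like this, dimensions whose gate probability collapses toward zero can be dropped after training, while the multiplier keeps the reconstruction error near the chosen bound kappa.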
Related papers
- Improving the Generation of VAEs with High Dimensional Latent Spaces by the use of Hyperspherical Coordinates [59.4526726541389]
Variational autoencoders (VAE) encode data into lower-dimensional latent vectors before decoding those vectors back to data. We propose a new parameterization of the latent space with limited computational overhead.
arXiv Detail & Related papers (2025-07-21T05:10:43Z) - Proper Latent Decomposition [4.266376725904727]
We compute a reduced set of intrinsic coordinates (latent space) to accurately describe a flow with fewer degrees of freedom than the numerical discretization.
Within this numerical framework, we present an algorithm to perform PLD on the manifold.
This work opens opportunities for analyzing autoencoders and latent spaces, nonlinear reduced-order modeling and scientific insights into the structure of high-dimensional data.
arXiv Detail & Related papers (2024-12-01T12:19:08Z) - FLoRA: Low-Rank Core Space for N-dimension [78.39310274926535]
Adapting pre-trained foundation models for various downstream tasks has been prevalent in artificial intelligence.
To mitigate this, several fine-tuning techniques have been developed to update the pre-trained model weights in a more resource-efficient manner.
This paper introduces a generalized parameter-efficient fine-tuning framework, FLoRA, designed for various dimensional parameter space.
arXiv Detail & Related papers (2024-05-23T16:04:42Z) - Curvature-Informed SGD via General Purpose Lie-Group Preconditioners [6.760212042305871]
We present a novel approach to accelerate stochastic gradient descent (SGD) by utilizing curvature information.
Our approach involves two preconditioners: a matrix-free preconditioner and a low-rank approximation preconditioner.
We demonstrate that Preconditioned SGD (PSGD) outperforms state-of-the-art methods on vision, NLP, and RL tasks.
arXiv Detail & Related papers (2024-02-07T03:18:00Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while requiring no exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - History Matching for Geological Carbon Storage using Data-Space Inversion with Spatio-Temporal Data Parameterization [0.0]
In data-space inversion (DSI), history-matched quantities of interest are inferred directly, without constructing posterior geomodels.
This is accomplished efficiently using a set of O(1000) prior simulation results, data parameterization, and posterior sampling within a Bayesian setting.
The new parameterization uses an adversarial autoencoder (AAE) for dimension reduction and a convolutional long short-term memory (convLSTM) network to represent the spatial distribution and temporal evolution of the pressure and saturation fields.
arXiv Detail & Related papers (2023-10-05T00:50:06Z) - SIGMA: Scale-Invariant Global Sparse Shape Matching [50.385414715675076]
We propose a novel mixed-integer programming (MIP) formulation for generating precise sparse correspondences for non-rigid shapes.
We show state-of-the-art results for sparse non-rigid matching on several challenging 3D datasets.
arXiv Detail & Related papers (2023-08-16T14:25:30Z) - Random Smoothing Regularization in Kernel Gradient Descent Learning [24.383121157277007]
We present a framework for random smoothing regularization that can adaptively learn a wide range of ground truth functions belonging to the classical Sobolev spaces.
Our estimator can adapt to the structural assumptions of the underlying data and avoid the curse of dimensionality.
arXiv Detail & Related papers (2023-05-05T13:37:34Z) - LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks.
GANs enable diverse augmentation by learning and sampling from the data distribution.
GAN-based SI generation is challenging since the high-dimensional nature of this kind of data hinders the convergence of GAN training, leading to suboptimal generation.
We propose a statistical regularization to control the low-dimensional representation variance for the autoencoder training and to achieve high diversity of samples generated with the GAN.
arXiv Detail & Related papers (2023-04-29T00:25:02Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z) - RENs: Relevance Encoding Networks [0.0]
This paper proposes relevance encoding networks (RENs): a novel probabilistic VAE-based framework that uses the automatic relevance determination (ARD) prior in the latent space to learn the data-specific bottleneck dimensionality.
We show that the proposed model learns the relevant latent bottleneck dimensionality without compromising the representation and generation quality of the samples.
arXiv Detail & Related papers (2022-05-25T21:53:48Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications [5.660384137948734]
We show that the proposed algorithm can asymptotically converge to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z)