Understanding Variational Autoencoders with Intrinsic Dimension and Information Imbalance
- URL: http://arxiv.org/abs/2411.01978v1
- Date: Mon, 04 Nov 2024 10:58:41 GMT
- Title: Understanding Variational Autoencoders with Intrinsic Dimension and Information Imbalance
- Authors: Charles Camboulin, Diego Doimo, Aldo Glielmo
- Abstract summary: This work presents an analysis of the hidden representations of Variational Autoencoders (VAEs) using the Intrinsic Dimension (ID) and the Information Imbalance (II).
We show that VAEs undergo a transition in behaviour once the bottleneck size is larger than the ID of the data, manifesting in a double hunchback ID profile and a qualitative shift in information processing as captured by the II.
- Score: 2.7446241148152257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents an analysis of the hidden representations of Variational Autoencoders (VAEs) using the Intrinsic Dimension (ID) and the Information Imbalance (II). We show that VAEs undergo a transition in behaviour once the bottleneck size is larger than the ID of the data, manifesting in a double hunchback ID profile and a qualitative shift in information processing as captured by the II. Our results also highlight two distinct training phases for architectures with sufficiently large bottleneck sizes, consisting of a rapid fit and a slower generalisation, as assessed by a differentiated behaviour of ID, II, and KL loss. These insights demonstrate that II and ID could be valuable tools for aiding architecture search, for diagnosing underfitting in VAEs, and, more broadly, they contribute to advancing a unified understanding of deep generative models through geometric analysis.
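The two quantities named in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes the maximum-likelihood form of the TwoNN estimator for the ID and the nearest-neighbour rank definition of the II, uses brute-force Euclidean distances, and runs on synthetic data. The function names (`pairwise_distances`, `twonn_id`, `information_imbalance`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def pairwise_distances(x):
    """Brute-force Euclidean distance matrix; fine for a few thousand points."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def twonn_id(x):
    """Intrinsic dimension via the TwoNN ratio (maximum-likelihood variant).

    Assumption: no duplicate points, so the first-neighbour distance is > 0.
    """
    d = pairwise_distances(x)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighbours
    two_smallest = np.partition(d, 1, axis=1)[:, :2]
    r1, r2 = two_smallest[:, 0], two_smallest[:, 1]
    mu = r2 / r1  # ratio of second to first nearest-neighbour distance
    return len(x) / np.sum(np.log(mu))


def information_imbalance(a, b):
    """Information imbalance from space A to space B.

    For each point, take its nearest neighbour in A and look up that
    neighbour's distance rank in B (1 = nearest). The result is
    2/N times the mean rank: close to 0 when A predicts B, close to 1
    when A carries no information about B.
    """
    n = len(a)
    da, db = pairwise_distances(a), pairwise_distances(b)
    np.fill_diagonal(da, np.inf)
    np.fill_diagonal(db, np.inf)
    nn_a = np.argmin(da, axis=1)
    # Double argsort turns distances into ranks; the self-distance (inf) ranks last.
    ranks_b = np.argsort(np.argsort(db, axis=1), axis=1) + 1
    return 2.0 / n * np.mean(ranks_b[np.arange(n), nn_a])


# Synthetic stand-in for a hidden representation: a 2-D latent variable
# linearly embedded in 10 dimensions (illustrative data, not from the paper).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
x = z @ rng.normal(size=(2, 10))

print("estimated ID of x:", twonn_id(x))
print("II(z -> first coordinate):", information_imbalance(z, z[:, :1]))
print("II(first coordinate -> z):", information_imbalance(z[:, :1], z))
```

In this toy setting the ID estimate should land near 2 despite the 10-dimensional embedding, and the II should be asymmetric: the full latent space predicts one of its coordinates well, while a single coordinate predicts the full space poorly. Applying the same functions to activations collected layer by layer would give ID and II profiles of the kind the abstract describes, under the assumptions stated above.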
Related papers
- Measuring Intrinsic Dimension of Token Embeddings [0.13108652488669734]
We estimate the ID of token embeddings in small-scale language models and also modern large language models.
We observe an increase in redundancy rates as the model scale grows.
When LoRA is applied to the embedding layers, we observe a sudden drop in perplexity around the estimated IDs.
arXiv Detail & Related papers (2025-03-04T00:19:01Z) - Explainable AI for Multivariate Time Series Pattern Exploration: Latent Space Visual Analytics with Temporal Fusion Transformer and Variational Autoencoders in Power Grid Event Diagnosis [1.170167705525779]
This paper proposes a novel visual analytics framework that integrates two generative AI models, Temporal Fusion Transformer (TFT) and Variational Autoencoders (VAEs).
It reduces complex patterns into lower-dimensional latent spaces and visualizes them in 2D using dimensionality reduction techniques such as PCA, t-SNE, and UMAP with DBSCAN.
The framework is demonstrated through a case study on power grid signal data, where it identifies multi-label grid event signatures, including faults and anomalies with diverse root causes.
arXiv Detail & Related papers (2024-12-20T17:41:11Z) - It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment [72.75844404617959]
This paper proposes a novel cross-granularity alignment gait recognition method, named XGait.
To achieve this goal, XGait first uses two branches of backbone encoders to map the silhouette sequences and the parsing sequences into two latent spaces.
Comprehensive experiments on two large-scale gait datasets show that XGait achieves Rank-1 accuracy of 80.5% on Gait3D and 88.3% on CCPG.
arXiv Detail & Related papers (2024-11-16T08:54:27Z) - Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z) - Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z) - Diffusion Bridge AutoEncoders for Unsupervised Representation Learning [10.74555302283403]
We introduce Diffusion Bridge AutoEncoders (DBAE), which enable z-dependent endpoint xT inference through a feed-forward architecture.
We propose an objective function for DBAE to enable both reconstruction and generative modeling, with their theoretical justification.
arXiv Detail & Related papers (2024-05-27T12:28:17Z) - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios like AI portrait and advertising.
We present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z) - DAGnosis: Localized Identification of Data Inconsistencies using Structures [73.39285449012255]
Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models.
We use directed acyclic graphs (DAGs) to encode the probability distribution and independencies of the training set's features as a structure.
Our method, called DAGnosis, leverages these structural interactions to bring valuable and insightful data-centric conclusions.
arXiv Detail & Related papers (2024-02-26T11:29:16Z) - UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
We propose UGMAE, a unified framework for graph masked autoencoders.
We first develop an adaptive feature mask generator to account for the unique significance of nodes.
We then design a ranking-based structure reconstruction objective joint with feature reconstruction to capture holistic graph information.
arXiv Detail & Related papers (2024-02-12T19:39:26Z) - Supervision Adaptation Balancing In-distribution Generalization and Out-of-distribution Detection [36.66825830101456]
In-distribution (ID) and out-of-distribution (OOD) samples can lead to distributional vulnerability in deep neural networks.
We introduce a novel supervision adaptation approach to generate adaptive supervision information for OOD samples, making them more compatible with ID samples.
arXiv Detail & Related papers (2022-06-19T11:16:44Z) - Image-based Automated Species Identification: Can Virtual Data Augmentation Overcome Problems of Insufficient Sampling? [0.0]
We present a two-level data augmentation approach to automated visual species identification.
The first level of data augmentation applies classic approaches of data augmentation and generation of faked images.
The second level of data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space.
arXiv Detail & Related papers (2020-10-18T15:44:45Z) - Longitudinal Variational Autoencoder [1.4680035572775534]
A common approach to analyse high-dimensional data that contains missing values is to learn a low-dimensional representation using variational autoencoders (VAEs).
Standard VAEs assume that the learnt representations are i.i.d., and fail to capture the correlations between the data samples.
We propose the Longitudinal VAE (L-VAE), that uses a multi-output additive Gaussian process (GP) prior to extend the VAE's capability to learn structured low-dimensional representations.
Our approach can simultaneously accommodate both time-varying shared and random effects, produce structured low-dimensional representations
arXiv Detail & Related papers (2020-06-17T10:30:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.