Metric Space Magnitude for Evaluating the Diversity of Latent Representations
- URL: http://arxiv.org/abs/2311.16054v4
- Date: Thu, 31 Oct 2024 13:36:40 GMT
- Title: Metric Space Magnitude for Evaluating the Diversity of Latent Representations
- Authors: Katharina Limbeck, Rayna Andreeva, Rik Sarkar, Bastian Rieck,
- Abstract summary: We develop a family of magnitude-based measures of the intrinsic diversity of latent representations.
Our measures are provably stable under perturbations of the data, can be efficiently calculated, and enable a rigorous multi-scale characterisation and comparison of latent representations.
We show their utility and superior performance across different domains and tasks, including (i) the automated estimation of diversity, (ii) the detection of mode collapse, and (iii) the evaluation of generative models for text, image, and graph data.
- Score: 13.272500655475486
- License:
- Abstract: The magnitude of a metric space is a novel invariant that provides a measure of the 'effective size' of a space across multiple scales, while also capturing numerous geometrical properties, such as curvature, density, or entropy. We develop a family of magnitude-based measures of the intrinsic diversity of latent representations, formalising a novel notion of dissimilarity between magnitude functions of finite metric spaces. Our measures are provably stable under perturbations of the data, can be efficiently calculated, and enable a rigorous multi-scale characterisation and comparison of latent representations. We show their utility and superior performance across different domains and tasks, including (i) the automated estimation of diversity, (ii) the detection of mode collapse, and (iii) the evaluation of generative models for text, image, and graph data.
Related papers
- Uncertainty quantification in metric spaces [3.637162892228131]
This paper introduces a novel uncertainty quantification framework for regression models where the response takes values in a separable metric space.
The proposed algorithms can efficiently handle large datasets and are agnostic to the predictive base model used.
arXiv Detail & Related papers (2024-05-08T15:06:02Z) - Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for data-efficient representation learning.
We establish relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z) - Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements.
To demonstrate the superiority of the proposed Re$3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z) - Reliable Measures of Spread in High Dimensional Latent Spaces [0.0]
We argue that the commonly used measures of data spread do not provide reliable metrics to compare the use of latent space across models.
We propose and examine eight alternative measures of data spread, all but one of which improve over these current metrics.
Of our proposed measures, we recommend one principal component-based measure and one entropy-based measure.
arXiv Detail & Related papers (2022-12-15T22:15:11Z) - Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutive pressure acts on a low-dimensional manifold despite the high-dimensionality of sequences' space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z) - Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z) - Learning Conditional Invariance through Cycle Consistency [60.85059977904014]
We propose a novel approach to identify meaningful and independent factors of variation in a dataset.
Our method involves two separate latent subspaces for the target property and the remaining input information.
We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models.
arXiv Detail & Related papers (2021-11-25T17:33:12Z) - Evidential Sparsification of Multimodal Latent Spaces in Conditional
Variational Autoencoders [63.46738617561255]
We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder.
We use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not.
Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique.
arXiv Detail & Related papers (2020-10-19T01:27:21Z) - High-Dimensional Non-Parametric Density Estimation in Mixed Smooth
Sobolev Spaces [31.663702435594825]
Density estimation plays a key role in many tasks in machine learning, statistical inference, and visualization.
Main bottleneck in high-dimensional density estimation is the prohibitive computational cost and the slow convergence rate.
We propose novel estimators for high-dimensional non-parametric density estimation called the adaptive hyperbolic cross density estimators.
arXiv Detail & Related papers (2020-06-05T21:27:59Z) - Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics
for Text Collections [23.008385862718036]
We propose metrics of diversity, density, and homogeneity that quantitatively measure the dispersion, sparsity, and uniformity of a text collection.
Experiments on real-world datasets demonstrate that the proposed characteristic metrics are highly correlated with text classification performance of a renowned model, BERT.
arXiv Detail & Related papers (2020-03-19T00:48:32Z) - Learning Flat Latent Manifolds with VAEs [16.725880610265378]
We propose an extension to the framework of variational auto-encoders, where the Euclidean metric is a proxy for the similarity between data points.
We replace the compact prior typically used in variational auto-encoders with a recently presented, more expressive hierarchical one.
We evaluate our method on a range of data-sets, including a video-tracking benchmark.
arXiv Detail & Related papers (2020-02-12T09:54:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.