Shrink the longest: improving latent space isotropy with simplicial geometry
- URL: http://arxiv.org/abs/2501.05502v1
- Date: Thu, 09 Jan 2025 18:44:10 GMT
- Title: Shrink the longest: improving latent space isotropy with simplicial geometry
- Authors: Sergei Kudriashov, Olesya Karpik, Eduard Klyshinsky
- Abstract summary: We propose a novel regularization technique based on simplicial geometry to improve the isotropy of latent representations.
We demonstrate that the method leads to an increase in downstream performance while significantly lowering the anisotropy during fine-tuning.
- Score: 0.0
- Abstract: Although transformer-based models have been dominating the field of deep learning, various studies of their embedding space have shown that they suffer from the "representation degeneration problem": embeddings tend to be distributed in a narrow cone, making the latent space highly anisotropic. Increasing isotropy has been shown to improve performance in downstream tasks for both static and contextual language models. However, most approaches either add inference overhead or require a substantial amount of data for model reparametrization. We propose a novel regularization technique based on simplicial geometry to improve the isotropy of latent representations. The core idea of our method is to maximize the persistent entropy of barcodes obtained via Vietoris-Rips filtration of contextual embeddings in the underlying latent space. We demonstrate that the method increases downstream performance while significantly lowering anisotropy during fine-tuning, by exploiting existing geometric structure instead of reparametrizing the model.
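The abstract describes the regularizer only at a high level. As a concrete illustration, below is a minimal, non-differentiable sketch of the quantity being maximized — the persistent entropy of the Vietoris-Rips barcode of a batch of embeddings — using the gudhi library. The function name, toy data, and choice of gudhi are illustrative assumptions, not the authors' implementation; using this as a training regularizer would additionally require a differentiable persistence layer (e.g., torch-topological), which this sketch omits.

```python
import numpy as np
import gudhi  # pip install gudhi

def persistent_entropy(points: np.ndarray, max_dim: int = 1) -> float:
    """Persistent entropy of the Vietoris-Rips barcode of a point cloud.

    points: (n, d) array, e.g. contextual embeddings from one batch.
    """
    rips = gudhi.RipsComplex(points=points)
    st = rips.create_simplex_tree(max_dimension=max_dim + 1)
    diagram = st.persistence()  # list of (dim, (birth, death)) pairs
    # Lifetimes of the finite bars; the single infinite H0 bar is dropped.
    lifetimes = np.array(
        [death - birth for _, (birth, death) in diagram if np.isfinite(death)]
    )
    lifetimes = lifetimes[lifetimes > 0]
    p = lifetimes / lifetimes.sum()       # normalize lifetimes to a distribution
    return float(-(p * np.log(p)).sum())  # Shannon entropy of that distribution

# Toy usage (illustrative data): one roughly isotropic cloud, and one
# stretched into a narrow cone along a single dominant direction.
rng = np.random.default_rng(0)
isotropic = rng.normal(size=(64, 8))
cone = rng.normal(size=(64, 8)) * np.array([8.0] + [0.5] * 7)
print(persistent_entropy(isotropic), persistent_entropy(cone))
```

A natural way to wire this into fine-tuning, consistent with the abstract, would be to subtract the (differentiably computed) entropy from the task loss, so that barcodes with more evenly distributed bar lengths are rewarded.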
Related papers
- Point Cloud Resampling with Learnable Heat Diffusion [58.050130177241186]
We propose a learnable heat diffusion framework for point cloud resampling.
Unlike previous diffusion models with a fixed prior, the adaptive conditional prior selectively preserves geometric features of the point cloud.
arXiv Detail & Related papers (2024-11-21T13:44:18Z)
- On Probabilistic Pullback Metrics on Latent Hyperbolic Manifolds [5.724027955589408]
This paper focuses on the hyperbolic manifold, a particularly suitable choice for modeling hierarchical relationships.
We propose augmenting the hyperbolic metric with a pullback metric to account for distortions introduced by the latent variable model's (LVM's) nonlinear mapping.
Our experiments demonstrate that geodesics on the pullback metric not only respect the geometry of the hyperbolic latent space but also align with the underlying data distribution.
arXiv Detail & Related papers (2024-10-28T09:13:00Z)
- Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD).
This method systematically explores hierarchical layers within generative adversarial networks (GANs).
In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
- A Slices Perspective for Incremental Nonparametric Inference in High Dimensional State Spaces [25.16567521220103]
We introduce an innovative method for incremental nonparametric probabilistic inference in high-dimensional state spaces.
Our approach leverages slices from high-dimensional surfaces to efficiently approximate posterior distributions of any shape.
arXiv Detail & Related papers (2024-05-26T06:52:56Z)
- Hyperbolic Geometric Latent Diffusion Model for Graph Generation [27.567428462212455]
Diffusion models have made significant contributions to computer vision, recently sparking growing interest in applying them to graph generation.
In this paper, we propose HypDiff, a novel geometric latent diffusion framework.
Specifically, we first establish a geometric latent space with interpretability measures based on hyperbolic geometry, in order to define anisotropic latent diffusion processes for graphs.
Then, we propose a geometric latent diffusion process constrained by both radial and angular geometric properties, thereby ensuring that the original topological properties are preserved in the generated graphs.
arXiv Detail & Related papers (2024-05-06T06:28:44Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image Compression [63.56922682378755]
We focus on extending the spatial aggregation capability and propose dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets that capture valid information within a content-conditioned range to aid the transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z)
- Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z)
- How Does Fine-tuning Affect the Geometry of Embedding Space: A Case Study on Isotropy [18.490856440975996]
We analyze the extent to which the isotropy of the embedding space changes after fine-tuning.
Local structures in pre-trained contextual word representations (CWRs) undergo a massive change during fine-tuning. (A sketch of a standard isotropy score used in this line of work appears after this list.)
arXiv Detail & Related papers (2021-09-10T08:58:59Z)
- Intermediate Layer Optimization for Inverse Problems using Deep Generative Models [86.29330440222199]
ILO is a novel optimization algorithm for solving inverse problems with deep generative models.
We empirically show that our approach outperforms state-of-the-art methods introduced in StyleGAN-2 and PULSE for a wide range of inverse problems.
arXiv Detail & Related papers (2021-02-15T06:52:22Z)
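For context on the isotropy measurements discussed in the main abstract and in the fine-tuning case study above, one widely used score is the partition-function measure of Mu & Viswanath (2018): $I(W) = \min_{c} Z(c) / \max_{c} Z(c)$, where $Z(c) = \sum_i \exp(c^\top w_i)$ and $c$ ranges over the unit eigenvectors of $W^\top W$. Below is a minimal sketch with illustrative toy data; the assumption that this particular score is the one used in the cited works is mine.

```python
import numpy as np

def isotropy_score(W: np.ndarray) -> float:
    """Partition-function isotropy score I(W) (Mu & Viswanath, 2018).

    W: (n, d) matrix of embeddings (one per row). Values near 1 indicate
    an isotropic embedding set; values near 0 indicate strong anisotropy.
    """
    _, eigvecs = np.linalg.eigh(W.T @ W)  # columns are unit eigenvectors
    Z = np.exp(W @ eigvecs).sum(axis=0)   # Z(c) for each candidate direction c
    return float(Z.min() / Z.max())

# Toy usage (illustrative data): an isotropic Gaussian cloud versus the
# same cloud shifted into a narrow cone away from the origin.
rng = np.random.default_rng(0)
print(isotropy_score(rng.normal(size=(1000, 32))))        # expected near 1
print(isotropy_score(rng.normal(size=(1000, 32)) + 3.0))  # expected near 0
```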
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.