Connecting Neural Models Latent Geometries with Relative Geodesic Representations
- URL: http://arxiv.org/abs/2506.01599v1
- Date: Mon, 02 Jun 2025 12:34:55 GMT
- Title: Connecting Neural Models Latent Geometries with Relative Geodesic Representations
- Authors: Hanlin Yu, Berfin Inal, Georgios Arvanitidis, Soren Hauberg, Francesco Locatello, Marco Fumero
- Abstract summary: We show that when a latent structure is shared between distinct latent spaces, relative distances between representations can be preserved, up to distortions. We assume that distinct neural models parametrize approximately the same underlying manifold, and introduce a representation based on the pullback metric. We validate our method on model stitching and retrieval tasks, covering autoencoders and vision foundation discriminative models.
- Score: 21.71782603770616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural models learn representations of high-dimensional data on low-dimensional manifolds. Multiple factors, including stochasticity in the training process, model architectures, and additional inductive biases, may induce different representations, even when learning the same task on the same data. However, it has recently been shown that when a latent structure is shared between distinct latent spaces, relative distances between representations can be preserved, up to distortions. Building on this idea, we demonstrate that, by exploiting the differential-geometric structure of the latent spaces of neural models, it is possible to precisely capture the transformations between representational spaces trained on similar data distributions. Specifically, we assume that distinct neural models parametrize approximately the same underlying manifold, and introduce a representation based on the pullback metric that captures the intrinsic structure of the latent space while scaling efficiently to large models. We validate our method experimentally on model stitching and retrieval tasks, covering autoencoders and vision foundation discriminative models, across diverse architectures, datasets, and pretraining schemes.
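To make the pullback-metric idea concrete, below is a minimal sketch (not the authors' released code) of one way to build such a representation for an autoencoder decoder f: Z -> X. The metric at a latent point is taken as G(z) = J_f(z)^T J_f(z), where J_f is the decoder Jacobian, and a point is represented by its approximate geodesic distances to a set of anchor latents. The straight-line discretization used in place of a true geodesic, the use of exact Jacobians (the paper emphasizes a construction that scales to large models, which this sketch does not), and all function names and anchors below are illustrative assumptions.

```python
import torch

def pullback_metric(decoder, z):
    # Pullback metric G(z) = J(z)^T J(z), with J the Jacobian of the decoder at z.
    J = torch.autograd.functional.jacobian(decoder, z)  # shape (dim_X, dim_Z)
    return J.T @ J                                       # shape (dim_Z, dim_Z)

def curve_length(decoder, z_a, z_b, n_steps=16):
    # Length of the straight line from z_a to z_b measured in the pullback metric;
    # a crude upper bound on the geodesic distance, used here for simplicity.
    ts = torch.linspace(0.0, 1.0, n_steps + 1)
    pts = [(1 - t) * z_a + t * z_b for t in ts]
    length = torch.zeros(())
    for p, q in zip(pts[:-1], pts[1:]):
        dz = (q - p).unsqueeze(-1)                 # (dim_Z, 1)
        G = pullback_metric(decoder, (p + q) / 2)  # metric at the segment midpoint
        length = length + (dz.T @ G @ dz).squeeze().clamp_min(0).sqrt()
    return length

def relative_geodesic_representation(decoder, z, anchors, n_steps=16):
    # Represent z by its approximate geodesic distances to a set of anchor latents.
    return torch.stack([curve_length(decoder, z, a, n_steps) for a in anchors])

# Toy usage with a small MLP decoder and random anchors (purely illustrative).
if __name__ == "__main__":
    torch.manual_seed(0)
    decoder = torch.nn.Sequential(
        torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 784)
    )
    anchors = [torch.randn(2) for _ in range(8)]
    z = torch.randn(2)
    print(relative_geodesic_representation(lambda v: decoder(v), z, anchors))
```

When the anchor latents correspond to the same underlying data points in two different models, the resulting distance vectors live in a shared space, which is what makes cross-model comparison, stitching, and retrieval possible in principle.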
Related papers
- Hierarchical Stochastic Differential Equation Models for Latent Manifold Learning in Neural Time Series [0.0]
We propose a novel hierarchical stochastic differential equation (SDE) model that balances computational efficiency and interpretability. We derive training and inference procedures and show that the computational cost of inference scales linearly with the length of the observation data.
arXiv Detail & Related papers (2025-07-29T06:51:58Z) - Linear Representation Transferability Hypothesis: Leveraging Small Models to Steer Large Models [6.390475802910619]
We show that representations learned across models trained on the same data can be expressed as linear combinations of a universal set of basis features. These basis features underlie the learning task itself and remain consistent across models, regardless of scale.
arXiv Detail & Related papers (2025-05-31T17:45:18Z) - Latent Variable Sequence Identification for Cognitive Models with Neural Network Estimators [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space. Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
arXiv Detail & Related papers (2024-06-20T21:13:39Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration [50.62725807357586]
This article presents a general Bayesian learning framework for multi-modal groupwise image registration.
We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables.
Experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images.
arXiv Detail & Related papers (2024-01-04T08:46:39Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Similarity of Neural Architectures using Adversarial Attack Transferability [47.66096554602005]
We design a quantitative and scalable similarity measure between neural architectures.
We conduct a large-scale analysis on 69 state-of-the-art ImageNet classifiers.
Our results provide insights into why developing diverse neural architectures with distinct components is necessary.
arXiv Detail & Related papers (2022-10-20T16:56:47Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Relative representations enable zero-shot latent space communication [19.144630518400604]
Neural networks embed the geometric structure of a data manifold lying in a high-dimensional space into latent representations.
We show how neural architectures can leverage these relative representations to guarantee, in practice, latent isometry invariance.
arXiv Detail & Related papers (2022-09-30T12:37:03Z) - On the Symmetries of Deep Learning Models and their Internal Representations [1.418465438044804]
We seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data.
Our work suggests that the symmetries of a network are propagated into the symmetries in that network's representation of data.
arXiv Detail & Related papers (2022-05-27T22:29:08Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Lossless Compression of Structured Convolutional Models via Lifting [14.63152363481139]
We introduce a simple and efficient technique to detect the symmetries and compress the neural models without loss of any information.
We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models.
arXiv Detail & Related papers (2020-07-13T08:02:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.