The Geometric Mechanics of Contrastive Representation Learning: Alignment Potentials, Entropic Dispersion, and Cross-Modal Divergence
- URL: http://arxiv.org/abs/2601.19597v1
- Date: Tue, 27 Jan 2026 13:33:03 GMT
- Title: The Geometric Mechanics of Contrastive Representation Learning: Alignment Potentials, Entropic Dispersion, and Cross-Modal Divergence
- Authors: Yichao Cai, Zhen Zhang, Yuhang Liu, Javen Qinfeng Shi
- Abstract summary: We present a measure-theoretic framework that models learning as the evolution of representation measures on a fixed embedding manifold. By establishing value and gradient consistency in the large-batch limit, we bridge the stochastic objective to explicit deterministic energy landscapes. We show that the persistent negative symmetric divergence term in the symmetric multimodal objective induces barrier-driven co-adaptation, enforcing a population-level modality gap as a structural geometric necessity.
- Score: 17.501700376593174
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: While InfoNCE powers modern contrastive learning, its geometric mechanisms remain under-characterized beyond the canonical alignment-uniformity decomposition. We present a measure-theoretic framework that models learning as the evolution of representation measures on a fixed embedding manifold. By establishing value and gradient consistency in the large-batch limit, we bridge the stochastic objective to explicit deterministic energy landscapes, uncovering a fundamental geometric bifurcation between the unimodal and multimodal regimes. In the unimodal setting, the intrinsic landscape is strictly convex with a unique Gibbs equilibrium; here, entropy acts merely as a tie-breaker, clarifying "uniformity" as a constrained expansion within the alignment basin. In contrast, the symmetric multimodal objective contains a persistent negative symmetric divergence term that remains even after kernel sharpening. We show that this term induces barrier-driven co-adaptation, enforcing a population-level modality gap as a structural geometric necessity rather than an initialization artifact. Our results shift the analytical lens from pointwise discrimination to population geometry, offering a principled basis for diagnosing and controlling distributional misalignment.
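For readers who want a concrete handle on the quantities the abstract names, the following is a minimal sketch, not the authors' code. It computes a one-directional and a symmetric InfoNCE loss on unit-norm embeddings, the commonly used alignment and uniformity diagnostics, and a centroid distance as a crude proxy for a modality gap. All function names, the temperature `tau`, the uniformity scale `t`, and the centroid-gap proxy are assumptions introduced here for illustration; the paper's measure-theoretic definitions may differ.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(x - m) for x in values))


def info_nce(anchors, targets, tau):
    """One-directional InfoNCE; anchors[i] is the positive pair of targets[i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, t) / tau for t in targets]
        total += log_sum_exp(logits) - logits[i]
    return total / len(anchors)


def symmetric_info_nce(xs, ys, tau):
    """Average of the two retrieval directions (x -> y and y -> x)."""
    return 0.5 * (info_nce(xs, ys, tau) + info_nce(ys, xs, tau))


def alignment(xs, ys):
    """Mean squared distance between positive pairs."""
    return sum(sq_dist(x, y) for x, y in zip(xs, ys)) / len(xs)


def uniformity(xs, t=2.0):
    """Log of the mean Gaussian-kernel value over distinct pairs."""
    vals = [-t * sq_dist(xs[i], xs[j])
            for i in range(len(xs)) for j in range(i + 1, len(xs))]
    return log_sum_exp(vals) - math.log(len(vals))


def centroid_gap(xs, ys):
    """Distance between modality centroids (illustrative gap proxy only)."""
    dim = len(xs[0])
    mx = [sum(v[k] for v in xs) / len(xs) for k in range(dim)]
    my = [sum(v[k] for v in ys) / len(ys) for k in range(dim)]
    return math.sqrt(sq_dist(mx, my))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "modalities": ys are noisy copies of xs, re-projected to the sphere.
    xs = [normalize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(16)]
    ys = [normalize([a + 0.1 * rng.gauss(0, 1) for a in x]) for x in xs]
    print("symmetric InfoNCE:", symmetric_info_nce(xs, ys, tau=0.1))
    print("alignment:", alignment(xs, ys))
    print("uniformity (x):", uniformity(xs))
    print("centroid gap:", centroid_gap(xs, ys))
```

Tracking these four numbers over training is one simple way to observe, at the population level, whether two modalities stay separated while the pairwise loss decreases. It is an empirical diagnostic only and does not reproduce the paper's large-batch energy-landscape analysis.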
Related papers
- Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits [72.0643009153473]
We show that expressive variational ansatze enter a Haar-like universality class in which both observable expectation values and parameter gradients concentrate exponentially with system size. As a consequence, the hypothesis class induced by such circuits collapses with high probability to a narrow family of near-constant functions. We further show that this collapse is not unavoidable: tensor-structured VQCs, including tensor-network-based and tensor-hypernetwork parameterizations, lie outside the Haar-like universality class.
arXiv Detail & Related papers (2026-01-05T08:04:33Z) - Manifold Percolation: from generative model to Reinforce learning [0.26905021039717986]
Generative modeling is typically framed as learning mapping rules, but from an observer's perspective without access to these rules, the task becomes disentangling the geometric support from the probability distribution. We propose that continuum percolation is uniquely suited to this support analysis, as the sampling process effectively projects high-dimensional density estimation onto a geometric counting problem on the support.
arXiv Detail & Related papers (2025-11-25T17:12:42Z) - Structured Basis Function Networks: Loss-Centric Multi-Hypothesis Ensembles with Controllable Diversity [46.60221265861393]
Existing approaches to predictive uncertainty rely on multi-hypothesis prediction, which promotes diversity but lacks principled aggregation. The Structured Basis Function Network addresses this gap by linking multi-hypothesis prediction and ensembling through centroidal aggregation induced by Bregman divergences. A tunable diversity mechanism provides parametric control of the bias-variance-diversity trade-off, connecting multi-hypothesis generalisation with loss-aware ensemble aggregation.
arXiv Detail & Related papers (2025-09-02T19:53:43Z) - Ultracoarse Equilibria and Ordinal-Folding Dynamics in Operator-Algebraic Models of Infinite Multi-Agent Games [0.0]
We develop an operator-algebraic framework for infinite games with a continuum of agents. We prove that regret-based learning dynamics governed by a noncommutative continuity equation converge to a unique quantal response equilibrium. We introduce the ordinal folding index, a computable ordinal-valued metric that measures the self-referential depth of the dynamics.
arXiv Detail & Related papers (2025-07-25T22:20:42Z) - Spin-only dynamics of the multi-species nonreciprocal Dicke model [0.0]
The Hepp-Lieb-Dicke model is ubiquitous in cavity quantum electrodynamics. We study a variation of the open Dicke model that realizes mediated nonreciprocal interactions between spin species. We find signatures of phase transitions even for small system sizes.
arXiv Detail & Related papers (2025-07-10T17:41:46Z) - Intrinsic Regularization via Curved Momentum Space: A Geometric Solution to Divergences in Quantum Field Theory [0.0]
UV divergences in quantum field theory (QFT) have long been a fundamental challenge. We propose a novel and self-consistent approach in which UV regularization emerges naturally from the curved geometry of momentum space. We show seamless extension to Minkowski space, maintaining regularization properties in relativistic QFT.
arXiv Detail & Related papers (2025-02-20T10:49:26Z) - Self-congruent point in critical matrix product states: An effective field theory for finite-entanglement scaling [6.496379207200742]
We show that the finite MPS bond dimension $\chi$ is equivalent to introducing a perturbation by a relevant operator to the fixed-point Hamiltonian. This phenomenon defines a renormalization group self-congruent point, where the relevant coupling constant ceases to flow due to a balance of two effects.
arXiv Detail & Related papers (2024-11-06T14:35:09Z) - Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse
Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z) - Simultaneous Transport Evolution for Minimax Equilibria on Measures [48.82838283786807]
Min-max optimization problems arise in several key machine learning setups, including adversarial learning and generative modeling.
In this work we focus instead on finding mixed equilibria, and consider the associated lifted problem in the space of probability measures.
By adding entropic regularization, our main result establishes global convergence towards the global equilibrium.
arXiv Detail & Related papers (2022-02-14T02:23:16Z) - Geometric phase in a dissipative Jaynes-Cummings model: theoretical explanation for resonance robustness [68.8204255655161]
We compute the geometric phases acquired in both unitary and dissipative Jaynes-Cummings models.
In the dissipative model, the non-unitary effects arise from the outflow of photons through the cavity walls.
We show the geometric phase is robust, exhibiting a vanishing correction under a non-unitary evolution.
arXiv Detail & Related papers (2021-10-27T15:27:54Z) - A Unifying and Canonical Description of Measure-Preserving Diffusions [60.59592461429012]
A complete recipe of measure-preserving diffusions in Euclidean space was recently derived unifying several MCMC algorithms into a single framework.
We develop a geometric theory that improves and generalises this construction to any manifold.
arXiv Detail & Related papers (2021-05-06T17:36:55Z) - Localisation in quasiperiodic chains: a theory based on convergence of local propagators [68.8204255655161]
We present a theory of localisation in quasiperiodic chains with nearest-neighbour hoppings, based on the convergence of local propagators.
Analysing the convergence of these continued fractions, localisation or its absence can be determined, yielding in turn the critical points and mobility edges.
Results are exemplified by analysing the theory for three quasiperiodic models covering a range of behaviour.
arXiv Detail & Related papers (2021-02-18T16:19:52Z)