Related papers: CAT: Curvature-Adaptive Transformers for Geometry-Aware Learning

URL: http://arxiv.org/abs/2510.01634v1
Date: Thu, 02 Oct 2025 03:26:33 GMT
Title: CAT: Curvature-Adaptive Transformers for Geometry-Aware Learning
Authors: Ryan Y. Lin, Siddhartha Ojha, Nicholas Bai
Abstract summary: Curvature-Adaptive Transformer (CAT) learns per-token routing across three geometric attention branches through a lightweight, differentiable gating mechanism. On knowledge graph completion benchmarks, CAT achieves approximately 10% improvements in MRR and Hits@10 over fixed-geometry baselines with minimal overhead.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transformers achieve strong performance across diverse domains but implicitly assume Euclidean geometry in their attention mechanisms, limiting their effectiveness on data with non-Euclidean structure. While recent extensions to hyperbolic and spherical spaces show promise for hierarchical and cyclical patterns, respectively, they require committing to a single geometry a priori, reducing flexibility when data exhibits mixed geometric properties. We introduce the Curvature-Adaptive Transformer (CAT), a novel architecture that dynamically learns per-token routing across three geometric attention branches through a lightweight, differentiable gating mechanism. Unlike fixed-geometry approaches, CAT enables adaptive geometric specialization, routing tokens to the appropriate curvature based on their local relational structure. The routing network provides interpretable curvature preferences while each branch employs geometry-specific operations optimized for its respective manifold. On knowledge graph completion benchmarks (FB15k-237, WN18RR), CAT achieves approximately 10% improvements in MRR and Hits@10 over fixed-geometry baselines with minimal overhead (5% parameter increase, comparable inference time). These results demonstrate that learned geometric adaptation outperforms any single fixed geometry for complex relational reasoning, establishing CAT as a scalable and interpretable foundation for mixture-of-geometry architectures across language, vision, and multimodal domains.
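The per-token routing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a soft (softmax) gate over three branches, and it replaces the geometric attention branches with simple stand-in point maps (identity for Euclidean, projection onto the unit sphere, and the exponential map at the origin of the Poincaré ball with curvature -1). The names `route_token`, `gate_weights`, and the branch functions are hypothetical, and linearly mixing outputs from different manifolds is a simplification made only to keep the example short.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def euclidean_branch(v):
    # Stand-in for a Euclidean branch: identity map.
    return list(v)


def spherical_branch(v):
    # Stand-in for a spherical branch: project onto the unit sphere.
    n = norm(v)
    return [x / n for x in v] if n > 0 else list(v)


def hyperbolic_branch(v):
    # Stand-in for a hyperbolic branch: exponential map at the origin of the
    # Poincare ball (curvature -1), exp_0(v) = tanh(|v|) * v / |v|.
    n = norm(v)
    if n == 0:
        return list(v)
    scale = math.tanh(n) / n
    return [scale * x for x in v]


# Assumed branch order: hyperbolic, Euclidean, spherical.
BRANCHES = (hyperbolic_branch, euclidean_branch, spherical_branch)


def route_token(token, gate_weights):
    """Return (routing probabilities, gate-weighted mix of branch outputs).

    gate_weights is a hypothetical 3 x d matrix (one row per branch) that
    plays the role of a lightweight linear gating network.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(logits)
    outputs = [branch(token) for branch in BRANCHES]
    mixed = [
        sum(p * out[i] for p, out in zip(probs, outputs))
        for i in range(len(token))
    ]
    return probs, mixed


# Toy gate and tokens, chosen only for illustration.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
tokens = [[3.0, 0.0], [0.0, 3.0]]

for t in tokens:
    probs, mixed = route_token(t, gate_weights)
    print([round(p, 3) for p in probs], [round(x, 3) for x in mixed])
```

The routing probabilities returned for each token are what would make the curvature preference inspectable: in this toy setup the first token leans toward the hyperbolic stand-in and the second toward the Euclidean one, purely because of how `gate_weights` was chosen.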

Related papers

Parameter-Efficient Fine-Tuning of LLMs with Mixture of Space Experts [20.82313207866023]
We propose a unified framework that leverages multiple geometric spaces simultaneously to learn curvature-aware representations. We develop MoSLoRA, which extends Low-Rank Adaptation (LoRA) with heterogeneous geometric experts. Our experiments across diverse benchmarks demonstrate that MoSLoRA consistently outperforms strong baselines.
arXiv Detail & Related papers (2026-02-16T06:07:32Z)
ArGEnT: Arbitrary Geometry-encoded Transformer for Operator Learning [2.757490632589873]
We propose Arbitrary Geometry-encoded Transformer (ArGEnT), a geometry-aware attention-based architecture for operator learning on arbitrary domains. By combining flexible geometry encoding with operator-learning capabilities, ArGEnT provides a scalable surrogate modeling framework for optimization, uncertainty, and data-driven modeling of complex physical systems.
arXiv Detail & Related papers (2026-02-12T06:22:59Z)
Learning Geometry: A Framework for Building Adaptive Manifold Models through Metric Optimization [8.201374511929538]
This paper proposes a novel paradigm for machine learning that moves beyond traditional parameter optimization. We optimize the metric tensor field on a manifold with a predefined topology, thereby dynamically shaping the geometric structure of the model space. This work lays a solid foundation for constructing fully dynamic "meta-learners" capable of autonomously evolving their geometry and topology.
arXiv Detail & Related papers (2025-10-30T01:53:32Z)
The Neural Differential Manifold: An Architecture with Explicit Geometric Structure [8.201374511929538]
This paper introduces the Neural Differential Manifold (NDM), a novel neural network architecture that explicitly incorporates geometric structure into its fundamental design. We analyze the theoretical advantages of this approach, including its potential for more efficient optimization, enhanced continual learning, and applications in scientific discovery and controllable generative modeling.
arXiv Detail & Related papers (2025-10-29T02:24:27Z)
Geometry-Aware Spiking Graph Neural Network [24.920334588995072]
We propose a Geometry-Aware Spiking Graph Neural Network that unifies spike-based neural dynamics with adaptive representation learning. Experiments on multiple benchmarks show that GSG achieves superior accuracy, robustness, and energy efficiency compared to both Euclidean SNNs and manifold-based GNNs.
arXiv Detail & Related papers (2025-08-09T02:52:38Z)
Adaptive Riemannian Graph Neural Networks [29.859977834688625]
We introduce a novel framework that learns a continuous and anisotropic metric tensor field over the graph. It allows each node to determine its optimal local geometry, enabling the model to fluidly adapt to the graph's structural landscape. Our method demonstrates superior performance on both homophilic and heterophilic benchmark geometries.
arXiv Detail & Related papers (2025-08-04T16:55:02Z)
Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation [79.27003481818413]
We introduce FlatVI, a training framework that regularises the latent manifold of discrete-likelihood variational autoencoders towards Euclidean geometry. By encouraging straight lines in the latent space to approximate geodesics on the decoded single-cell manifold, FlatVI enhances compatibility with downstream approaches.
arXiv Detail & Related papers (2025-07-15T23:08:14Z)
Fully Geometric Multi-Hop Reasoning on Knowledge Graphs with Transitive Relations [50.05281461410368]
We introduce GeometrE, a geometric embedding method for multi-hop reasoning. It does not require learning the logical operations and enables full geometric interpretability. Our experiments show that GeometrE outperforms current state-of-the-art methods on standard benchmark datasets.
arXiv Detail & Related papers (2025-05-18T11:17:50Z)
Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains. We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing [15.687514300950813]
We present a framework based on local reference frames ("local canonicalization") which can be integrated with any architecture without restrictions. Our framework applies to message passing on geometric data in Euclidean spaces of arbitrary dimension. We demonstrate the superiority of tensorial messages and achieve state-of-the-art results on normal vector regression and competitive results on other standard 3D point cloud tasks.
arXiv Detail & Related papers (2024-05-24T09:41:06Z)
Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images [56.86175251327466]
We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context. Our approach extracts geometric context that encodes the geometric variations present in the input image and correlates depth estimation with geometric constraints. Our method unifies depth and surface normal estimations within a cohesive framework, which enables the generation of high-quality 3D geometry from images.
arXiv Detail & Related papers (2024-02-08T17:57:59Z)
Exploring Data Geometry for Continual Learning [64.4358878435983]
We study continual learning from a novel perspective by exploring data geometry for the non-stationary stream of data. Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data. Experiments show that our method achieves better performance than baseline methods designed in Euclidean space.
arXiv Detail & Related papers (2023-04-08T06:35:25Z)
Frame Averaging for Equivariant Shape Space Learning [85.42901997467754]
A natural way to incorporate symmetries in shape space learning is to ask that the mapping to the shape space (encoder) and mapping from the shape space (decoder) are equivariant to the relevant symmetries. We present a framework for incorporating equivariance in encoders and decoders by introducing two contributions.
arXiv Detail & Related papers (2021-12-03T06:41:19Z)
Hermitian Symmetric Spaces for Graph Embeddings [0.0]
We learn continuous representations of graphs in spaces of symmetric matrices over C. These spaces offer a rich geometry that simultaneously admits hyperbolic and Euclidean subspaces. The proposed models are able to automatically adapt to very dissimilar arrangements without any a priori estimates of graph features.
arXiv Detail & Related papers (2021-05-11T18:14:52Z)
Inter-layer Transition in Neural Architecture Search [89.00449751022771]
The dependency between the architecture weights of connected edges is explicitly modeled in this paper. Experiments on five benchmarks confirm the value of modeling inter-layer dependency and demonstrate the proposed method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T03:33:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.