HexFormer: Hyperbolic Vision Transformer with Exponential Map Aggregation
- URL: http://arxiv.org/abs/2601.19849v1
- Date: Tue, 27 Jan 2026 17:56:49 GMT
- Title: HexFormer: Hyperbolic Vision Transformer with Exponential Map Aggregation
- Authors: Haya Alyoussef, Ahmad Bdeir, Diego Coello de Portugal Mecke, Tom Hanika, Niels Landwehr, Lars Schmidt-Thieme
- Abstract summary: Hyperbolic geometry provides a natural framework for representing hierarchical and relational structures. HexFormer is a hyperbolic vision transformer for image classification that incorporates exponential map aggregation. HexFormer incorporates a novel attention mechanism based on exponential map aggregation, which yields more accurate and stable aggregated representations.
- Score: 12.198535149754058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data across modalities such as images, text, and graphs often contains hierarchical and relational structures, which are challenging to model within Euclidean geometry. Hyperbolic geometry provides a natural framework for representing such structures. Building on this property, this work introduces HexFormer, a hyperbolic vision transformer for image classification that incorporates exponential map aggregation within its attention mechanism. Two designs are explored: a hyperbolic ViT (HexFormer) and a hybrid variant (HexFormer-Hybrid) that combines a hyperbolic encoder with a Euclidean linear classification head. HexFormer incorporates a novel attention mechanism based on exponential map aggregation, which yields more accurate and stable aggregated representations than standard centroid-based averaging, showing that simpler approaches retain competitive merit. Experiments across multiple datasets demonstrate consistent performance improvements over Euclidean baselines and prior hyperbolic ViTs, with the hybrid variant achieving the strongest overall results. Additionally, this study provides an analysis of gradient stability in hyperbolic transformers. The results reveal that hyperbolic models exhibit more stable gradients and reduced sensitivity to warmup strategies compared to Euclidean architectures, highlighting their robustness and efficiency in training. Overall, these findings indicate that hyperbolic geometry can enhance vision transformer architectures by improving gradient stability and accuracy. In addition, relatively simple mechanisms such as exponential map aggregation can provide strong practical benefits.
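
To make the idea of aggregating attention values through exponential and logarithmic maps concrete, here is a minimal NumPy sketch of tangent-space aggregation. It assumes the Lorentz (hyperboloid) model with curvature -1 and uses the origin as the base point. The abstract does not say which hyperbolic model, curvature, or base point HexFormer uses, so these choices and the function names (`exp_map_origin`, `log_map_origin`, `aggregate`, `lorentz_inner`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def exp_map_origin(u):
    """Map tangent vectors at the origin (spatial part, shape (..., d))
    onto the hyperboloid, returning points of shape (..., d + 1)."""
    norm = np.linalg.norm(u, axis=-1, keepdims=True)
    safe = np.maximum(norm, 1e-12)  # avoid division by zero at the origin
    time = np.cosh(norm)
    space = np.sinh(norm) * u / safe
    return np.concatenate([time, space], axis=-1)


def log_map_origin(x):
    """Inverse of exp_map_origin: hyperboloid points (..., d + 1) to the
    spatial part of tangent vectors at the origin (..., d)."""
    time, space = x[..., :1], x[..., 1:]
    norm = np.linalg.norm(space, axis=-1, keepdims=True)
    safe = np.maximum(norm, 1e-12)
    dist = np.arccosh(np.clip(time, 1.0, None))  # distance to the origin
    return dist * space / safe


def aggregate(weights, points):
    """Weighted aggregation in the tangent space at the origin.
    weights: (m, n) rows summing to 1, points: (n, d + 1)."""
    tangent = log_map_origin(points)   # (n, d)
    mean = weights @ tangent           # (m, d)
    return exp_map_origin(mean)        # (m, d + 1)


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)


# Hypothetical toy data: 4 value vectors, 2 queries, softmax attention weights.
rng = np.random.default_rng(0)
values = exp_map_origin(rng.normal(size=(4, 3)))
scores = rng.normal(size=(2, 4))
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
out = aggregate(weights, values)
```

Every row of `out` satisfies the hyperboloid constraint `lorentz_inner(out, out) == -1` up to floating-point error, so the aggregated result stays on the manifold without an extra projection step.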
Related papers
- HyperAlign: Hyperbolic Entailment Cones for Adaptive Text-to-Image Alignment Assessment [84.65251073657883]
We propose HyperAlign, an adaptive text-to-image alignment assessment framework based on hyperbolic entailment geometry. First, we extract Euclidean features using CLIP and map them to hyperbolic space. Second, we design a dynamic-supervision entailment modeling mechanism that transforms discrete entailment logic into continuous geometric structure supervision. Third, we propose an adaptive modulation regressor that utilizes hyperbolic geometric features to generate sample-level modulation parameters.
arXiv Detail & Related papers (2026-01-08T05:41:06Z)
- HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space [1.1858475445768824]
This paper introduces the Hyperbolic Vision Transformer (HVT), a novel extension of the Vision Transformer (ViT) that integrates hyperbolic geometry.
While traditional ViTs operate in Euclidean space, our method enhances the self-attention mechanism by leveraging hyperbolic distance and Möbius transformations.
We present rigorous mathematical formulations, showing how hyperbolic geometry can be incorporated into attention layers, feed-forward networks, and optimization.
arXiv Detail & Related papers (2024-09-25T13:07:37Z)
- Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space [61.82234368639889]
We introduce Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. We develop a linear self-attention mechanism in hyperbolic space, enabling a hyperbolic Transformer to process billion-scale graph data and long-sequence inputs for the first time.
arXiv Detail & Related papers (2024-07-01T13:44:38Z)
- Hyperbolic Heterogeneous Graph Attention Networks [3.0165549581582454]
Most previous heterogeneous graph embedding models represent elements in a heterogeneous graph as vector representations in a low-dimensional Euclidean space.
We propose Hyperbolic Heterogeneous Graph Attention Networks (HHGAT) that learn vector representations in hyperbolic spaces with meta-path instances.
We conducted experiments on three real-world heterogeneous graph datasets, demonstrating that HHGAT outperforms state-of-the-art heterogeneous graph embedding models in node classification and clustering tasks.
arXiv Detail & Related papers (2024-04-15T04:45:49Z)
- Hyperbolic Delaunay Geometric Alignment [52.835250875177756]
We propose a similarity score for comparing datasets in a hyperbolic space.
The core idea is counting the edges of the hyperbolic Delaunay graph connecting datapoints across the given sets.
We provide an empirical investigation on synthetic and real-life biological data and demonstrate that HyperDGA outperforms the hyperbolic version of classical distances between sets.
arXiv Detail & Related papers (2024-04-12T17:14:58Z)
- Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning [77.1421343649344]
We propose a generalization of Transformers towards operating entirely on the product of constant curvature spaces.
We also provide a kernelized approach to non-Euclidean attention, which enables our model to run in time and memory cost linear to the number of nodes and edges.
arXiv Detail & Related papers (2023-09-08T02:44:37Z)
- Lorentz Equivariant Model for Knowledge-Enhanced Hyperbolic Collaborative Filtering [19.57064597050846]
We introduce prior auxiliary information from the knowledge graph (KG) to assist the user-item graph.
We propose a rigorously Lorentz group equivariant knowledge-enhanced collaborative filtering model (LECF).
We show that LECF remarkably outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-09T10:20:23Z)
- Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform [29.205221688430733]
The choice of geometric space for knowledge graph (KG) embeddings can have significant effects on the performance of KG completion tasks.
Recent explorations of the complex hyperbolic geometry further improved the hyperbolic embeddings for capturing a variety of hierarchical structures.
This paper aims to utilize the representation capacity of the complex hyperbolic geometry in multi-relational KG embeddings.
arXiv Detail & Related papers (2022-11-07T15:46:00Z)
- Geometry Contrastive Learning on Heterogeneous Graphs [50.58523799455101]
This paper proposes a novel self-supervised learning method, termed Geometry Contrastive Learning (GCL).
GCL views a heterogeneous graph from Euclidean and hyperbolic perspectives simultaneously, aiming to combine the ability to model rich semantics with the ability to model complex structures.
Extensive experiments on four benchmark datasets show that the proposed approach outperforms strong baselines.
arXiv Detail & Related papers (2022-06-25T03:54:53Z)
- Hyperbolic Graph Embedding with Enhanced Semi-Implicit Variational Inference [48.63194907060615]
We build off of semi-implicit graph variational auto-encoders to capture higher-order statistics in a low-dimensional graph latent representation.
We incorporate hyperbolic geometry in the latent space through a Poincaré embedding to efficiently represent graphs exhibiting hierarchical structure.
arXiv Detail & Related papers (2020-10-31T05:48:34Z)