Hyperbolic Large Language Models
- URL: http://arxiv.org/abs/2509.05757v1
- Date: Sat, 06 Sep 2025 15:56:46 GMT
- Title: Hyperbolic Large Language Models
- Authors: Sarang Patil, Zeyong Zhang, Yiran Huang, Tengfei Ma, Mengjia Xu
- Abstract summary: Large language models (LLMs) have achieved remarkable success and demonstrated superior performance across various tasks. However, many real-world data exhibit highly non-Euclidean latent hierarchical anatomy, such as protein networks, transportation networks, financial networks, brain networks, and linguistic structures or syntactic trees in natural languages. We provide a comprehensive and contextual exposition of recent advancements in LLMs that leverage hyperbolic geometry as a representation space to enhance semantic representation learning and multi-scale reasoning.
- Score: 7.483401973996036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved remarkable success and demonstrated superior performance across various tasks, including natural language processing (NLP), weather forecasting, biological protein folding, text generation, and solving mathematical problems. However, many real-world data exhibit highly non-Euclidean latent hierarchical anatomy, such as protein networks, transportation networks, financial networks, brain networks, and linguistic structures or syntactic trees in natural languages. Effectively learning intrinsic semantic entailment and hierarchical relationships from these raw, unstructured input data using LLMs remains an underexplored area. Due to its effectiveness in modeling tree-like hierarchical structures, hyperbolic geometry -- a non-Euclidean space -- has rapidly gained popularity as an expressive latent representation space for complex data modeling across domains such as graphs, images, languages, and multi-modal data. Here, we provide a comprehensive and contextual exposition of recent advancements in LLMs that leverage hyperbolic geometry as a representation space to enhance semantic representation learning and multi-scale reasoning. Specifically, the paper presents a taxonomy of the principal techniques of Hyperbolic LLMs (HypLLMs) in terms of four main categories: (1) hyperbolic LLMs through exp/log maps; (2) hyperbolic fine-tuned models; (3) fully hyperbolic LLMs, and (4) hyperbolic state-space models. We also explore crucial potential applications and outline future research directions. A repository of key papers, models, datasets, and code implementations is available at https://github.com/sarangp2402/Hyperbolic-LLM-Models/tree/main.
Related papers
- Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space [24.868649493405528]
Knowledge Tracing (KT) diagnoses students' concept mastery through continuous learning state monitoring in education. Existing methods rely on ID-based sequences or shallow textual features. This paper proposes a Large Language Model Hyperbolic Aligned Knowledge Tracing framework.
arXiv Detail & Related papers (2026-02-26T11:17:31Z)
- Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling [68.14113731953971]
This paper introduces MILO, an Implicit spatIaL wOrld modeling paradigm that simulates human-like imagination. We show that our approach significantly enhances spatial reasoning capabilities across multiple baselines and benchmarks.
arXiv Detail & Related papers (2025-12-01T16:01:41Z)
- HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models [50.31704374968706]
Multi-modal large language models (MLLMs) have emerged as a transformative approach for aligning visual and textual understanding. They typically require extremely high computational resources for training to achieve cross-modal alignment at multi-granularity levels. We argue that a key source of this inefficiency lies in the vision encoders they widely equip with, e.g., CLIP and SAM, which lack the alignment with language at multi-granularity levels.
arXiv Detail & Related papers (2025-10-23T08:16:44Z)
- Language Models as Ontology Encoders [32.148744398729896]
Ontology embeddings can infer plausible new knowledge and approximate complex reasoning. OnT tunes a Pretrained Language Model (PLM) via incorporating hyperbolic modeling in a geometric space. OnT consistently outperforms the baselines in both tasks of prediction and inference.
arXiv Detail & Related papers (2025-07-18T19:26:16Z)
- HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts [23.011684464345294]
We introduce HELM, a family of HypErbolic Large Language Models. For HELM-MICE, we develop hyperbolic Multi-Head Latent Attention. For both models, we develop essential hyperbolic equivalents of rotary positional encodings and RMS normalization.
arXiv Detail & Related papers (2025-05-30T15:42:42Z)
- Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings [1.4183971140167244]
We propose Hierarchical Mamba (HiM) to learn hierarchy-aware language embeddings. HiM integrates efficient Mamba2 with exponential growth and curved nature of hyperbolic geometry. We show that both HiM models effectively capture hierarchical relationships for four ontological datasets.
arXiv Detail & Related papers (2025-05-25T04:45:06Z)
- Large Concept Models: Language Modeling in a Sentence Representation Space [62.73366944266477]
We present an attempt at an architecture which operates on an explicit higher-level semantic representation, which we name a concept. Concepts are language- and modality-agnostic and represent a higher level idea or action in a flow. We show that our model exhibits impressive zero-shot generalization performance to many languages.
arXiv Detail & Related papers (2024-12-11T23:36:20Z)
- GrootVL: Tree Topology is All You Need in State Space Model [66.36757400689281]
GrootVL is a versatile multimodal framework that can be applied to both visual and textual tasks.
Our method significantly outperforms existing structured state space models on image classification, object detection and segmentation.
By fine-tuning large language models, our approach achieves consistent improvements in multiple textual tasks at minor training cost.
arXiv Detail & Related papers (2024-06-04T15:09:29Z)
- Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z)
- Language is All a Graph Needs [33.9836278881785]
We propose InstructGLM (Instruction-finetuned Graph Language Model) with highly scalable prompts based on natural language instructions.
Our method surpasses all GNN baselines on ogbn-arxiv, Cora and PubMed datasets.
arXiv Detail & Related papers (2023-08-14T13:41:09Z)
- Hyperbolic Graph Neural Networks: A Review of Methods and Applications [61.49208407567829]
This survey paper provides a comprehensive review of the rapidly evolving field of Hyperbolic Graph Learning (HGL). We systematically categorize and analyze existing methods dividing them into (1) hyperbolic graph embedding-based techniques, (2) graph neural network-based hyperbolic models, and (3) emerging paradigms. We extensively discuss diverse applications of HGL across multiple domains, including recommender systems, knowledge graphs, bioinformatics, and other relevant scenarios.
arXiv Detail & Related papers (2022-02-28T15:08:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.