AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
- URL: http://arxiv.org/abs/2512.04550v1
- Date: Thu, 04 Dec 2025 08:04:19 GMT
- Title: AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
- Authors: Yangning Li, Shaoshen Chen, Yinghui Li, Yankai Chen, Hai-Tao Zheng, Hui Wang, Wenhao Jiang, Philip S. Yu
- Abstract summary: We propose AdmTree, a novel framework for adaptive, hierarchical context compression. AdmTree segments input based on information density, utilizing gist tokens to summarize variable-length segments as the leaves of a semantic binary tree. By preserving fine-grained details alongside global semantic coherence, mitigating positional bias, and dynamically adapting to content, AdmTree robustly retains the semantic information of long contexts.
- Score: 66.39371821756649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The quadratic complexity of self-attention constrains Large Language Models (LLMs) in processing long contexts, a capability essential for many advanced applications. Context compression aims to alleviate this computational bottleneck while retaining critical semantic information. However, existing approaches often fall short: explicit methods may compromise local detail, whereas implicit methods can suffer from positional biases, information degradation, or an inability to capture long-range semantic dependencies. We propose AdmTree, a novel framework for adaptive, hierarchical context compression with a central focus on preserving high semantic fidelity while maintaining efficiency. AdmTree dynamically segments input based on information density, utilizing gist tokens to summarize variable-length segments as the leaves of a semantic binary tree. This structure, together with a lightweight aggregation mechanism and a frozen backbone LLM (thereby minimizing new trainable parameters), enables efficient hierarchical abstraction of the context. By preserving fine-grained details alongside global semantic coherence, mitigating positional bias, and dynamically adapting to content, AdmTree robustly retains the semantic information of long contexts.
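The hierarchy the abstract describes can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the `gist` summarizer (mean pooling), the pre-segmented input, and the element-wise-mean `aggregate` are hypothetical stand-ins for AdmTree's gist tokens, density-based segmentation, and lightweight aggregation mechanism.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    vec: List[float]                  # gist/summary representation
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def gist(segment: List[List[float]]) -> List[float]:
    # Stand-in for gist-token summarization: mean-pool the segment's vectors.
    n = len(segment)
    return [s / n for s in map(sum, zip(*segment))]

def aggregate(a: Node, b: Node) -> Node:
    # Stand-in for the lightweight aggregation mechanism: element-wise mean
    # of the two children's summaries.
    return Node([(x + y) / 2 for x, y in zip(a.vec, b.vec)], left=a, right=b)

def build_tree(leaves: List[Node]) -> Node:
    # Pair nodes bottom-up until a single root summarizes the whole context.
    while len(leaves) > 1:
        leaves = [aggregate(leaves[i], leaves[i + 1]) if i + 1 < len(leaves)
                  else leaves[i]
                  for i in range(0, len(leaves), 2)]
    return leaves[0]
```

Reading out the root gives a coarse global summary, while descending toward the leaves recovers progressively finer-grained detail.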
Related papers
- Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression [55.51959317490934]
Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding. We argue that graphs inherently contain rich structural and semantic information, and that their effective exploitation can unlock potential gains in LLMs' reasoning performance. We propose Homophily-aware Structural and Semantic Compression for LLMs (HS2C), a framework centered on exploiting graph homophily.
arXiv Detail & Related papers (2026-01-13T03:35:18Z) - Semantic Tree Inference on Text Corpa using a Nested Density Approach together with Large Language Model Embeddings [0.0]
We propose a nested density clustering approach to infer hierarchical trees of semantically related texts. By embedding dense clusters into increasingly diffuse ones, we construct a tree structure that captures hierarchical semantic relationships among texts.
arXiv Detail & Related papers (2025-12-29T13:55:23Z) - Toward Multi-Session Personalized Conversation: A Large-Scale Dataset and Hierarchical Tree Framework for Implicit Reasoning [30.54506564763053]
We introduce ImplexConv, a large-scale long-term dataset with 2,500 examples, each containing approximately 100 conversation sessions. We also propose TaciTree, a novel hierarchical tree framework that structures conversation history into multiple levels of summarization.
arXiv Detail & Related papers (2025-03-10T07:59:41Z) - ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels. Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches. Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z) - From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs [9.822315423251395]
We introduce MemTree, an algorithm that leverages a dynamic, tree-structured memory representation to optimize the organization, retrieval, and integration of information. Our algorithm dynamically adapts this memory structure by computing and comparing semantic embeddings of new and existing information to enrich the model's context-awareness.
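The dynamic adaptation described in the MemTree abstract can be sketched roughly as follows. This is a hypothetical toy version, assuming cosine-similarity routing and a running-mean embedding update; MemTree's actual insertion and update rules may differ.

```python
import math

def cosine(a: list, b: list) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class MemNode:
    def __init__(self, emb: list):
        self.emb = list(emb)   # running-mean embedding of this subtree
        self.count = 1         # number of items summarized here
        self.children = []

def insert(root: MemNode, emb: list) -> None:
    # Descend toward the most similar child, refreshing each node's
    # running-mean embedding along the way (toy update rule).
    node = root
    while True:
        node.count += 1
        node.emb = [(e * (node.count - 1) + x) / node.count
                    for e, x in zip(node.emb, emb)]
        if not node.children:
            node.children.append(MemNode(emb))
            return
        node = max(node.children, key=lambda c: cosine(c.emb, emb))
```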
arXiv Detail & Related papers (2024-10-17T21:47:11Z) - Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading [63.93888816206071]
We introduce MemWalker, a method that processes the long context into a tree of summary nodes. Upon receiving a query, the model navigates this tree in search of relevant information, and responds once it gathers sufficient information.
We show that, beyond effective reading, MemWalker enhances explainability by highlighting its reasoning steps as it interactively reads the text, pinpointing the text segments relevant to the query.
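A toy sketch of the query-time navigation described for MemWalker, assuming a hypothetical `SummaryNode` tree and a keyword-overlap relevance score in place of the LLM-based relevance judgments the method actually uses:

```python
class SummaryNode:
    def __init__(self, summary: str, children=()):
        self.summary = summary
        self.children = list(children)

def navigate(root: SummaryNode, query: str) -> list:
    # Descend from the root summary toward the child judged most relevant,
    # collecting the visited summaries as interpretable reasoning steps.
    qwords = set(query.lower().split())
    path = []
    node = root
    while node is not None:
        path.append(node.summary)
        if not node.children:
            break
        # Toy relevance score: word overlap between query and child summary.
        node = max(node.children,
                   key=lambda c: len(qwords & set(c.summary.lower().split())))
    return path
```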
arXiv Detail & Related papers (2023-10-08T06:18:14Z) - Conversational Semantic Parsing using Dynamic Context Graphs [68.72121830563906]
We consider the task of conversational semantic parsing over general purpose knowledge graphs (KGs) with millions of entities, and thousands of relation-types.
We focus on models which are capable of interactively mapping user utterances into executable logical forms.
arXiv Detail & Related papers (2023-05-04T16:04:41Z) - Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore to utilise higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z) - Embedding Semantic Hierarchy in Discrete Optimal Transport for Risk Minimization [26.929277114533498]
We propose to incorporate the risk-aware inter-class correlation in a discrete optimal transport (DOT) training framework.
Specifically, we define the tree induced error (TIE) on a hierarchical semantic tree and extend it to its increasing function.
We achieve promising results on several large-scale image classification tasks with a semantic tree structure in a plug-and-play manner.
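As one concrete (hypothetical) instantiation of a tree-induced error, the cost of predicting class `pred` when the truth is `true` can be taken as the number of edges between the two leaves in the semantic tree; the paper's exact definition, and the increasing function applied to it, may differ.

```python
def tie(parent: dict, pred: str, true: str) -> int:
    # `parent` maps each node to its parent; the root maps to None.
    def path_to_root(n):
        path = [n]
        while parent[n] is not None:
            n = parent[n]
            path.append(n)
        return path

    up_pred = path_to_root(pred)
    ancestors = set(up_pred)
    # Walk up from `true` until the lowest common ancestor is reached;
    # the TIE is the total number of edges on the leaf-to-leaf path.
    for dist_true, node in enumerate(path_to_root(true)):
        if node in ancestors:
            return up_pred.index(node) + dist_true
    raise ValueError("pred and true are not in the same tree")
```

Under this definition a correct prediction costs 0, a confusion between siblings (e.g. cat vs. dog) costs less than one across distant branches (e.g. cat vs. vehicle), which is what makes the error risk-aware.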
arXiv Detail & Related papers (2021-04-30T21:47:36Z) - Structural Information Learning Machinery: Learning from Observing,
Associating, Optimizing, Decoding, and Abstracting [0.913755431537592]
We propose the model of structural information learning machines (SiLeM).
A SiLeM machine learns the laws or rules of nature.
arXiv Detail & Related papers (2020-01-27T09:14:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.