Read As Human: Compressing Context via Parallelizable Close Reading and Skimming
- URL: http://arxiv.org/abs/2602.01840v1
- Date: Mon, 02 Feb 2026 09:10:56 GMT
- Title: Read As Human: Compressing Context via Parallelizable Close Reading and Skimming
- Authors: Jiwei Tang, Shilei Liu, Zhicheng Zhang, Qingsong Lv, Runsong Zhao, Tingwei Lu, Langming Liu, Haibin Chen, Yujin Yuan, Hai-Tao Zheng, Wenbo Su, Bo Zheng
- Abstract summary: RAM (Read As HuMan) is a context compression framework that adopts an adaptive hybrid reading strategy. Inspired by human reading behavior, RAM partitions the context into segments and encodes them with the input query in parallel. Experiments demonstrate that RAM outperforms existing baselines on multiple question answering and summarization benchmarks.
- Score: 34.83776292069694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) demonstrate exceptional capability across diverse tasks. However, their deployment in long-context scenarios is hindered by two challenges: computational inefficiency and redundant information. To address these challenges, we propose RAM (Read As HuMan), a context compression framework that adopts an adaptive hybrid reading strategy. Inspired by human reading behavior (i.e., close reading important content while skimming less relevant content), RAM partitions the context into segments and encodes them with the input query in parallel. High-relevance segments are fully retained (close reading), while low-relevance ones are compressed, guided by the query, into compact summary vectors (skimming). Both the explicit textual segments and the implicit summary vectors are concatenated and fed into the decoder, achieving both superior performance and natural-language interpretability. To refine the decision boundary between close reading and skimming, we further introduce a contrastive learning objective based on positive and negative query-segment pairs. Experiments demonstrate that RAM outperforms existing baselines on multiple question answering and summarization benchmarks across two backbones, while delivering up to a 12x end-to-end speedup on long inputs (average length 16K; maximum length 32K).
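The retain-or-compress routing described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the hash-based embedding and the token-overlap relevance scorer below are hypothetical stand-ins for the paper's learned, contrastively trained query-segment relevance model, and the threshold is an arbitrary assumption.

```python
import hashlib

def embed(token, dim=8):
    # Toy deterministic "embedding": hash bytes mapped to [0, 1).
    h = hashlib.sha256(token.encode()).digest()
    return [b / 255 for b in h[:dim]]

def relevance(query, segment):
    # Stand-in scorer: token overlap with the query
    # (RAM instead learns this boundary with a contrastive objective).
    q, s = set(query.lower().split()), set(segment.lower().split())
    return len(q & s) / max(len(s), 1)

def ram_compress(query, segments, threshold=0.2):
    """Adaptive hybrid reading: keep high-relevance segments verbatim
    (close reading) and collapse the rest into one summary vector each
    (skimming). Each segment is scored independently, so this step
    parallelizes across segments."""
    kept, summaries = [], []
    for seg in segments:
        if relevance(query, seg) >= threshold:
            kept.append(seg)  # close reading: retain full text
        else:
            # skimming: mean-pool token vectors into one summary vector
            vecs = [embed(t) for t in seg.split()]
            dim = len(vecs[0])
            summaries.append([sum(v[i] for v in vecs) / len(vecs)
                              for i in range(dim)])
    return kept, summaries

segments = [
    "RAM compresses long context with a hybrid reading strategy",
    "The weather in the training logs was not recorded",
]
kept, summaries = ram_compress("how does RAM compress context", segments)
```

In the full framework, the retained text and the summary vectors are concatenated and passed to the decoder; here the two lists simply illustrate which segment takes which path.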
Related papers
- Dynamic Long Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning [47.87361916374891]
We propose a framework for efficient long-context inference based on chunk-wise compression and selective memory recall. The framework segments long inputs into chunks and encodes each chunk into compressed memory representations using a learned compressor. It achieves up to a 2x reduction in peak GPU memory usage and a 6x inference speedup over MemAgent.
arXiv Detail & Related papers (2026-02-09T08:33:11Z) - APCE: Adaptive Progressive Context Expansion for Long Context Processing [0.5274824616260646]
We propose APCE as a context-aware solution that selects the most important input chunks for processing. By operating directly on the input, APCE decouples from strict dependency on underlying hardware or scalable environments. Our empirical evaluations have demonstrated superior or on-par summarization performance for APCE compared to the full dense baseline.
arXiv Detail & Related papers (2025-10-14T01:26:36Z) - Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models [1.5817866616624976]
Large language models (LLMs) often struggle to accurately read and comprehend long texts. Current methods for improvement typically rely on splitting long contexts into fixed-length chunks. We propose a straightforward approach for dynamically separating and selecting chunks of long context.
arXiv Detail & Related papers (2025-06-01T01:42:40Z) - Context-Aware Hierarchical Merging for Long Document Summarization [56.96619074316232]
We propose different approaches to enrich hierarchical merging with context from the source document. Experimental results on datasets representing legal and narrative domains show that contextual augmentation consistently outperforms zero-shot and hierarchical merging baselines.
arXiv Detail & Related papers (2025-02-03T01:14:31Z) - ContextDet: Temporal Action Detection with Adaptive Context Aggregation [47.84334557998388]
We introduce a single-stage ContextDet framework for temporal action detection (TAD).
Our model features a pyramid adaptive context aggregation (ACA) architecture, capturing long context and improving action discriminability.
By varying the length of these large kernels across the ACA pyramid, our model provides lightweight yet effective context aggregation and action discrimination.
arXiv Detail & Related papers (2024-10-20T04:28:19Z) - Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception [10.614437503578856]
This paper proposes the Meta-Chunking framework, which specifically enhances chunking quality. We design two adaptive chunking techniques based on uncertainty, namely Perplexity Chunking and Margin Sampling Chunking. We establish a global information compensation mechanism, encompassing a two-stage hierarchical summary generation process and a three-stage text chunk rewriting procedure.
arXiv Detail & Related papers (2024-10-16T17:59:32Z) - KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches [52.02764371205856]
Long context capability is a crucial competency for large language models (LLMs).
This work provides a taxonomy of current methods and evaluates 10+ state-of-the-art approaches across seven categories of long context tasks.
arXiv Detail & Related papers (2024-07-01T17:59:47Z) - Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading [63.93888816206071]
We introduce MemWalker, a method that processes the long context into a tree of summary nodes. Upon receiving a query, the model navigates this tree in search of relevant information, and responds once it gathers sufficient information.
We show that, beyond effective reading, MemWalker enhances explainability by highlighting the reasoning steps as it interactively reads the text; pinpointing the relevant text segments related to the query.
arXiv Detail & Related papers (2023-10-08T06:18:14Z)
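The tree navigation that MemWalker describes can be sketched in miniature. This is an illustrative toy, not the paper's method: MemWalker prompts an LLM to summarize and to choose a branch at each node, whereas the `Node` class, the token-overlap `score` function, and the example tree below are all hypothetical stand-ins.

```python
class Node:
    def __init__(self, summary, text=None, children=None):
        self.summary = summary          # short summary of this subtree
        self.text = text                # full text, present at leaves only
        self.children = children or []

def score(query, summary):
    # Stand-in relevance: token overlap
    # (MemWalker instead asks an LLM which branch to follow).
    return len(set(query.lower().split()) & set(summary.lower().split()))

def navigate(node, query, path=None):
    """Walk from the root toward the leaf whose summaries best match
    the query, recording the visited summaries as a reasoning path
    for explainability."""
    path = (path or []) + [node.summary]
    if not node.children:
        return node.text, path
    best = max(node.children, key=lambda c: score(query, c.summary))
    return navigate(best, query, path)

tree = Node("full report", children=[
    Node("section on model training",
         text="Training used 8 GPUs for 3 days."),
    Node("section on evaluation results",
         text="Accuracy reached 91.2 percent."),
])
answer, path = navigate(tree, "what were the evaluation results")
```

The returned `path` is what makes the interactive-reading approach explainable: it records which summary nodes were consulted before the relevant leaf text was read.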
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.