CompAct: Compressing Retrieved Documents Actively for Question Answering
- URL: http://arxiv.org/abs/2407.09014v3
- Date: Mon, 14 Oct 2024 12:42:54 GMT
- Title: CompAct: Compressing Retrieved Documents Actively for Question Answering
- Authors: Chanwoong Yoon, Taewhoo Lee, Hyeon Hwang, Minbyul Jeong, Jaewoo Kang
- Abstract summary: CompAct is a novel framework that employs an active strategy to condense extensive documents without losing key information.
Our experiments demonstrate that CompAct brings significant improvements in both performance and compression rate on multi-hop question-answering benchmarks.
- Score: 15.585833125854418
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation helps language models strengthen their factual grounding by providing external contexts. However, language models often face challenges when given extensive information, which diminishes their effectiveness at answering questions. Context compression tackles this issue by filtering out irrelevant information, but current methods still struggle in realistic scenarios where crucial information cannot be captured with a single-step approach. To overcome this limitation, we introduce CompAct, a novel framework that employs an active strategy to condense extensive documents without losing key information. Our experiments demonstrate that CompAct brings significant improvements in both performance and compression rate on multi-hop question-answering benchmarks. CompAct flexibly operates as a cost-efficient plug-in module with various off-the-shelf retrievers or readers, achieving exceptionally high compression rates (47x).
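The "active strategy" in the abstract is multi-step rather than one-shot filtering: one natural reading is an iterative loop in which the retrieved documents are read in segments and a compressor LM keeps updating a running summary until it judges the evidence sufficient. Below is a minimal sketch of such a loop; the `compress_segment` helper, segment size, and termination flag are illustrative assumptions, not the paper's exact interface.

```python
from typing import Callable, List, Tuple

def active_compress(
    question: str,
    documents: List[str],
    compress_segment: Callable[[str, str, str], Tuple[str, bool]],
    segment_size: int = 5,
) -> str:
    """Iteratively condense retrieved documents for a question.

    compress_segment(question, prior_summary, segment) is assumed to be an
    LLM call returning an updated summary plus a flag indicating whether the
    summary already holds enough evidence to answer the question.
    """
    summary = ""
    # Walk over the ranked documents a few at a time instead of all at once.
    for start in range(0, len(documents), segment_size):
        segment = "\n\n".join(documents[start:start + segment_size])
        summary, done = compress_segment(question, summary, segment)
        if done:  # early termination: stop reading once evidence suffices
            break
    return summary  # the compact context handed to the reader model
```

Because the reader only ever sees the final summary, a loop like this can sit between any retriever and any reader, which is what makes such a compressor a cost-efficient plug-in module.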
Related papers
- Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor [16.830389144259584]
Task-agnostic Prompt Compression (TPC) is a novel framework that generalizes compression across tasks and domains without requiring input questions or templates.
TPC generates a context-relevant task description using a task descriptor trained on a curated dataset of context and query pairs.
We introduce three model sizes (Base, Large, and Huge); the largest outperforms existing state-of-the-art methods on the LongBench and ZeroSCROLLS benchmarks.
arXiv Detail & Related papers (2025-02-19T02:16:29Z)
- EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation [8.757777529568383]
Current RAG systems often struggle when retrieval models fail to rank the most relevant documents.
We introduce EXIT, an extractive context compression framework.
Our evaluations show that EXIT consistently surpasses existing compression methods.
arXiv Detail & Related papers (2024-12-17T05:38:27Z)
- BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389]
Retrieval-augmented generation (RAG) can supplement large language models (LLMs) by integrating external knowledge.
This paper presents BRIEF, a lightweight approach that performs query-aware multi-hop reasoning.
Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
arXiv Detail & Related papers (2024-10-20T04:24:16Z)
- Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability [67.77534983324229]
In this paper, we investigate the ability of Large Language Models to develop a unified compression method that discretizes uninformative tokens.
Experiments show Selection-p achieves state-of-the-art performance across numerous classification tasks.
It exhibits superior transferability to different models compared to prior work.
arXiv Detail & Related papers (2024-10-15T17:05:25Z)
- Perception Compressor: A Training-Free Prompt Compression Framework in Long Context Scenarios [17.720102137585503]
Perception Compressor is a training-free prompt compression framework for large language models.
It includes a perception retriever that leverages guiding questions and instruction to retrieve the most relevant demonstrations.
We conduct extensive experiments on long-context benchmarks, i.e., LongBench and MuSiQue.
arXiv Detail & Related papers (2024-09-28T07:13:33Z)
- QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory [66.01597794579568]
We introduce information bottleneck (IB) theory to model the problem; the standard IB objective is recalled after this list.
We propose a cross-attention-based approach to approximate mutual information in IB.
Our method achieves a 25% increase in compression rate compared to the state-of-the-art.
arXiv Detail & Related papers (2024-08-20T02:44:45Z)
- Concise and Precise Context Compression for Tool-Using Language Models [60.606281074373136]
We propose two strategies for compressing tool documentation into concise and precise summary sequences for tool-using language models.
Results on API-Bank and APIBench show that our approach reaches performance comparable to the upper-bound baseline at compression ratios of up to 16x.
arXiv Detail & Related papers (2024-07-02T08:17:00Z)
- KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches [52.02764371205856]
Long context capability is a crucial competency for large language models (LLMs).
This work provides a taxonomy of current methods and evaluates 10+ state-of-the-art approaches across seven categories of long context tasks.
arXiv Detail & Related papers (2024-07-01T17:59:47Z)
- Extending Context Window of Large Language Models via Semantic Compression [21.35020344956721]
Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses.
We propose a novel semantic compression method that enables generalization to texts 6-8 times longer, without incurring significant computational costs or requiring fine-tuning.
arXiv Detail & Related papers (2023-12-15T07:04:33Z)
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation [61.53695868960846]
We propose compressing retrieved documents into textual summaries prior to in-context integration; a pipeline sketch follows this list.
This not only reduces computational costs but also relieves LMs of the burden of identifying relevant information in long retrieved documents.
We show that our compressors trained for one LM can transfer to other LMs on the language modeling task and provide summaries largely faithful to the retrieved documents.
arXiv Detail & Related papers (2023-10-06T17:55:36Z)
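For reference, here is the standard information bottleneck objective that the QUITO-X entry above builds on. This is the textbook formulation, not a claim about that paper's exact notation: $X$ is the context, $Y$ the answer-relevant signal, $Z$ the compressed representation, and $\beta$ the trade-off coefficient.

```latex
% Information bottleneck: keep Z maximally informative about Y
% while discarding as much of X as possible.
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(Z; X)
```

Context compression under IB then amounts to selecting a $Z$ from the context $X$ that retains information about $Y$; the cross-attention scores mentioned in that entry serve as a tractable proxy for these mutual-information terms.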
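As referenced in the RECOMP entry, the compress-then-read pipeline shared by several of these papers is simple to state in code. Below is a minimal sketch, assuming hypothetical `retrieve`, `summarize`, and `answer` helpers backed by a retriever and two LMs; none of these names come from the papers themselves.

```python
from typing import Callable, List

def compress_then_read(
    question: str,
    retrieve: Callable[[str, int], List[str]],   # retriever: question -> top-k documents
    summarize: Callable[[str, str], str],        # compressor LM: (question, docs) -> summary
    answer: Callable[[str, str], str],           # reader LM: (question, context) -> answer
    top_k: int = 10,
) -> str:
    """Retrieve, compress, then read: the reader only sees the summary."""
    documents = retrieve(question, top_k)
    context = summarize(question, "\n\n".join(documents))
    # The reader's prompt stays short regardless of how many documents were retrieved.
    return answer(question, context)
```

CompAct, EXIT, BRIEF, and RECOMP all instantiate the summarize step differently (actively over segments, extractively, via multi-hop synthesis, or with trained compressors) while leaving the retriever and reader untouched, which is what makes them plug-in modules.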