Grounding Language Model with Chunking-Free In-Context Retrieval
- URL: http://arxiv.org/abs/2402.09760v1
- Date: Thu, 15 Feb 2024 07:22:04 GMT
- Title: Grounding Language Model with Chunking-Free In-Context Retrieval
- Authors: Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou
- Abstract summary: This paper presents a novel Chunking-Free In-Context (CFIC) retrieval approach, specifically tailored for Retrieval-Augmented Generation (RAG) systems.
- Score: 27.316315081648572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel Chunking-Free In-Context (CFIC) retrieval
approach, specifically tailored for Retrieval-Augmented Generation (RAG)
systems. Traditional RAG systems often struggle with grounding responses using
precise evidence text due to the challenges of processing lengthy documents and
filtering out irrelevant content. Commonly employed solutions, such as document
chunking and adapting language models to handle longer contexts, have their
limitations. These methods either disrupt the semantic coherence of the text or
fail to effectively address the issues of noise and inaccuracy in evidence
retrieval.
CFIC addresses these challenges by circumventing the conventional chunking
process. It utilizes the encoded hidden states of documents for in-context
retrieval, employing auto-regressive decoding to accurately identify the
specific evidence text required for user queries, eliminating the need for
chunking. CFIC is further enhanced by incorporating two decoding strategies,
namely Constrained Sentence Prefix Decoding and Skip Decoding. These strategies
not only improve the efficiency of the retrieval process but also ensure that
the fidelity of the generated grounding text evidence is maintained. Our
evaluations of CFIC on a range of open QA datasets demonstrate its superiority
in retrieving relevant and accurate evidence, offering a significant
improvement over traditional methods. By doing away with the need for document
chunking, CFIC presents a more streamlined, effective, and efficient retrieval
solution, making it a valuable advancement in the field of RAG systems.
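The abstract names the two decoding strategies without spelling out their mechanics, so the following is only a toy sketch under stated assumptions, not the authors' implementation. It assumes that Constrained Sentence Prefix Decoding restricts each decoding step to words that continue some sentence already present in the document, and that Skip Decoding jumps directly to a chosen end position instead of generating the intermediate tokens. A word-overlap scorer stands in for the language model's hidden states and probabilities, and every function name (`toy_score`, `constrained_prefix`, `skip_to_end`, `retrieve_evidence`) is hypothetical.

```python
import re


def split_sentences(document):
    # Sentence boundaries are used only to enumerate the allowed prefixes.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def toy_score(query, text):
    # Hypothetical stand-in for a model score: number of distinct query words in text.
    query_words = set(re.findall(r"\w+", query.lower()))
    return len(query_words & set(re.findall(r"\w+", text.lower())))


def constrained_prefix(query, sentences):
    # Greedy word-level decoding. At each step the only allowed words are those
    # that continue at least one sentence in the document (assumed constraint).
    candidates = [(i, s.split()) for i, s in enumerate(sentences)]
    prefix = []
    while len(candidates) > 1:
        step = len(prefix)
        options = {}
        for idx, words in candidates:
            if step < len(words):
                options.setdefault(words[step], []).append((idx, words))
        if not options:
            break
        # Toy lookahead: rank a word by the best sentence it can still lead to.
        best = max(
            options,
            key=lambda w: max(toy_score(query, " ".join(ws)) for _, ws in options[w]),
        )
        prefix.append(best)
        candidates = options[best]
    return prefix, candidates[0][0]


def skip_to_end(query, sentences, start, max_span=3):
    # Toy skip step: score candidate end positions and jump to the best one,
    # with a small penalty per sentence so that shorter spans win ties.
    best_end, best_score = start, float("-inf")
    for end in range(start, min(start + max_span, len(sentences))):
        span = " ".join(sentences[start:end + 1])
        score = toy_score(query, span) - 0.5 * (end - start + 1)
        if score > best_score:
            best_end, best_score = end, score
    return best_end


def retrieve_evidence(query, document):
    sentences = split_sentences(document)
    if not sentences:
        return ""
    _, start = constrained_prefix(query, sentences)
    end = skip_to_end(query, sentences, start)
    # The evidence is copied from the document text, never freely generated.
    return " ".join(sentences[start:end + 1])


if __name__ == "__main__":
    doc = (
        "Paris is the capital of France. The Eiffel Tower was completed in 1889. "
        "It was designed by the firm of Gustave Eiffel. Bananas are rich in potassium."
    )
    print(retrieve_evidence("When was the Eiffel Tower completed?", doc))
```

In this toy setup the returned span is always text that occurs in the document, which mirrors the fidelity goal the abstract describes; a real system would replace `toy_score` with model probabilities computed over the encoded document.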
Related papers
- Cognitive-Aligned Document Selection for Retrieval-augmented Generation [2.9060210098040855]
We propose GGatrieval to dynamically update queries and filter high-quality, reliable retrieval documents.
We parse the user query into its syntactic components and perform fine-grained grounded alignment with the retrieved documents.
Our approach introduces a novel criterion for filtering retrieved documents, closely emulating human strategies for acquiring targeted information.
arXiv Detail & Related papers (2025-02-17T13:00:15Z)
- GeAR: Generation Augmented Retrieval [82.20696567697016]
Document retrieval techniques form the foundation for the development of large-scale information systems.
The prevailing methodology is to construct a bi-encoder and compute the semantic similarity.
We propose a new method called Generation Augmented Retrieval (GeAR) that incorporates well-designed fusion and decoding modules.
arXiv Detail & Related papers (2025-01-06T05:29:00Z)
- Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks [11.053340674721005]
Retrieval-augmented generation (RAG) has gained traction as a powerful approach for enhancing language models by integrating external knowledge sources.
This paper proposes an alternative paradigm, cache-augmented generation (CAG), that bypasses real-time retrieval.
arXiv Detail & Related papers (2024-12-20T06:58:32Z)
- RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation [21.764973680014368]
RetroLLM is a unified framework that integrates retrieval and generation into a single, cohesive process.
To mitigate false pruning in the process of constrained evidence generation, we introduce hierarchical FM-Index constraints.
Experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks.
arXiv Detail & Related papers (2024-12-16T16:03:25Z)
- Is Semantic Chunking Worth the Computational Cost? [0.0]
This study systematically evaluates the effectiveness of semantic chunking using three common retrieval-related tasks.
The results show that the computational costs associated with semantic chunking are not justified by consistent performance gains.
arXiv Detail & Related papers (2024-10-16T21:53:48Z)
- SparseCL: Sparse Contrastive Learning for Contradiction Retrieval [87.02936971689817]
Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query.
Existing methods such as similarity search and cross-encoder models exhibit significant limitations.
We introduce SparseCL that leverages specially trained sentence embeddings designed to preserve subtle, contradictory nuances between sentences.
arXiv Detail & Related papers (2024-06-15T21:57:03Z)
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves comparable performance with SOTA as well as being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.
Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs).
We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
- Learning to Filter Context for Retrieval-Augmented Generation [75.18946584853316]
Generation models are required to generate outputs given partially or entirely irrelevant passages.
FILCO identifies useful context based on lexical and information-theoretic approaches.
It trains context filtering models that can filter retrieved contexts at test time.
arXiv Detail & Related papers (2023-11-14T18:41:54Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.