Searching for Best Practices in Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2407.01219v1
- Date: Mon, 1 Jul 2024 12:06:34 GMT
- Title: Searching for Best Practices in Retrieval-Augmented Generation
- Authors: Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang,
- Abstract summary: Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information.
Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices.
We suggest several strategies for deploying RAG that balance both performance and efficiency.
- Score: 31.438681543849224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolonged response times. Typically, a RAG workflow involves multiple processing steps, each of which can be executed in various ways. Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices. Through extensive experiments, we suggest several strategies for deploying RAG that balance both performance and efficiency. Moreover, we demonstrate that multimodal retrieval techniques can significantly enhance question-answering capabilities about visual inputs and accelerate the generation of multimodal content using a "retrieval as generation" strategy.
Related papers
- Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach [6.549143816134531]
We propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer.
Experimental results on the multi-hop question-answering HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-07-18T02:19:00Z) - Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting [68.90949377014742]
Speculative RAG is a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM.
Our method accelerates RAG by delegating drafting to the smaller specialist LM, with the larger generalist LM performing a single verification pass over the drafts.
It notably enhances accuracy by up to 12.97% while reducing latency by 51% compared to conventional RAG systems on PubHealth.
arXiv Detail & Related papers (2024-07-11T06:50:19Z) - Unified Active Retrieval for Retrieval Augmented Generation [69.63003043712696]
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful and applying it to every instruction is sub-optimal.
Existing active retrieval methods face two challenges: 1.
They usually rely on a single criterion, which struggles with handling various types of instructions.
They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated.
arXiv Detail & Related papers (2024-06-18T12:09:02Z) - Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning [49.3242278912771]
We introduce a novel multimodal RAG framework named RMR (Retrieval Meets Reasoning)
The RMR framework employs a bi-modal retrieval module to identify the most relevant question-answer pairs.
It significantly boosts the performance of various vision-language models across a spectrum of benchmark datasets.
arXiv Detail & Related papers (2024-05-31T14:23:49Z) - FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research [32.820100519805486]
FlashRAG is an efficient and modular open-source toolkit designed to assist researchers in reproducing existing RAG methods and in developing their own RAG algorithms within a unified framework.
Our toolkit has various features, including customizable modular framework, rich collection of pre-implemented RAG works, comprehensive datasets, efficient auxiliary pre-processing scripts, and extensive and standard evaluation metrics.
arXiv Detail & Related papers (2024-05-22T12:12:40Z) - Improving Retrieval for RAG based Question Answering Models on Financial Documents [0.0]
This paper explores the existing constraints of RAG pipelines and introduces methodologies for enhancing text retrieval.
It delves into strategies such as sophisticated chunking techniques, query expansion, the incorporation of metadata annotations, the application of re-ranking algorithms, and the fine-tuning of embedding algorithms.
arXiv Detail & Related papers (2024-03-23T00:49:40Z) - Retrieval-Augmented Generation for AI-Generated Content: A Survey [38.50754568320154]
Retrieval-Augmented Generation (RAG) has emerged as a paradigm to address such challenges.
RAG introduces the information retrieval process, which enhances the generation process by retrieving relevant objects from available data stores.
In this paper, we comprehensively review existing efforts that integrate RAG technique into AIGC scenarios.
arXiv Detail & Related papers (2024-02-29T18:59:01Z) - Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents.
There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain.
This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z) - CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources.
This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z) - REX: Rapid Exploration and eXploitation for AI Agents [103.68453326880456]
We propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.
REX introduces an additional layer of rewards and integrates concepts similar to Upper Confidence Bound (UCB) scores, leading to more robust and efficient AI agent performance.
arXiv Detail & Related papers (2023-07-18T04:26:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.