The Chronicles of RAG: The Retriever, the Chunk and the Generator
- URL: http://arxiv.org/abs/2401.07883v1
- Date: Mon, 15 Jan 2024 18:25:18 GMT
- Title: The Chronicles of RAG: The Retriever, the Chunk and the Generator
- Authors: Paulo Finardi, Leonardo Avila, Rodrigo Castaldoni, Pedro Gengo, Celio
Larcher, Marcos Piau, Pablo Costa, Vinicius Caridá
- Abstract summary: This paper presents good practices to implement, optimize, and evaluate RAG for the Brazilian Portuguese language.
We explore a diverse set of methods to answer questions about the first Harry Potter book.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval Augmented Generation (RAG) has become one of the most popular
paradigms for enabling LLMs to access external data, and also as a grounding
mechanism to mitigate hallucinations. When implementing RAG, you can face
several challenges, such as effective integration of retrieval models,
efficient representation learning, data diversity, computational efficiency
optimization, evaluation, and quality of text generation. Given all these
challenges, new techniques to improve RAG appear every day, making it
infeasible to experiment with every combination for your problem. In this
context, this paper presents good practices to implement, optimize, and
evaluate RAG for the Brazilian Portuguese language, focusing on the
establishment of a simple pipeline for inference and experiments. We explored a
diverse set of methods to answer questions about the first Harry Potter book.
To generate the answers we used OpenAI's gpt-4, gpt-4-1106-preview, and
gpt-3.5-turbo-1106, as well as Google's Gemini Pro. Focusing on the quality of
the retriever, our approach achieved a 35.4% improvement in MRR@10 compared to
the baseline. When optimizing the input size in the application, we observed
that it is possible to further improve the results by 2.4%. Finally, we present
the complete architecture of the RAG pipeline with our recommendations. As a
result, we moved from a baseline of 57.88% to a maximum relative score of
98.61%.
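As a rough illustration of the retrieve-then-generate flow described above (chunk the source text, retrieve the most relevant chunks for a question, and ground the generator on them), here is a minimal sketch. The chunking scheme, the random placeholder embeddings, and the sample passages are illustrative assumptions only, not the paper's actual configuration.

```python
import numpy as np

def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split a document into fixed-size character chunks (illustrative scheme only)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder embeddings: random vectors stand in for a real embedding model,
    so the retrieval order below is arbitrary."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 384))

def retrieve(query: str, chunks: list[str], chunk_vecs: np.ndarray, k: int = 4) -> list[str]:
    """Return the top-k chunks by cosine similarity to the query."""
    q = embed([query])[0]
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    return [chunks[i] for i in np.argsort(-sims)[:k]]

def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Ground the generator on the retrieved context."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

# Tiny in-memory stand-in for the book text; the real pipeline would load the full book.
book_text = (
    "Harry recebeu uma capa de invisibilidade como presente de Natal. "
    "O bilhete dizia que ela pertencera ao seu pai. "
    "Mais tarde, Dumbledore revela que foi ele quem a enviou."
)
chunks = chunk_text(book_text, chunk_size=80)
question = "Quem enviou a capa de invisibilidade para Harry?"
prompt = build_prompt(question, retrieve(question, chunks, embed(chunks), k=2))
print(prompt)  # this prompt would be sent to gpt-4 / Gemini Pro to generate the answer
```

The paper's experiments vary the retriever, the chunking, and the generator on top of a pipeline of this general shape.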
Related papers
- Towards Optimizing a Retrieval Augmented Generation using Large Language Model on Academic Data [4.322454918650575]
We focus on data retrieval, specifically targeting various study programs at a large technical university.
By exploring the integration of both open-source (e.g., Llama2, Mistral) and closed-source (GPT-3.5 and GPT-4) Large Language Models, we offer valuable insights into the application and optimization of RAG frameworks in domain-specific contexts.
arXiv Detail & Related papers (2024-11-13T08:43:37Z)
- Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications [0.8999666725996975]
This paper proposes a Question-Answering (QA) system for the telecom domain using 3rd Generation Partnership Project technical documents.
A hybrid dataset, Telco-DPR, is presented, combining text and tables, and includes a set of synthetic question/answer pairs.
The retrieval models are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR).
The proposed QA system, using the developed RAG model and the Generative Pretrained Transformer (GPT)-4, achieves a 14% improvement in answer accuracy.
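For reference, the two retrieval metrics named here (and the MRR@10 figure reported in the main paper above) can be computed as in the sketch below; the ranked-list input format and the toy numbers are assumptions for illustration.

```python
def top_k_accuracy(ranked_ids: list[list[str]], gold_ids: list[str], k: int) -> float:
    """Fraction of queries whose gold passage appears in the top-k retrieved results."""
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)

def mean_reciprocal_rank(ranked_ids: list[list[str]], gold_ids: list[str], cutoff: int = 10) -> float:
    """MRR@cutoff: average of 1/rank of the first relevant result, 0 if not found within cutoff."""
    total = 0.0
    for ranked, gold in zip(ranked_ids, gold_ids):
        for rank, doc_id in enumerate(ranked[:cutoff], start=1):
            if doc_id == gold:
                total += 1.0 / rank
                break
    return total / len(gold_ids)

# Toy example: two queries, each with a single gold passage id.
ranked = [["d3", "d1", "d7"], ["d9", "d2", "d5"]]
gold = ["d1", "d5"]
print(top_k_accuracy(ranked, gold, k=1))   # 0.0
print(top_k_accuracy(ranked, gold, k=3))   # 1.0
print(mean_reciprocal_rank(ranked, gold))  # (1/2 + 1/3) / 2 ≈ 0.4167
```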
arXiv Detail & Related papers (2024-10-15T16:37:18Z)
- RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation [54.707460684650584]
Large Language Models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention.
Current research addresses the limitations of their internal knowledge by equipping LLMs with external knowledge, a technique known as Retrieval Augmented Generation (RAG).
RAGLAB is a modular and research-oriented open-source library that reproduces 6 existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms.
arXiv Detail & Related papers (2024-08-21T07:20:48Z)
- Optimizing Query Generation for Enhanced Document Retrieval in RAG [53.10369742545479]
Large Language Models (LLMs) excel in various language tasks but they often generate incorrect information.
Retrieval-Augmented Generation (RAG) aims to mitigate this by using document retrieval for accurate responses.
arXiv Detail & Related papers (2024-07-17T05:50:32Z)
- CRAG -- Comprehensive RAG Benchmark [58.15980697921195]
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate the knowledge deficiencies of Large Language Models (LLMs).
Existing RAG datasets do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks.
To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG).
CRAG is a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search.
arXiv Detail & Related papers (2024-06-07T08:43:07Z)
- GenQREnsemble: Zero-Shot LLM Ensemble Prompting for Generative Query Reformulation [5.793298194062544]
We propose an ensemble-based prompting technique, GenQREnsemble, to generate multiple sets of keywords.
On evaluations over four IR benchmarks, we find that GenQREnsemble generates better reformulations, with relative nDCG@10 improvements of up to 18% and MAP improvements of up to 24% over the previous zero-shot state of the art.
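The general pattern suggested by this summary (paraphrase a zero-shot reformulation instruction several times, collect the keyword sets, and append them to the original query) might look roughly like the sketch below; the instruction wording and the canned ask_llm stub are placeholders, not the paper's actual prompts.

```python
def ask_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call; returns canned keywords for the demo."""
    return "dense retrieval, passage ranking, open-domain question answering"

# Several paraphrases of the same reformulation instruction (illustrative wording only).
INSTRUCTIONS = [
    "Suggest keywords that would help retrieve relevant documents for this query:",
    "List search terms capturing the information need behind this query:",
    "Generate expansion keywords for the following search query:",
]

def ensemble_reformulate(query: str) -> str:
    """Collect keyword sets from every instruction and append them to the original query."""
    keywords: list[str] = []
    for instruction in INSTRUCTIONS:
        response = ask_llm(f"{instruction}\n{query}")
        keywords.extend(term.strip() for term in response.split(",") if term.strip())
    unique = list(dict.fromkeys(keywords))  # de-duplicate while preserving order
    return query + " " + " ".join(unique)

print(ensemble_reformulate("how does retrieval augmented generation reduce hallucinations"))
```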
arXiv Detail & Related papers (2024-04-04T18:35:25Z)
- Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers [0.0]
Retrieval-Augmented Generation (RAG) is a prevalent approach to combining a private knowledge base of documents with Large Language Models (LLMs) to build Generative Q&A (Question-Answering) systems.
We propose the 'Blended RAG' method of leveraging semantic search techniques, such as vector indexes and sparse indexes, blended with hybrid query strategies.
Our study achieves better retrieval results and sets new benchmarks for Information Retrieval (IR) datasets like NQ and TREC-COVID.
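"Blending" a sparse index with a vector index can be done in several ways; the sketch below uses min-max score normalization with a weighted sum, which is one common fusion choice and not necessarily the exact strategy of this paper.

```python
def minmax(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores to [0, 1] so sparse (e.g., BM25) and dense (cosine) scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def blend(sparse: dict[str, float], dense: dict[str, float], alpha: float = 0.5) -> list[tuple[str, float]]:
    """Weighted fusion of normalized sparse and dense scores; alpha weights the dense side."""
    sparse_n, dense_n = minmax(sparse), minmax(dense)
    docs = set(sparse_n) | set(dense_n)
    fused = {d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * sparse_n.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: BM25 scores from a sparse index and cosine similarities from a vector index.
bm25 = {"doc1": 12.3, "doc2": 9.8, "doc3": 4.1}
cosine = {"doc2": 0.83, "doc3": 0.79, "doc4": 0.40}
print(blend(bm25, cosine, alpha=0.6))  # doc2 ranks first: strong in both indexes
```

Reciprocal rank fusion is another common alternative when the raw scores of the two indexes are not directly comparable.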
arXiv Detail & Related papers (2024-03-22T17:13:46Z)
- ChatQA: Surpassing GPT-4 on Conversational QA and RAG [43.34692996785167]
We introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA).
For effective retrieval, we introduce a dense retriever optimized for conversational QA, which yields results comparable to the alternative state-of-the-art query rewriting models.
We present the ChatRAG Bench, which encompasses ten datasets covering comprehensive evaluations on RAG, table-related QA, arithmetic calculations, and scenarios involving unanswerable questions.
arXiv Detail & Related papers (2024-01-18T18:59:11Z)
- Tool-Augmented Reward Modeling [58.381678612409]
We propose a tool-augmented preference modeling approach, named Themis, to address the limitations of conventional reward models (RMs) by empowering them with access to external environments.
Our study delves into the integration of external tools into RMs, enabling them to interact with diverse external sources.
In human evaluations, RLHF trained with Themis attains an average win rate of 32% when compared to baselines.
arXiv Detail & Related papers (2023-10-02T09:47:40Z)
- Adversarial Retriever-Ranker for dense text retrieval [51.87158529880056]
We present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
AR2 consistently and significantly outperforms existing dense retriever methods.
This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%).
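AR2's contribution is the adversarial training of the two components, which is not reproduced here; the sketch below only shows the retrieve-then-rerank inference pattern in which dual-encoder retrievers and cross-encoder rankers are typically combined, with placeholder scoring functions.

```python
import numpy as np

def dual_encoder_embed(texts: list[str]) -> np.ndarray:
    """Placeholder dual-encoder: queries and passages are embedded independently."""
    rng = np.random.default_rng(42)  # random vectors stand in for a trained encoder
    return rng.normal(size=(len(texts), 128))

def cross_encoder_score(query: str, passage: str) -> float:
    """Placeholder cross-encoder: in practice this jointly encodes (query, passage)."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))

def retrieve_then_rerank(query: str, passages: list[str], n_retrieve: int = 10, n_return: int = 3) -> list[str]:
    # Stage 1: cheap dual-encoder retrieval over the whole collection.
    vecs = dual_encoder_embed(passages)
    q = dual_encoder_embed([query])[0]
    candidates = [passages[i] for i in np.argsort(-(vecs @ q))[:n_retrieve]]
    # Stage 2: expensive cross-encoder reranking over the small candidate set.
    return sorted(candidates, key=lambda p: cross_encoder_score(query, p), reverse=True)[:n_return]

docs = ["the capital of France is Paris", "Paris is in France", "bananas are yellow"]
print(retrieve_then_rerank("what is the capital of France", docs, n_retrieve=3, n_return=2))
```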
arXiv Detail & Related papers (2021-10-07T16:41:15Z)
- Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilated convolution is a critical variant of the standard convolutional neural network for controlling effective receptive fields and handling large scale variance of objects.
We propose a new variant of dilated convolution, namely inception (dilated) convolution, where the convolutions have independent dilation among different axes, channels, and layers.
To fit the complex inception convolution to the data, we explore a practical method: a simple yet effective dilation search algorithm (EDO) based on statistical optimization.
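As a rough illustration of axis-wise independent dilation (not the EDO search itself), a standard deep-learning framework already allows a different dilation rate per spatial axis; per-channel and per-layer independence would come from running branches like these in parallel, as sketched below with PyTorch.

```python
import torch
import torch.nn as nn

class AxisWiseDilation(nn.Module):
    """Toy illustration: parallel conv branches with independent dilation per spatial axis.
    This only hints at the idea of inception (dilated) convolution; the paper's EDO
    search over dilation patterns is not reproduced here."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # padding = dilation keeps the spatial size for a 3x3 kernel with stride 1.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch // 3, 3, padding=d, dilation=d)
            for d in [(1, 1), (1, 2), (2, 1)]  # (height, width) dilation pairs
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([branch(x) for branch in self.branches], dim=1)

x = torch.randn(1, 16, 32, 32)
print(AxisWiseDilation(16, 48)(x).shape)  # torch.Size([1, 48, 32, 32])
```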
arXiv Detail & Related papers (2020-12-25T14:58:35Z)