The Chronicles of RAG: The Retriever, the Chunk and the Generator
- URL: http://arxiv.org/abs/2401.07883v1
- Date: Mon, 15 Jan 2024 18:25:18 GMT
- Title: The Chronicles of RAG: The Retriever, the Chunk and the Generator
- Authors: Paulo Finardi, Leonardo Avila, Rodrigo Castaldoni, Pedro Gengo, Celio Larcher, Marcos Piau, Pablo Costa, Vinicius Caridá
- Abstract summary: This paper presents good practices to implement, optimize, and evaluate RAG for the Brazilian Portuguese language.
We explore a diverse set of methods to answer questions about the first Harry Potter book.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval Augmented Generation (RAG) has become one of the most popular
paradigms for enabling LLMs to access external data, and also as a mechanism
for grounding to mitigate against hallucinations. When implementing RAG you can
face several challenges like effective integration of retrieval models,
efficient representation learning, data diversity, computational efficiency
optimization, evaluation, and quality of text generation. Given all these
challenges, every day a new technique to improve RAG appears, making it
unfeasible to experiment with all combinations for your problem. In this
context, this paper presents good practices to implement, optimize, and
evaluate RAG for the Brazilian Portuguese language, focusing on the
establishment of a simple pipeline for inference and experiments. We explored a
diverse set of methods to answer questions about the first Harry Potter book.
To generate the answers we used OpenAI's gpt-4, gpt-4-1106-preview,
gpt-3.5-turbo-1106, and Google's Gemini Pro. Focusing on the quality of the
retriever, our approach achieved an improvement of MRR@10 by 35.4% compared to
the baseline. When optimizing the input size in the application, we observed
that it is possible to further enhance it by 2.4%. Finally, we present the
complete architecture of the RAG with our recommendations. As a result, we moved
from a baseline of 57.88% to a maximum relative score of 98.61%.
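The abstract reports retriever quality as MRR@10. As a minimal sketch of how that metric is usually computed, assuming one relevant chunk per question (the paper's exact evaluation setup is not given here), the function below averages the reciprocal rank of the relevant chunk over all questions, counting a question as 0 when the chunk is not in the top 10. The function names and chunk IDs are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def reciprocal_rank_at_k(
    ranked_ids: Sequence[Hashable], relevant_id: Hashable, k: int = 10
) -> float:
    """Return 1/rank of the relevant item if it appears in the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def mrr_at_k(
    rankings: Sequence[Sequence[Hashable]],
    relevant_ids: Sequence[Hashable],
    k: int = 10,
) -> float:
    """Mean reciprocal rank over all queries, truncated at rank k.

    Assumes exactly one relevant item per query (an assumption of this sketch).
    """
    if len(rankings) != len(relevant_ids):
        raise ValueError("rankings and relevant_ids must have the same length")
    if not rankings:
        return 0.0
    total = sum(
        reciprocal_rank_at_k(ranked, relevant, k)
        for ranked, relevant in zip(rankings, relevant_ids)
    )
    return total / len(rankings)


# Hypothetical retrieval results for three questions (made-up chunk IDs).
example_rankings = [
    ["c3", "c1", "c7"],  # relevant chunk "c1" at rank 2 -> 1/2
    ["c2", "c9"],        # relevant chunk "c2" at rank 1 -> 1
    ["c5", "c4"],        # relevant chunk "c8" not retrieved -> 0
]
example_relevant = ["c1", "c2", "c8"]
example_score = mrr_at_k(example_rankings, example_relevant, k=10)
```

For these three made-up questions the reciprocal ranks are 1/2, 1, and 0, so the mean is 0.5. A higher MRR@10 means the relevant chunk tends to appear nearer the top of the retrieved list.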
Related papers
- Benchmarking LLMs for Optimization Modeling and Enhancing Reasoning via Reverse Socratic Synthesis [60.23133327001978]
Large language models (LLMs) have exhibited their problem-solving ability in mathematical reasoning.
We propose E-OPT, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
- CRAG -- Comprehensive RAG Benchmark [58.15980697921195]
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate the knowledge deficiencies of Large Language Models (LLMs).
Existing RAG datasets do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks.
We introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search.
arXiv Detail & Related papers (2024-06-07T08:43:07Z)
- GenQREnsemble: Zero-Shot LLM Ensemble Prompting for Generative Query Reformulation [5.793298194062544]
We propose an ensemble-based prompting technique, GenQREnsemble, to generate multiple sets of keywords.
In evaluations over four IR benchmarks, we find that GenQREnsemble generates better reformulations, with relative nDCG@10 improvements of up to 18% and MAP improvements of up to 24% over the previous zero-shot state of the art.
arXiv Detail & Related papers (2024-04-04T18:35:25Z)
- Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers [0.0]
Retrieval-Augmented Generation (RAG) is a prevalent approach to infuse a private knowledge base of documents with Large Language Models (LLM) to build Generative Q&A (Question-Answering) systems.
We propose the 'Blended RAG' method of leveraging semantic search techniques, such as Vector indexes and Sparse indexes, blended with hybrid query strategies.
Our study achieves better retrieval results and sets new benchmarks for IR (Information Retrieval) datasets like NQ and TREC-COVID datasets.
arXiv Detail & Related papers (2024-03-22T17:13:46Z)
- ChatQA: Surpassing GPT-4 on Conversational QA and RAG [43.34692996785167]
We introduce ChatQA, a suite of models that outperform GPT-4 on retrieval-augmented generation (RAG) and conversational question answering (QA).
For effective retrieval, we introduce a dense retriever optimized for conversational QA, which yields results comparable to the alternative state-of-the-art query rewriting models.
We present the ChatRAG Bench, which encompasses ten datasets covering comprehensive evaluations on RAG, table-related QA, arithmetic calculations, and scenarios involving unanswerable questions.
arXiv Detail & Related papers (2024-01-18T18:59:11Z)
- Tool-Augmented Reward Modeling [58.381678612409]
We propose a tool-augmented preference modeling approach, named Themis, to address limitations by empowering RMs with access to external environments.
Our study delves into the integration of external tools into RMs, enabling them to interact with diverse external sources.
In human evaluations, RLHF trained with Themis attains an average win rate of 32% when compared to baselines.
arXiv Detail & Related papers (2023-10-02T09:47:40Z)
- Cumulative Reasoning with Large Language Models [12.267474250936123]
Cumulative Reasoning (CR) is a novel approach that utilizes language models cumulatively and iteratively.
We demonstrate CR's superiority through several complex reasoning tasks.
CR sets a new state of the art on the MATH dataset.
arXiv Detail & Related papers (2023-08-08T16:18:20Z)
- cTBLS: Augmenting Large Language Models with Conversational Tables [0.76146285961466]
Conversational Tables (cTBLS) is a three-step architecture to retrieve and generate dialogue responses grounded on retrieved tabular information.
Human evaluators prefer cTBLs +80% of the time (coherency, fluency) and judge informativeness to be 4x better than the previous state-of-the-art.
arXiv Detail & Related papers (2023-03-21T17:04:44Z)
- LassoBench: A High-Dimensional Hyperparameter Optimization Benchmark Suite for Lasso [84.6451154376526]
LassoBench is a new benchmark suite tailored for an important open research topic in the Lasso community.
We evaluate 5 state-of-the-art HPO methods and 3 baselines, and demonstrate that Bayesian optimization, in particular, can improve over the methods commonly used for sparse regression.
arXiv Detail & Related papers (2021-11-04T12:05:09Z)
- Adversarial Retriever-Ranker for dense text retrieval [51.87158529880056]
We present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
AR2 consistently and significantly outperforms existing dense retriever methods.
This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%).
arXiv Detail & Related papers (2021-10-07T16:41:15Z)
- Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilation convolution is a critical mutant of standard convolution neural network to control effective receptive fields and handle large scale variance of objects.
We propose a new mutant of dilated convolution, namely inception (dilated) convolution where the convolutions have independent dilation among different axes, channels and layers.
To fit the complex inception convolution to the data in a practical way, we develop a simple yet effective dilation search algorithm (EDO) based on statistical optimization.
arXiv Detail & Related papers (2020-12-25T14:58:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.