RAG-Reward: Optimizing RAG with Reward Modeling and RLHF
- URL: http://arxiv.org/abs/2501.13264v2
- Date: Tue, 18 Feb 2025 04:22:50 GMT
- Title: RAG-Reward: Optimizing RAG with Reward Modeling and RLHF
- Authors: Hanning Zhang, Juntong Song, Juno Zhu, Yuanhao Wu, Tong Zhang, Cheng Niu,
- Abstract summary: Retrieval-augmented generation (RAG) enhances Large Language Models (LLMs) with relevant and up-to-date knowledge.
The role of reward models in reinforcement learning for optimizing RAG remains underexplored.
We introduce textbfRAG-Reward, a framework designed to develop reward models.
- Score: 8.911260109659489
- License:
- Abstract: Retrieval-augmented generation (RAG) enhances Large Language Models (LLMs) with relevant and up-to-date knowledge, improving their ability to answer knowledge-intensive questions. It has been shown to enhance both generation quality and trustworthiness. While numerous works have focused on improving retrieval, generation, and evaluation, the role of reward models in reinforcement learning for optimizing RAG remains underexplored. In this paper, we introduce \textbf{RAG-Reward}, a framework designed to develop reward models to enable \textit{hallucination-free, comprehensive, reliable, and efficient RAG}. We define four key metrics to assess generation quality and develop an automated benchmarking pipeline to evaluate the outputs of multiple LLMs across a variety of RAG scenarios. Using \textbf{RAG-Reward}, we train reward models and apply {reinforcement learning with human feedback (RLHF)} to improve LLMs' effectiveness in RAG. Experimental results demonstrate that our reward model achieves state-of-the-art performance in automatic benchmarking and aligns closely with human evaluations. Furthermore, the improved generation quality of the trained policy model highlights the feasibility and efficiency of using RLHF to enhance RAG outputs.
Related papers
- Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs [58.18140409409302]
Large Language Models (LLMs) have made substantial strides in structured tasks through Reinforcement Learning (RL)
Applying RL in broader domains like chatbots and content generation presents unique challenges.
We show a case study of reproducing existing reward model ensemble research using embedding-based reward models.
arXiv Detail & Related papers (2025-02-04T19:37:35Z) - Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer.
Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - Reward-RAG: Enhancing RAG with Reward Driven Supervision [43.66966457772646]
We introduce Reward-RAG, a novel approach designed to enhance the Retrieval-Augmented Generation (RAG) model through Reward-Driven Supervision.
Unlike previous RAG methodologies, our method adapts retrieval information to specific domains by employing CriticGPT to train a dedicated reward model.
This reward model generates synthesized datasets for fine-tuning the RAG, aligning its outputs more closely with human preferences.
arXiv Detail & Related papers (2024-10-03T15:26:50Z) - SFR-RAG: Towards Contextually Faithful LLMs [57.666165819196486]
Retrieval Augmented Generation (RAG) is a paradigm that integrates external contextual information with large language models (LLMs) to enhance factual accuracy and relevance.
We introduce SFR-RAG, a small LLM that is instruction-textual with an emphasis on context-grounded generation and hallucination.
We also present ConBench, a new evaluation framework compiling multiple popular and diverse RAG benchmarks.
arXiv Detail & Related papers (2024-09-16T01:08:18Z) - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z) - HAF-RM: A Hybrid Alignment Framework for Reward Model Training [51.59246299566669]
We propose a hybrid alignment framework HaF-RM for reward model training.
It offers a principled and effective approach to enhancing the performance and alignment of reward models.
arXiv Detail & Related papers (2024-07-04T23:26:56Z) - Retrieval-Augmented Generation for AI-Generated Content: A Survey [38.50754568320154]
Retrieval-Augmented Generation (RAG) has emerged as a paradigm to address such challenges.
RAG introduces the information retrieval process, which enhances the generation process by retrieving relevant objects from available data stores.
In this paper, we comprehensively review existing efforts that integrate RAG technique into AIGC scenarios.
arXiv Detail & Related papers (2024-02-29T18:59:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.