Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2409.11598v3
- Date: Tue, 25 Feb 2025 04:13:34 GMT
- Title: Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
- Authors: To Eun Kim, Fernando Diaz
- Abstract summary: We present the first comprehensive study of RAG systems that incorporate fairness-aware rankings. We find that fairness-aware retrieval frequently retains or even improves ranking effectiveness and generation quality. Our results underscore the importance of item-side fairness throughout both retrieval and generation phases.
- Score: 53.285436927963865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern language models frequently include retrieval components to improve their outputs, giving rise to a growing number of retrieval-augmented generation (RAG) systems. Yet, most existing work in RAG has underemphasized fair ranking techniques and neglected the diverse interests of all stakeholders. In this paper, we present the first comprehensive study of RAG systems that incorporate fairness-aware rankings, focusing on both ranking fairness and attribution fairness - ensuring equitable exposure of sources cited in the final text. We specifically examine item-side fairness, i.e., whether retrieved documents receive balanced exposure, and assess how this affects both the system's overall performance and the eventual distribution of cited sources. Across twelve RAG models and seven tasks, we find that fairness-aware retrieval frequently retains or even improves ranking effectiveness and generation quality, countering the widespread belief that fairness compromises system performance. Moreover, we show that fair retrieval leads to more balanced attribution in the final responses, ensuring that the cited sources are credited more equitably. Our results underscore the importance of item-side fairness throughout both retrieval and generation phases, offering key insights for building more responsible and equitable RAG systems and illustrating promising avenues for future exploration in fair ranking and source attribution.
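To make the abstract's notion of "equitable exposure" concrete: fair-ranking work in this vein typically replaces a single deterministic ranking with a distribution over rankings and compares each document's expected exposure to a target. The sketch below only illustrates that idea and is not the paper's implementation; the Plackett-Luce sampler, the temperature knob, and the geometric position discount are all assumptions.
```python
import math
import random
from collections import defaultdict

def sample_plackett_luce(scores, temperature=1.0):
    """Sample one ranking from a Plackett-Luce model over relevance scores.
    Higher temperature means more randomization, hence more evenly
    spread exposure across near-equally relevant documents."""
    remaining = dict(scores)
    ranking = []
    while remaining:
        docs = list(remaining)
        weights = [math.exp(remaining[d] / temperature) for d in docs]
        pick = random.choices(docs, weights=weights, k=1)[0]
        ranking.append(pick)
        del remaining[pick]
    return ranking

def expected_exposure(scores, n_samples=2000, gamma=0.5, temperature=1.0):
    """Monte-Carlo estimate of each document's expected exposure,
    using a geometric position discount gamma**rank."""
    exposure = defaultdict(float)
    for _ in range(n_samples):
        for rank, doc in enumerate(sample_plackett_luce(scores, temperature)):
            exposure[doc] += gamma ** rank / n_samples
    return dict(exposure)

scores = {"d1": 2.0, "d2": 1.9, "d3": 0.5}  # toy relevance scores
for t in (0.1, 1.0, 5.0):
    print(t, expected_exposure(scores, temperature=t))
# Low temperature concentrates exposure on d1; higher temperature lets the
# near-tied d2 (and eventually d3) share it.
```
Served over many queries, such a stochastic retriever is what lets a RAG system balance item-side exposure without changing any single response.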
Related papers
- The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation [73.16564415490113]
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by retrieving relevant documents from external knowledge sources.
We propose two approaches, FairFT and FairFilter, to mitigate the fairness issues introduced by RAG for small-scale LLMs.
arXiv Detail & Related papers (2025-04-11T10:17:10Z)
- FAIR-QR: Enhancing Fairness-aware Information Retrieval through Query Refinement [1.8577028544235155]
We propose a novel framework that refines query keywords to retrieve documents from underrepresented groups and achieve group fairness.
Our method not only shows promising retrieval results in terms of relevance and fairness but also offers interpretability by exposing the refined keywords used at each iteration.
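The abstract suggests an iterative refine-and-retrieve loop; below is a minimal sketch of one plausible reading. The helpers retrieve, group_of, and suggest_keyword are hypothetical placeholders, not components of FAIR-QR itself.
```python
def fair_query_refinement(query, retrieve, group_of, suggest_keyword,
                          target_share, k=20, max_iters=5):
    """Iteratively append keywords so underrepresented groups approach
    their target share of the top-k results.

    retrieve(query, k)    -> list of doc ids (hypothetical search API)
    group_of(doc)         -> group label of a document
    suggest_keyword(q, g) -> keyword expected to pull in docs from group g
    target_share          -> dict mapping group -> desired fraction of top-k
    """
    refined, history = query, [query]
    for _ in range(max_iters):
        docs = retrieve(refined, k)
        share = {g: sum(group_of(d) == g for d in docs) / k
                 for g in target_share}
        # Find the group furthest below its target share.
        gap, worst = max((target_share[g] - share[g], g) for g in target_share)
        if gap <= 0:
            break  # every group meets its target
        refined = f"{refined} {suggest_keyword(refined, worst)}"
        history.append(refined)  # the refined keywords double as explanations
    return refined, history
```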
arXiv Detail & Related papers (2025-03-27T02:10:19Z)
- Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
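A coverage score of this kind reduces to a labeled fraction per sub-question type. A minimal sketch, assuming a placeholder judge (in practice an entailment model or LLM judge):
```python
from typing import Callable, Dict, List

def subquestion_coverage(
    response: str,
    subqs: Dict[str, List[str]],          # e.g. keys "core", "background", "follow_up"
    answers: Callable[[str, str], bool],  # placeholder: does response answer subq?
) -> Dict[str, float]:
    """Fraction of sub-questions of each type the response addresses."""
    return {
        kind: (sum(answers(response, q) for q in qs) / len(qs)) if qs else 0.0
        for kind, qs in subqs.items()
    }

# Toy usage with a deliberately naive keyword judge:
naive = lambda resp, q: any(w.lower() in resp.lower() for w in q.split())
print(subquestion_coverage(
    "RAG retrieves documents before generating.",
    {"core": ["What does RAG retrieve?"], "background": ["Who proposed RAG?"]},
    naive))
```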
arXiv Detail & Related papers (2024-10-20T22:59:34Z)
- Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems [18.926129063000264]
Retrieval-Augmented Generation (RAG) systems have recently gained significant attention for their enhanced ability to integrate external knowledge sources.
We propose a fairness evaluation framework tailored to RAG methods, using scenario-based questions and analyzing disparities across demographic attributes.
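Scenario-based disparity analysis of this sort usually scores the same question across demographic variants and reports the gap. A minimal sketch with a placeholder scoring function:
```python
from collections import defaultdict
from statistics import mean

def demographic_disparity(records, metric):
    """Max gap in mean quality across demographic groups.

    records: iterable of dicts like {"group": "A", "question": q, "answer": a}
    metric:  placeholder scoring function for one record (e.g. an LLM judge)
    """
    by_group = defaultdict(list)
    for r in records:
        by_group[r["group"]].append(metric(r))
    means = {g: mean(v) for g, v in by_group.items()}
    return means, max(means.values()) - min(means.values())
```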
arXiv Detail & Related papers (2024-09-29T22:04:26Z)
- Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
- RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation [61.14660526363607]
We propose a fine-grained evaluation framework, RAGChecker, that incorporates a suite of diagnostic metrics for both the retrieval and generation modules.
RAGChecker has significantly better correlations with human judgments than other evaluation metrics.
The metrics of RAGChecker can guide researchers and practitioners in developing more effective RAG systems.
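Fine-grained RAG diagnosis of this kind typically operates at the claim level: extract claims, then check entailment in both directions. The sketch below shows that general pattern; extract_claims and entails are placeholder models, not RAGChecker's actual components.
```python
def claim_f1(response, reference, extract_claims, entails):
    """Claim-level precision/recall/F1.

    precision: response claims supported by the reference
    recall:    reference claims supported by the response
    extract_claims(text)    -> list of claim strings (placeholder)
    entails(premise, claim) -> bool (placeholder entailment judge)
    """
    resp_claims = extract_claims(response)
    ref_claims = extract_claims(reference)
    precision = (sum(entails(reference, c) for c in resp_claims)
                 / max(len(resp_claims), 1))
    recall = (sum(entails(response, c) for c in ref_claims)
              / max(len(ref_claims), 1))
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"precision": precision, "recall": recall, "f1": f1}
```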
arXiv Detail & Related papers (2024-08-15T10:20:54Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
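One plausible reading of the three metric names, assuming answers are judged against ground-truth keypoints; the covers and contradicts judges are placeholders, and RAGEval's exact definitions may differ.
```python
def rageval_style_metrics(answer, keypoints, covers, contradicts):
    """Keypoint-based reading of Completeness / Hallucination / Irrelevance.

    completeness:  share of ground-truth keypoints the answer covers
    hallucination: share of keypoints the answer contradicts
    irrelevance:   share it neither covers nor contradicts
    """
    n = max(len(keypoints), 1)
    covered = sum(covers(answer, kp) for kp in keypoints)
    contradicted = sum(contradicts(answer, kp) for kp in keypoints)
    return {
        "completeness": covered / n,
        "hallucination": contradicted / n,
        "irrelevance": (len(keypoints) - covered - contradicted) / n,
    }
```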
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, on which we demonstrate the racial bias of publicly available state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
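The method name suggests a post-hoc step that masks units whose activations track demographic group. The sketch below is a guess at that general idea, not BPFA's actual algorithm; the gap statistic and prune fraction are assumptions.
```python
import numpy as np

def prune_biased_units(acts_by_group, prune_frac=0.1):
    """Zero out hidden units whose mean activation differs most across
    demographic groups, as an inference-time mask (no retraining).

    acts_by_group: dict mapping group -> array of shape [n_samples, n_units]
    """
    means = np.stack([a.mean(axis=0) for a in acts_by_group.values()])
    gap = means.max(axis=0) - means.min(axis=0)  # per-unit group gap
    mask = np.ones_like(gap)
    k = int(len(gap) * prune_frac)
    if k:
        mask[np.argsort(gap)[-k:]] = 0.0  # prune the k most biased units
    return mask
```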
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems [0.0]
Retrieval-Augmented Generation (RAG) has become a standard architectural pattern for incorporating domain-specific knowledge into user-facing chat applications.
We introduce RAGBench: the first comprehensive, large-scale RAG benchmark dataset of 100k examples.
We formalize the TRACe evaluation framework: a set of explainable and actionable RAG evaluation metrics applicable across all RAG domains.
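As a rough set-based reading of the four TRACe metric names; RAGBench defines them precisely over annotated spans, so treat every formula here as an assumption.
```python
def trace_style_metrics(context_sents, relevant, used, answer_claims, supported):
    """All arguments are sets.

    context_sents: retrieved context sentences
    relevant:      those relevant to the question
    used:          those the answer actually draws on
    answer_claims: claims made in the answer
    supported:     answer claims grounded in the context
    """
    ratio = lambda a, b: len(a) / max(len(b), 1)
    return {
        "context_relevance":   ratio(relevant, context_sents),
        "context_utilization": ratio(used, context_sents),
        "completeness":        ratio(used & relevant, relevant),
        "adherence":           ratio(supported, answer_claims),
    }
```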
arXiv Detail & Related papers (2024-06-25T20:23:15Z)
- Evaluation of Retrieval-Augmented Generation: A Survey [13.633909177683462]
We provide a comprehensive overview of the evaluation and benchmarks of Retrieval-Augmented Generation (RAG) systems.
Specifically, we examine and compare several quantifiable metrics of the Retrieval and Generation components, such as relevance, accuracy, and faithfulness.
We then analyze the various datasets and metrics, discuss the limitations of current benchmarks, and suggest potential directions to advance the field of RAG benchmarks.
arXiv Detail & Related papers (2024-05-13T02:33:25Z)
- Fairness in Reinforcement Learning: A Survey [0.0]
We survey the literature to provide the most up-to-date snapshot of the frontiers of fairness in reinforcement learning.
We highlight the methodologies researchers used to implement fairness in single- and multi-agent RL systems.
We critically examine gaps in the literature, such as understanding fairness in the context of RLHF.
arXiv Detail & Related papers (2024-05-11T04:36:46Z)
- Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models [21.115495457454365]
uRAG is a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems.
We build a large-scale experimentation ecosystem consisting of 18 RAG systems that engage in training and 18 unknown RAG systems that use uRAG as new users of the search engine.
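The core design is one shared search engine behind many downstream RAG systems. A minimal sketch; the class name, API, and feedback scheme are assumptions, not uRAG's actual interface.
```python
from collections import defaultdict

class UnifiedRetrievalEngine:
    """One retriever serving many RAG systems, pooling their feedback."""

    def __init__(self, retriever):
        self.retriever = retriever         # e.g. BM25 or a dense index
        self.feedback = defaultdict(list)  # per-system training signal

    def search(self, system_id, query, k=10):
        # system_id is kept so the engine could later personalize or
        # log per-system behaviour.
        return self.retriever(query, k)

    def report(self, system_id, query, doc_id, useful):
        # Downstream systems report whether a document helped generation;
        # a unified ranker could be trained on this pooled signal.
        self.feedback[system_id].append((query, doc_id, useful))
```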
arXiv Detail & Related papers (2024-04-30T19:51:37Z)
- ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems [46.522527144802076]
We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems.
ARES finetunes lightweight LM judges to assess the quality of individual RAG components.
We make our code and datasets publicly available on Github.
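ARES-style evaluation pairs cheap LM-judge scores on many examples with a small human-labeled set via prediction-powered inference (PPI). A minimal point-estimate sketch; the full framework also constructs confidence intervals.
```python
from statistics import mean

def ppi_estimate(judge_on_unlabeled, judge_on_labeled, human_labels):
    """Prediction-powered estimate of a quality rate: the judge's mean on
    the large unlabeled set, corrected by its error on the labeled set."""
    rectifier = mean(h - j for h, j in zip(human_labels, judge_on_labeled))
    return mean(judge_on_unlabeled) + rectifier

# Toy usage: the judge over-scores slightly, and PPI corrects the mean.
print(ppi_estimate([1, 1, 0, 1, 1, 0, 1, 1],  # judge on unlabeled data
                   [1, 1, 1, 0],               # judge on labeled data
                   [1, 0, 1, 0]))              # human labels -> 0.5
```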
arXiv Detail & Related papers (2023-11-16T00:39:39Z)
- Incentives for Item Duplication under Fair Ranking Policies [69.14168955766847]
We study the behaviour of different fair ranking policies in the presence of duplicates.
We find that fairness-aware ranking policies may conflict with diversity, due to their potential to incentivize duplication more than policies solely focused on relevance.
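A tiny simulation makes the incentive visible, using an assumed rotation-style fair policy and a geometric position discount (neither is taken from the paper):
```python
def exposure_per_provider(items, policy, gamma=0.5):
    """items: list of (provider, relevance). Returns provider -> exposure."""
    ranked = sorted(items, key=lambda x: -x[1])
    if policy == "relevance":   # one fixed ranking by relevance
        rankings = [ranked]
    else:                       # "fair": rotate which item leads
        rankings = [ranked[i:] + ranked[:i] for i in range(len(ranked))]
    exp = {}
    for ranking in rankings:
        for rank, (prov, _) in enumerate(ranking):
            exp[prov] = exp.get(prov, 0.0) + gamma ** rank / len(rankings)
    return exp

catalog = [("A", 0.9), ("B", 0.8), ("C", 0.7)]
dup = catalog + [("B", 0.8)]    # provider B duplicates its item
for pol in ("relevance", "fair"):
    print(pol, exposure_per_provider(catalog, pol), exposure_per_provider(dup, pol))
# Under the rotating "fair" policy, B's share of total exposure jumps from
# 1/3 to 1/2 after duplicating; under relevance-only ranking it rises less
# (about 0.29 to 0.40), illustrating the stronger duplication incentive.
```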
arXiv Detail & Related papers (2021-10-29T11:11:15Z)
- Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers [9.811131801693856]
We provide a novel framework to measure the fairness in the retrieved text contents of ranking models.
We propose an adversarial bias mitigation approach applied to state-of-the-art BERT rankers.
Our results on the MS MARCO benchmark show that, while the fairness of all ranking models is lower than that of ranker-agnostic baselines, the fairness of retrieved contents improves significantly when the proposed adversarial training is applied.
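Adversarial mitigation of this kind is commonly implemented with a gradient reversal layer between the encoder and a bias discriminator. A minimal PyTorch sketch of that standard trick; the loss choices and head shapes are assumptions, not the paper's exact setup.
```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the
    backward pass, so the encoder learns to remove what the bias
    discriminator can predict."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def adversarial_loss(doc_repr, rel_head, bias_head, rel_target, bias_target,
                     lambd=1.0):
    # The relevance head trains the encoder as usual; the bias head sees
    # gradient-reversed features, pushing the encoder toward representations
    # from which the protected attribute is not recoverable.
    rel_loss = nn.functional.mse_loss(rel_head(doc_repr), rel_target)
    bias_logits = bias_head(GradReverse.apply(doc_repr, lambd))
    bias_loss = nn.functional.cross_entropy(bias_logits, bias_target)
    return rel_loss + bias_loss
```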
arXiv Detail & Related papers (2021-04-28T08:53:54Z)
- "And the Winner Is...": Dynamic Lotteries for Multi-group Fairness-Aware Recommendation [37.35485045640196]
We argue that the previous literature has been based on simple, uniform, and often uni-dimensional notions of fairness.
We explicitly represent the design decisions that enter into the trade-off between accuracy and fairness across multiply-defined and intersecting protected groups.
We formulate lottery-based mechanisms for choosing between fairness concerns, and demonstrate their performance in two recommendation domains.
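Such a lottery can be sketched as a weighted draw over fairness concerns, with weights tracking each concern's accumulated deficit; the deficit bookkeeping below is an assumption, not the paper's mechanism.
```python
import random

def fairness_lottery(concerns, deficits):
    """Pick which fairness concern the next recommendation slate should
    prioritize, with probability proportional to accumulated deficit
    (how far each concern currently is from its fairness target)."""
    weights = [max(deficits[c], 0.0) for c in concerns]
    if sum(weights) == 0:
        return random.choice(concerns)  # no deficits: uniform lottery
    return random.choices(concerns, weights=weights, k=1)[0]

# Toy usage: "provider_B" is furthest behind, so it usually wins the draw.
print(fairness_lottery(["provider_A", "provider_B", "consumer"],
                       {"provider_A": 0.1, "provider_B": 0.7, "consumer": 0.2}))
```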
arXiv Detail & Related papers (2020-09-05T20:15:14Z)
- Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.