Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2409.11598v4
- Date: Fri, 04 Jul 2025 20:56:35 GMT
- Title: Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
- Authors: To Eun Kim, Fernando Diaz
- Abstract summary: This paper presents the first systematic evaluation of RAG systems that integrate fairness-aware rankings. We show that incorporating fairness-aware retrieval often maintains or even enhances both ranking quality and generation quality.
- Score: 53.285436927963865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the central role of retrieval in retrieval-augmented generation (RAG) systems, much of the existing research on RAG overlooks the well-established field of fair ranking and fails to account for the interests of all stakeholders involved. In this paper, we conduct the first systematic evaluation of RAG systems that integrate fairness-aware rankings, addressing both ranking fairness and attribution fairness, which ensures equitable exposure of the sources cited in the generated content. Our evaluation focuses on measuring item-side fairness, specifically the fair exposure of relevant items retrieved by RAG systems, and investigates how this fairness impacts both the effectiveness of the systems and the attribution of sources in the generated output that users ultimately see. By experimenting with twelve RAG models across seven distinct tasks, we show that incorporating fairness-aware retrieval often maintains or even enhances both ranking quality and generation quality, countering the common belief that fairness compromises system performance. Additionally, we demonstrate that fair retrieval practices lead to more balanced attribution in the final responses, ensuring that the generator fairly cites the sources it relies on. Our findings underscore the importance of item-side fairness in retrieval and generation, laying the foundation for responsible and equitable RAG systems and guiding future research in fair ranking and attribution.
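To make the abstract's notion of item-side fairness concrete, below is a minimal sketch (not the paper's actual protocol) contrasting the exposure a deterministic ranker gives each document with the expected exposure under a randomized ranking policy. It assumes an RBP-style browsing model with patience parameter gamma and a Plackett-Luce policy with sharpness beta; all function names, parameter values, and the toy relevance scores are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def exposure_of_ranking(ranking, gamma=0.5):
    """Per-document exposure for one ranking: the document at
    position i receives gamma**i (RBP-style browsing model)."""
    exp = np.zeros(len(ranking))
    exp[np.asarray(ranking)] = gamma ** np.arange(len(ranking))
    return exp

def sample_ranking(scores, beta=10.0, rng=None):
    """Sample one ranking from a Plackett-Luce policy via the
    Gumbel trick. Lower beta -> more randomization -> exposure
    is spread more evenly across near-tied documents."""
    rng = rng or np.random.default_rng()
    noise = rng.gumbel(size=len(scores))
    return np.argsort(-(beta * np.asarray(scores) + noise))

def expected_exposure(scores, beta, n_samples=20_000, gamma=0.5, seed=0):
    """Monte-Carlo estimate of each document's expected exposure
    under the stochastic ranking policy."""
    rng = np.random.default_rng(seed)
    total = np.zeros(len(scores))
    for _ in range(n_samples):
        total += exposure_of_ranking(sample_ranking(scores, beta, rng), gamma)
    return total / n_samples

# Toy example: three near-equally relevant documents and one irrelevant one.
scores = [0.90, 0.89, 0.88, 0.20]
deterministic = exposure_of_ranking(np.argsort(-np.asarray(scores)))
stochastic = expected_exposure(scores, beta=10.0)
print("deterministic:", deterministic.round(3))  # top doc monopolizes exposure
print("stochastic:   ", stochastic.round(3))     # near-ties share exposure
```

Under the deterministic ranking the top document monopolizes exposure even though three documents are nearly equally relevant; the stochastic policy spreads exposure roughly evenly across the near-ties while still keeping the irrelevant document at the bottom. This is the kind of item-side fair exposure the paper evaluates in the RAG setting.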
Related papers
- A Reproducibility Study of Product-side Fairness in Bundle Recommendation [50.09508982179837]
We study product-side fairness in bundle recommendation (BR) across three real-world datasets.
Our results show that exposure patterns differ notably between bundles and items, revealing the need for fairness interventions.
We also find that fairness assessments vary considerably depending on the metric used, reinforcing the need for multi-faceted evaluation.
arXiv Detail & Related papers (2025-07-18T20:06:39Z) - Fairness under Competition [10.003345361182628]
We consider the effects of adopting fair classifiers on the overall level of ecosystem fairness.
We show that even if competing classifiers are individually fair, the ecosystem's outcome may be unfair.
arXiv Detail & Related papers (2025-05-22T06:43:15Z) - The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation [73.16564415490113]
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by retrieving relevant documents from external knowledge sources.
We propose two approaches, FairFT and FairFilter, to mitigate the fairness issues introduced by RAG for small-scale LLMs.
arXiv Detail & Related papers (2025-04-11T10:17:10Z) - FAIR-QR: Enhancing Fairness-aware Information Retrieval through Query Refinement [1.8577028544235155]
We propose a novel framework that refines query keywords to retrieve documents from underrepresented groups and achieve group fairness.
Our method not only shows promising retrieval results regarding relevance and fairness but also interpretability by showing refined keywords used at each iteration.
arXiv Detail & Related papers (2025-03-27T02:10:19Z) - FairDgcl: Fairness-aware Recommendation with Dynamic Graph Contrastive Learning [48.38344934125999]
We study how to implement high-quality data augmentation to improve recommendation fairness.
Specifically, we propose FairDgcl, a dynamic graph adversarial contrastive learning framework.
We show that FairDgcl can simultaneously generate enhanced representations that possess both fairness and accuracy.
arXiv Detail & Related papers (2024-10-23T04:43:03Z) - Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
arXiv Detail & Related papers (2024-10-20T22:59:34Z) - Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems [18.926129063000264]
RAG (Retrieval-Augmented Generation) systems have recently gained significant attention for their enhanced ability to integrate external knowledge sources.
We propose a fairness evaluation framework tailored to RAG methods, using scenario-based questions and analyzing disparities across demographic attributes.
arXiv Detail & Related papers (2024-09-29T22:04:26Z) - Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z) - RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation [61.14660526363607]
We propose a fine-grained evaluation framework, RAGChecker, that incorporates a suite of diagnostic metrics for both the retrieval and generation modules.
RAGChecker has significantly better correlations with human judgments than other evaluation metrics.
The metrics of RAGChecker can guide researchers and practitioners in developing more effective RAG systems.
arXiv Detail & Related papers (2024-08-15T10:20:54Z) - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z) - Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z) - RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems [0.0]
Retrieval-Augmented Generation (RAG) has become a standard architectural pattern for incorporating domain-specific knowledge into user-facing chat applications.
We introduce RAGBench: the first comprehensive, large-scale RAG benchmark dataset of 100k examples.
We formalize the TRACe evaluation framework: a set of explainable and actionable RAG evaluation metrics applicable across all RAG domains.
arXiv Detail & Related papers (2024-06-25T20:23:15Z) - Evaluation of Retrieval-Augmented Generation: A Survey [13.633909177683462]
We provide a comprehensive overview of the evaluation and benchmarks of Retrieval-Augmented Generation (RAG) systems.
Specifically, we examine and compare several quantifiable metrics of the Retrieval and Generation components, such as relevance, accuracy, and faithfulness.
We then analyze the various datasets and metrics, discuss the limitations of current benchmarks, and suggest potential directions to advance the field of RAG benchmarks.
arXiv Detail & Related papers (2024-05-13T02:33:25Z) - Fairness in Reinforcement Learning: A Survey [0.0]
We survey the literature to provide the most up-to-date snapshot of the frontiers of fairness in reinforcement learning.
We highlight the methodologies researchers used to implement fairness in single- and multi-agent RL systems.
We critically examine gaps in the literature, such as understanding fairness in the context of RLHF.
arXiv Detail & Related papers (2024-05-11T04:36:46Z) - Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models [21.115495457454365]
uRAG is a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems.
We build a large-scale experimentation ecosystem consisting of 18 RAG systems that engage in training and 18 unknown RAG systems that use uRAG as new users of the search engine.
arXiv Detail & Related papers (2024-04-30T19:51:37Z) - ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems [46.522527144802076]
We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems.
ARES finetunes lightweight LM judges to assess the quality of individual RAG components.
We make our code and datasets publicly available on Github.
arXiv Detail & Related papers (2023-11-16T00:39:39Z) - Fairness Score and Process Standardization: Framework for Fairness Certification in Artificial Intelligence Systems [0.4297070083645048]
We propose a novel Fairness Score to measure the fairness of a data-driven AI system.
It will also provide a framework to operationalise the concept of fairness and facilitate the commercial deployment of such systems.
arXiv Detail & Related papers (2022-01-10T15:45:12Z) - Incentives for Item Duplication under Fair Ranking Policies [69.14168955766847]
We study the behaviour of different fair ranking policies in the presence of duplicates.
We find that fairness-aware ranking policies may conflict with diversity, due to their potential to incentivize duplication more than policies solely focused on relevance.
arXiv Detail & Related papers (2021-10-29T11:11:15Z) - Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers [9.811131801693856]
We provide a novel framework to measure the fairness in the retrieved text contents of ranking models.
We propose an adversarial bias mitigation approach applied to state-of-the-art BERT rankers.
Our results on the MS MARCO benchmark show that, while the fairness of all ranking models is lower than that of ranker-agnostic baselines, the fairness in retrieved contents significantly improves when applying the proposed adversarial training.
arXiv Detail & Related papers (2021-04-28T08:53:54Z) - "And the Winner Is...": Dynamic Lotteries for Multi-group Fairness-Aware Recommendation [37.35485045640196]
We argue that the previous literature has been based on simple, uniform, and often uni-dimensional notions of fairness.
We explicitly represent the design decisions that enter into the trade-off between accuracy and fairness across multiply-defined and intersecting protected groups.
We formulate lottery-based mechanisms for choosing between fairness concerns, and demonstrate their performance in two recommendation domains.
arXiv Detail & Related papers (2020-09-05T20:15:14Z) - Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)