Bias Mitigation for AI-Feedback Loops in Recommender Systems: A Systematic Literature Review and Taxonomy
- URL: http://arxiv.org/abs/2509.00109v1
- Date: Thu, 28 Aug 2025 09:12:38 GMT
- Title: Bias Mitigation for AI-Feedback Loops in Recommender Systems: A Systematic Literature Review and Taxonomy
- Authors: Theodor Stoecker, Samed Bayer, Ingo Weber,
- Abstract summary: We present a systematic literature review of bias mitigation methods.<n>They explicitly consider AI feedback loops and are validated in multi-round simulations or live A/B tests.<n>Each study is coded on six dimensions: mitigation technique, biases addressed, dynamic testing set-up, evaluation focus, application domain, and ML task, organising them into a reusable taxonomy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recommender systems continually retrain on user reactions to their own predictions, creating AI feedback loops that amplify biases and diminish fairness over time. Despite this well-known risk, most bias mitigation techniques are tested only on static splits, so their long-term fairness across multiple retraining rounds remains unclear. We therefore present a systematic literature review of bias mitigation methods that explicitly consider AI feedback loops and are validated in multi-round simulations or live A/B tests. Screening 347 papers yields 24 primary studies published between 2019-2025. Each study is coded on six dimensions: mitigation technique, biases addressed, dynamic testing set-up, evaluation focus, application domain, and ML task, organising them into a reusable taxonomy. The taxonomy offers industry practitioners a quick checklist for selecting robust methods and gives researchers a clear roadmap to the field's most urgent gaps. Examples include the shortage of shared simulators, varying evaluation metrics, and the fact that most studies report either fairness or performance; only six use both.
Related papers
- BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses [32.58830706120845]
Existing studies on bias mitigation methods for large language models (LLMs) use diverse baselines and metrics to evaluate debiasing performance.<n>We introduce BiasFreeBench, an empirical benchmark that comprehensively compares eight mainstream bias mitigation techniques.<n>We will publicly release our benchmark, aiming to establish a unified testbed for bias mitigation research.
arXiv Detail & Related papers (2025-09-30T19:56:54Z) - Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplifications and Resistance in Multi-Agent Based LLM-as-Judge [70.89799989428367]
We conduct a systematic analysis of four diverse bias types: position bias, verbosity bias, chain-of-thought bias, and bandwagon bias.<n>We evaluate these biases across two widely adopted multi-agent LLM-as-Judge frameworks: Multi-Agent-Debate and LLM-as-Meta-Judge.
arXiv Detail & Related papers (2025-05-26T03:56:41Z) - Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems [74.47680026838128]
Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias.
We consider multifactorial selection bias affected by both item and rating value factors.
We propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization.
arXiv Detail & Related papers (2024-04-29T12:18:21Z) - Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z) - TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time (Extended Version) [17.397081564454293]
Malware detectors often experience performance decay due to constantly evolving operating systems and attack methods.<n>This paper argues that commonly reported results are inflated due to two pervasive sources of experimental bias in the detection task.
arXiv Detail & Related papers (2024-02-02T12:27:32Z) - On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z) - Unbiased Learning to Rank with Biased Continuous Feedback [5.561943356123711]
Unbiased learning-to-rank(LTR) algorithms are verified to model the relative relevance accurately based on noisy feedback.
To provide personalized high-quality recommendation results, recommender systems need model both categorical and continuous biased feedback.
We introduce the pairwise trust bias to separate the position bias, trust bias, and user relevance explicitly.
Experiment results on public benchmark datasets and internal live traffic of a large-scale recommender system at Tencent News show superior results for continuous labels.
arXiv Detail & Related papers (2023-03-08T02:14:08Z) - Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves underrepresented groups' accuracy of sampling methods by 3% over four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z) - Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey [30.637712832450525]
We collect a total of 341 publications concerning bias mitigation for ML classifiers.
We investigate how existing bias mitigation methods are evaluated in the literature.
Based on the gathered insights, we hope to support practitioners in making informed choices when developing and evaluating new bias mitigation methods.
arXiv Detail & Related papers (2022-07-14T17:16:45Z) - An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF)
Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness.
We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z) - Are Bias Mitigation Techniques for Deep Learning Effective? [24.84797949716142]
We introduce an improved evaluation protocol, sensible metrics, and a new dataset.
We evaluate seven state-of-the-art algorithms using the same network architecture.
We find that algorithms exploit hidden biases, are unable to scale to multiple forms of bias, and are highly sensitive to the choice of tuning set.
arXiv Detail & Related papers (2021-04-01T00:14:45Z) - Contrastive Learning for Debiased Candidate Generation in Large-Scale
Recommender Systems [84.3996727203154]
We show that a popular choice of contrastive loss is equivalent to reducing the exposure bias via inverse propensity weighting.
We further improve upon CLRec and propose Multi-CLRec, for accurate multi-intention aware bias reduction.
Our methods have been successfully deployed in Taobao, where at least four-month online A/B tests and offline analyses demonstrate its substantial improvements.
arXiv Detail & Related papers (2020-05-20T08:15:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.