From Insight to Intervention: Interpretable Neuron Steering for Controlling Popularity Bias in Recommender Systems
- URL: http://arxiv.org/abs/2601.15122v2
- Date: Wed, 28 Jan 2026 13:47:43 GMT
- Title: From Insight to Intervention: Interpretable Neuron Steering for Controlling Popularity Bias in Recommender Systems
- Authors: Parviz Ahmadov, Masoud Mansoury
- Abstract summary: Popularity bias is a pervasive challenge in recommender systems, where a few popular items dominate attention while the majority of less popular items remain underexposed. In this paper, we propose a post-hoc approach, PopSteer, that leverages a Sparse Autoencoder to both interpret and mitigate popularity bias in recommendation models. Experiments on three public datasets with a sequential recommendation model demonstrate that PopSteer significantly enhances fairness with minimal impact on accuracy, while providing interpretable insights and fine-grained control over the fairness-accuracy trade-off.
- Score: 1.8692254863855962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Popularity bias is a pervasive challenge in recommender systems, where a few popular items dominate attention while the majority of less popular items remain underexposed. This imbalance can reduce recommendation quality and lead to unfair item exposure. Although existing mitigation methods address this issue to some extent, they often lack transparency in how they operate. In this paper, we propose a post-hoc approach, PopSteer, that leverages a Sparse Autoencoder (SAE) to both interpret and mitigate popularity bias in recommendation models. The SAE is trained to replicate a trained model's behavior while enabling neuron-level interpretability. By introducing synthetic users with strong preferences for either popular or unpopular items, we identify neurons encoding popularity signals through their activation patterns. We then steer recommendations by adjusting the activations of the most biased neurons. Experiments on three public datasets with a sequential recommendation model demonstrate that PopSteer significantly enhances fairness with minimal impact on accuracy, while providing interpretable insights and fine-grained control over the fairness-accuracy trade-off.
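The identify-then-steer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring statistic or the steering rule, so the standardized mean difference used for ranking neurons, the additive shift, the clamp at zero (which assumes non-negative, ReLU-style SAE activations), and the names `popularity_scores`, `steer`, `top_k`, and `alpha` are all assumptions made for illustration. The activation matrices stand in for SAE activations recorded for two groups of synthetic users.

```python
from statistics import mean, pstdev


def popularity_scores(pop_acts, unpop_acts, eps=1e-8):
    """Score each SAE neuron by how differently it fires for the two groups.

    pop_acts / unpop_acts: lists of activation vectors, one per synthetic user
    who prefers popular / unpopular items. The score is a standardized mean
    difference (an assumed statistic, not taken from the paper): positive
    means the neuron fires more for popularity-seeking users.
    """
    n_neurons = len(pop_acts[0])
    scores = []
    for j in range(n_neurons):
        pop_col = [row[j] for row in pop_acts]
        unpop_col = [row[j] for row in unpop_acts]
        pooled = ((pstdev(pop_col) ** 2 + pstdev(unpop_col) ** 2) / 2) ** 0.5
        scores.append((mean(pop_col) - mean(unpop_col)) / (pooled + eps))
    return scores


def steer(activations, scores, top_k=1, alpha=1.0):
    """Shift the top_k most popularity-skewed neurons against their bias.

    Neurons with a positive score are pushed down by alpha, neurons with a
    negative score are pushed up by alpha. Results are clamped at zero,
    assuming ReLU-style activations (hypothetical design choice).
    """
    ranked = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
    steered = list(activations)
    for j in ranked[:top_k]:
        direction = 1.0 if scores[j] > 0 else -1.0
        steered[j] = max(0.0, steered[j] - alpha * direction)
    return steered


if __name__ == "__main__":
    # Toy activations for two synthetic users per group and three neurons.
    pop_users = [[2.0, 0.1, 0.5], [1.8, 0.0, 0.4]]
    unpop_users = [[0.2, 0.1, 1.5], [0.0, 0.2, 1.7]]
    scores = popularity_scores(pop_users, unpop_users)
    # Steer one real user's activations before they are decoded again.
    print(steer([1.0, 0.3, 0.2], scores, top_k=2, alpha=0.5))
```

In this sketch `alpha` plays the role of the fairness-accuracy control knob mentioned in the abstract: larger values move the selected neurons further from their popularity-aligned direction, while `top_k` limits how many neurons are touched.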
Related papers
- The Unfairness of Multifactorial Bias in Recommendation [68.35079031029616]
Popularity bias and positivity bias are prominent sources of bias in recommender systems. In this work, we examine how multifactorial bias influences item-side fairness. We adapt a percentile-based rating transformation as a pre-processing strategy to mitigate multifactorial bias.
arXiv Detail & Related papers (2026-01-19T08:37:43Z)
- Opening the Black Box: Interpretable Remedies for Popularity Bias in Recommender Systems [1.8692254863855962]
Popularity bias is a well-known challenge in recommender systems, where a small number of popular items receive disproportionate attention. This imbalance often results in reduced recommendation quality and unfair exposure of items. We propose a post-hoc method using a Sparse Autoencoder to interpret and mitigate popularity bias in deep recommendation models.
arXiv Detail & Related papers (2025-08-24T10:59:56Z)
- PBiLoss: Popularity-Aware Regularization to Improve Fairness in Graph-Based Recommender Systems [1.0128808054306186]
We propose PBiLoss, a regularization-based loss function designed to explicitly counteract popularity bias in graph-based recommender models. We show that PBiLoss significantly improves fairness, as demonstrated by reductions in the Popularity-Rank Correlation for Users (PRU) and Popularity-Rank Correlation for Items (PRI).
arXiv Detail & Related papers (2025-07-25T08:29:32Z)
- Finding Interest Needle in Popularity Haystack: Improving Retrieval by Modeling Item Exposure [8.3095709445007]
We introduce an exposure-aware retrieval scoring approach, which explicitly models item exposure probability and adjusts retrieval-stage ranking at inference time. We validate our approach through online A/B experiments in a real-world video recommendation system, demonstrating a 25% increase in uniquely retrieved items and a 40% reduction in the dominance of over-popular content. Our results establish a scalable, deployable solution for mitigating popularity bias at the retrieval stage, offering a new paradigm for bias-aware personalization.
arXiv Detail & Related papers (2025-03-31T00:04:01Z)
- Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs [6.106023882846559]
This paper investigates the capability of large language models to address the popularity bias in recommender systems of third-party libraries (TPLs). We conduct an ablation study experimenting with state-of-the-art techniques to mitigate the popularity bias, including fine-tuning and popularity penalty mechanisms. Our findings reveal that the considered LLMs cannot address the popularity bias in TPL recommenders, even though fine-tuning and post-processing penalty mechanisms contribute to increasing the overall diversity of the provided recommendations.
arXiv Detail & Related papers (2025-01-17T17:35:14Z)
- Towards Popularity-Aware Recommendation: A Multi-Behavior Enhanced Framework with Orthogonality Constraint [4.137753517504481]
Top-$K$ recommendation involves inferring latent user preferences and generating personalized recommendations. We present a Popularity-aware top-$K$ recommendation algorithm integrating multi-behavior Side Information.
arXiv Detail & Related papers (2024-12-26T11:06:49Z)
- Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems [74.47680026838128]
Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias.
We consider multifactorial selection bias affected by both item and rating value factors.
We propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization.
arXiv Detail & Related papers (2024-04-29T12:18:21Z)
- Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model [83.83064559894989]
A critical need for industrial recommender systems is the ability to evaluate recommendation policies offline, before deploying them to production.
We develop a new estimator that mitigates the problems of the two most popular off-policy estimators for rankings.
In particular, the new estimator, called INTERPOL, addresses the bias of a potentially misspecified position-based model.
arXiv Detail & Related papers (2022-10-15T17:22:30Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove theoretically that this approach offsets the influence of user/item propensity on learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
- PURS: Personalized Unexpected Recommender System for Improving User Satisfaction [76.98616102965023]
We describe a novel Personalized Unexpected Recommender System (PURS) model that incorporates unexpectedness into the recommendation process.
Extensive offline experiments on three real-world datasets illustrate that the proposed PURS model significantly outperforms the state-of-the-art baseline approaches.
arXiv Detail & Related papers (2021-06-05T01:33:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.