A Study on Accuracy, Miscalibration, and Popularity Bias in
Recommendations
- URL: http://arxiv.org/abs/2303.00400v1
- Date: Wed, 1 Mar 2023 10:39:58 GMT
- Title: A Study on Accuracy, Miscalibration, and Popularity Bias in
Recommendations
- Authors: Dominik Kowald and Gregor Mayr and Markus Schedl and Elisabeth Lex
- Abstract summary: We study how different genres affect the inconsistency of recommendation performance.
We find that users with little interest in popular content receive the worst recommendation accuracy.
Our experiments show that particular genres contribute to a different extent to the inconsistency of recommendation performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has suggested different metrics to measure the inconsistency
of recommendation performance, including the accuracy difference between user
groups, miscalibration, and popularity lift. However, a study that relates
miscalibration and popularity lift to recommendation accuracy across different
user groups is still missing. Additionally, it is unclear if particular genres
contribute to the emergence of inconsistency in recommendation performance
across user groups. In this paper, we present an analysis of these three
aspects of five well-known recommendation algorithms for user groups that
differ in their preference for popular content. Additionally, we study how
different genres affect the inconsistency of recommendation performance, and
how this is aligned with the popularity of the genres. Using data from LastFm,
MovieLens, and MyAnimeList, we present two key findings. First, we find that
users with little interest in popular content receive the worst recommendation
accuracy, and that this is aligned with miscalibration and popularity lift.
Second, our experiments show that particular genres contribute to a different
extent to the inconsistency of recommendation performance, especially in terms
of miscalibration in the case of the MyAnimeList dataset.
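
The two inconsistency metrics discussed in the abstract, miscalibration and popularity lift, can be sketched roughly as follows. This is a minimal illustration under common definitions (calibration as a divergence between genre distributions, lift as the relative popularity change between profile and recommendations), not the paper's exact formulation; all names are hypothetical:

```python
import math

def popularity_lift(profile_pop, rec_pop):
    """Relative change in mean item popularity between a user's profile
    and their recommendations (positive = popularity amplification)."""
    mean_profile = sum(profile_pop) / len(profile_pop)
    mean_rec = sum(rec_pop) / len(rec_pop)
    return (mean_rec - mean_profile) / mean_profile

def miscalibration(p, q, alpha=0.01):
    """KL divergence between the genre distribution of the user's profile
    (p) and of the recommendations (q); 0 = perfectly calibrated.
    q is smoothed towards p so the divergence stays finite."""
    return sum(pi * math.log2(pi / ((1 - alpha) * qi + alpha * pi))
               for pi, qi in zip(p, q) if pi > 0)

# A user whose profile is 80/20 over two genres but whose
# recommendations are 50/50 is miscalibrated:
print(miscalibration([0.8, 0.2], [0.5, 0.5]))
print(popularity_lift([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]))
```

Under these definitions, a popularity lift above zero means the recommender amplifies the popularity already present in the user's profile, which is the effect the paper relates to accuracy differences across user groups.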
Related papers
- Preference Diffusion for Recommendation [50.8692409346126]
We propose PreferDiff, a tailored optimization objective for DM-based recommenders.
PreferDiff transforms BPR into a log-likelihood ranking objective to better capture user preferences.
It is the first personalized ranking loss designed specifically for DM-based recommenders.
arXiv Detail & Related papers (2024-10-17T01:02:04Z)
- Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems [74.47680026838128]
Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias.
We consider multifactorial selection bias affected by both item and rating value factors.
We propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization.
arXiv Detail & Related papers (2024-04-29T12:18:21Z)
- Metrics for popularity bias in dynamic recommender systems [0.0]
Biased recommendations may lead to decisions that can potentially have adverse effects on individuals, sensitive user groups, and society.
This paper focuses on quantifying popularity bias that stems directly from the output of RecSys models.
Four metrics are proposed to quantify popularity bias in RecSys over time, in a dynamic setting, across different sensitive user groups.
arXiv Detail & Related papers (2023-10-12T16:15:30Z)
- Improving Recommendation System Serendipity Through Lexicase Selection [53.57498970940369]
We propose a new serendipity metric to measure the presence of echo chambers and homophily in recommendation systems.
We then attempt to improve the diversity-preservation qualities of well known recommendation techniques by adopting a parent selection algorithm known as lexicase selection.
Our results show that lexicase selection, or a mixture of lexicase selection and ranking, outperforms its purely ranked counterparts in terms of personalization, coverage and our specifically designed serendipity benchmark.
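
The core lexicase selection procedure referenced above can be sketched in a few lines. This is a generic illustration of the algorithm (filter candidates on test cases in random order, keeping only the elite on each case), not the paper's recommender-specific variant; all names are hypothetical:

```python
import random

def lexicase_select(candidates, cases, errors):
    """Pick one candidate by filtering on cases in random order.

    candidates: list of candidate ids
    cases: list of case ids
    errors: dict mapping (candidate, case) -> error (lower is better)
    """
    pool = list(candidates)
    for case in random.sample(cases, len(cases)):
        best = min(errors[(c, case)] for c in pool)
        pool = [c for c in pool if errors[(c, case)] == best]
        if len(pool) == 1:
            break
    # ties after all cases are broken at random
    return random.choice(pool)

# Toy example: "b" is elite on every case, so it always survives.
errors = {("a", 0): 1, ("a", 1): 2, ("b", 0): 0, ("b", 1): 0}
print(lexicase_select(["a", "b"], [0, 1], errors))  # -> "b"
```

Because the case order is reshuffled on every call, different calls can favor candidates that excel on different cases, which is the diversity-preserving property the paper exploits.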
arXiv Detail & Related papers (2023-05-18T15:37:38Z)
- Simpson's Paradox in Recommender Fairness: Reconciling differences between per-user and aggregated evaluations [16.053419956606557]
We argue that two notions of fairness in ranking and recommender systems can lead to opposite conclusions.
We reconcile these notions and show that the tension is due to differences in distributions of users where items are relevant.
Based on this new understanding, practitioners might be interested in either notion, but might face challenges with the per-user metric.
arXiv Detail & Related papers (2022-10-14T12:43:32Z)
- Recommendation Systems with Distribution-Free Reliability Guarantees [83.80644194980042]
We show how to return a set of items rigorously guaranteed to contain mostly good items.
Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate.
We evaluate our methods on the Yahoo! Learning to Rank and MSMarco datasets.
arXiv Detail & Related papers (2022-07-04T17:49:25Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove in theory that this way offsets the influence of user/item propensity on the learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
- The Unfairness of Popularity Bias in Book Recommendation [0.0]
Popularity bias refers to the problem that popular items are recommended frequently while less popular items are recommended rarely or not at all.
We analyze the well-known Book-Crossing dataset and define three user groups based on their tendency towards popular items.
Our results indicate that most state-of-the-art recommendation algorithms suffer from popularity bias in the book domain.
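
The grouping used above, splitting users by their tendency towards popular items, can be approximated as follows. This is a simplified sketch under a common definition (an item's popularity as the fraction of users who interacted with it, a user's tendency as the mean popularity over their profile); thresholds and names are hypothetical, not the paper's exact procedure:

```python
def popularity_tendency(user_items, item_counts, n_users):
    """Mean fraction of users that interacted with each item in this
    user's profile (higher = stronger taste for popular items)."""
    return sum(item_counts[i] / n_users for i in user_items) / len(user_items)

def split_groups(profiles, item_counts, n_users):
    """Sort users by popularity tendency and cut into three groups:
    low-, medium-, and high-popularity users."""
    ranked = sorted(profiles,
                    key=lambda u: popularity_tendency(profiles[u],
                                                      item_counts, n_users))
    k = len(ranked) // 3
    return ranked[:k], ranked[k:2 * k], ranked[2 * k:]
```

A cut into equal-sized thirds is one plausible choice; the studies above use their own group definitions, but the underlying signal (how popular a user's consumed items are on average) is the same.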
arXiv Detail & Related papers (2022-02-27T20:21:46Z)
- Unbiased Pairwise Learning to Rank in Recommender Systems [4.058828240864671]
Unbiased learning to rank algorithms are appealing candidates and have already been applied in many applications with single categorical labels.
We propose a novel unbiased LTR algorithm to tackle the challenges, which innovatively models position bias in the pairwise fashion.
Experiment results on public benchmark datasets and internal live traffic show the superior results of the proposed method for both categorical and continuous labels.
arXiv Detail & Related papers (2021-11-25T06:04:59Z)
- Revisiting Popularity and Demographic Biases in Recommender Evaluation and Effectiveness [6.210698627561645]
We investigate how recommender performance varies according to popularity and demographics.
We find statistically significant differences in recommender performance by both age and gender.
We observe that recommendation utility steadily degrades for older users, and is lower for women than men.
arXiv Detail & Related papers (2021-10-15T20:30:51Z)
- Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.