Harm Mitigation in Recommender Systems under User Preference Dynamics
- URL: http://arxiv.org/abs/2406.09882v1
- Date: Fri, 14 Jun 2024 09:52:47 GMT
- Title: Harm Mitigation in Recommender Systems under User Preference Dynamics
- Authors: Jerry Chee, Shankar Kalyanaraman, Sindhu Kiranmai Ernala, Udi Weinsberg, Sarah Dean, Stratis Ioannidis
- Abstract summary: We consider a recommender system that takes into account the interplay between recommendations, user interests, and harmful content.
We seek recommendation policies that establish a tradeoff between maximizing click-through rate (CTR) and mitigating harm.
- Score: 16.213153879446796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a recommender system that takes into account the interplay between recommendations, the evolution of user interests, and harmful content. We model the impact of recommendations on user behavior, particularly the tendency to consume harmful content. We seek recommendation policies that establish a tradeoff between maximizing click-through rate (CTR) and mitigating harm. We establish conditions under which the user profile dynamics have a stationary point, and propose algorithms for finding an optimal recommendation policy at stationarity. We experiment on a semi-synthetic movie recommendation setting initialized with real data and observe that our policies outperform baselines at simultaneously maximizing CTR and mitigating harm.
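The abstract refers to user profile dynamics with a stationary point and to a CTR/harm tradeoff evaluated at stationarity. The Python sketch below illustrates that idea on a toy model only; it is not the paper's model or algorithm. Everything in it is an assumption made for illustration: the drift rule in `step`, the sigmoid click model in `click_prob`, the penalty weight `lam`, and the item vectors and policies.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def step(profile, policy, items, rate=0.1):
    """Assumed dynamics: the profile drifts toward the mean recommended item."""
    dim = len(profile)
    target = [sum(p * item[k] for p, item in zip(policy, items)) for k in range(dim)]
    mixed = [(1 - rate) * u + rate * t for u, t in zip(profile, target)]
    return normalize(mixed)

def stationary_profile(profile, policy, items, rate=0.1, tol=1e-10, max_iter=10_000):
    """Iterate the assumed dynamics until the profile stops changing."""
    for _ in range(max_iter):
        nxt = step(profile, policy, items, rate)
        if max(abs(a - b) for a, b in zip(nxt, profile)) < tol:
            return nxt
        profile = nxt
    return profile

def click_prob(profile, item):
    """Assumed click model: sigmoid of the profile-item inner product."""
    dot = sum(u * x for u, x in zip(profile, item))
    return 1.0 / (1.0 + math.exp(-dot))

def objective(policy, profile, items, harmful, lam):
    """CTR minus lam times the click rate on items flagged as harmful."""
    ctr = sum(p * click_prob(profile, it) for p, it in zip(policy, items))
    harm = sum(p * click_prob(profile, it)
               for p, it, h in zip(policy, items, harmful) if h)
    return ctr - lam * harm

# Hypothetical data: three items in two dimensions, the second flagged harmful.
ITEMS = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
HARMFUL = [False, True, False]
START = [0.6, 0.8]

# Two hypothetical policies (probability distributions over items).
POLICIES = {"risky": [0.0, 1.0, 0.0], "safe": [0.5, 0.0, 0.5]}

for name, policy in POLICIES.items():
    u_star = stationary_profile(START, policy, ITEMS)
    for lam in (0.0, 1.0):
        score = objective(policy, u_star, ITEMS, HARMFUL, lam)
        print(f"{name} policy, lam={lam}: objective at stationarity = {score:.4f}")
```

In this toy setup the policy that concentrates on the harmful item scores higher when `lam` is 0 (pure CTR), while the policy that avoids it scores higher once harm is penalized; the point at which the ranking flips depends entirely on the assumed numbers.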
Related papers
- Preference Diffusion for Recommendation [50.8692409346126]
We propose PreferDiff, a tailored optimization objective for DM-based recommenders.
PreferDiff transforms BPR into a log-likelihood ranking objective to better capture user preferences.
It is the first personalized ranking loss designed specifically for DM-based recommenders.
arXiv Detail & Related papers (2024-10-17T01:02:04Z)
- Algorithmic Drift: A Simulation Framework to Study the Effects of Recommender Systems on User Preferences [7.552217586057245]
We propose a simulation framework that mimics user-recommender system interactions in a long-term scenario.
We introduce two novel metrics for quantifying the algorithm's impact on user preferences, specifically in terms of drift over time.
arXiv Detail & Related papers (2024-09-24T21:54:22Z)
- The Nah Bandit: Modeling User Non-compliance in Recommendation Systems [2.421459418045937]
Expert with Clustering (EWC) is a hierarchical approach that incorporates feedback from both recommended and non-recommended options to accelerate user preference learning.
EWC outperforms both supervised learning and traditional contextual bandit approaches.
This work lays the foundation for future research in Nah Bandit, providing a robust framework for more effective recommendation systems.
arXiv Detail & Related papers (2024-08-15T03:01:02Z)
- Is ChatGPT Fair for Recommendation? Evaluating Fairness in Large Language Model Recommendation [52.62492168507781]
We propose a novel benchmark called Fairness of Recommendation via LLM (FaiRLLM).
This benchmark comprises carefully crafted metrics and a dataset that accounts for eight sensitive attributes.
By utilizing our FaiRLLM benchmark, we conducted an evaluation of ChatGPT and discovered that it still exhibits unfairness to some sensitive attributes when generating recommendations.
arXiv Detail & Related papers (2023-05-12T16:54:36Z)
- Recommending to Strategic Users [10.079698681921673]
We show that users strategically choose content to influence the types of content they get recommended in the future.
We propose three interventions that may improve recommendation quality when taking into account strategic consumption.
arXiv Detail & Related papers (2023-02-13T17:57:30Z)
- Learning to Suggest Breaks: Sustainable Optimization of Long-Term User Engagement [12.843340232167266]
We study the role of breaks in recommendation, and propose a framework for learning optimal breaking policies.
Based on the notion that recommendation dynamics are susceptible to both positive and negative feedback, we cast recommendation as a Lotka-Volterra dynamical system.
arXiv Detail & Related papers (2022-11-24T13:14:29Z)
- Recommendation with User Active Disclosing Willingness [20.306413327597603]
We study a novel recommendation paradigm in which users are allowed to indicate their "willingness" to disclose different behaviors.
We conduct extensive experiments to demonstrate the effectiveness of our model in balancing recommendation quality and user disclosing willingness.
arXiv Detail & Related papers (2022-10-25T04:43:40Z)
- Breaking Feedback Loops in Recommender Systems with Causal Inference [99.22185950608838]
Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior.
We propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference.
We show that CAFL improves recommendation quality when compared to prior correction methods.
arXiv Detail & Related papers (2022-07-04T17:58:39Z)
- Two-Stage Neural Contextual Bandits for Personalised News Recommendation [50.3750507789989]
Existing personalised news recommendation methods focus on exploiting user interests and ignore exploration in recommendation.
We build on contextual bandit recommendation strategies, which naturally address the exploitation-exploration trade-off.
We use deep learning representations for users and news, and generalise the neural upper confidence bound (UCB) policies to generalised additive UCB and bilinear UCB.
arXiv Detail & Related papers (2022-06-26T12:07:56Z)
- Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL [56.20835219296896]
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility.
We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions.
arXiv Detail & Related papers (2021-06-01T15:58:05Z)
- Reward Constrained Interactive Recommendation with Natural Language Feedback [158.8095688415973]
We propose a novel constraint-augmented reinforcement learning (RL) framework to efficiently incorporate user preferences over time.
Specifically, we leverage a discriminator to detect recommendations that violate the user's historical preferences.
Our proposed framework is general and is further extended to the task of constrained text generation.
arXiv Detail & Related papers (2020-05-04T16:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.