User Tampering in Reinforcement Learning Recommender Systems
- URL: http://arxiv.org/abs/2109.04083v3
- Date: Mon, 24 Jul 2023 14:19:55 GMT
- Title: User Tampering in Reinforcement Learning Recommender Systems
- Authors: Charles Evans, Atoosa Kasirzadeh
- Abstract summary: We highlight a unique safety concern prevalent in reinforcement learning (RL)-based recommendation algorithms -- 'user tampering.'
User tampering is a situation where an RL-based recommender system may manipulate a media user's opinions through its suggestions as part of a policy to maximize long-term user engagement.
- Score: 2.28438857884398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce new formal methods and provide empirical evidence to highlight a unique safety concern prevalent in reinforcement learning (RL)-based recommendation algorithms -- 'user tampering.' User tampering is a situation where an RL-based recommender system may manipulate a media user's opinions through its suggestions as part of a policy to maximize long-term user engagement. We use formal techniques from causal modeling to critically analyze prevailing solutions proposed in the literature for implementing scalable RL-based recommendation systems, and we observe that these methods do not adequately prevent user tampering. Moreover, we evaluate existing mitigation strategies for reward tampering issues, and show that these methods are insufficient in addressing the distinct phenomenon of user tampering within the context of recommendations. We further reinforce our findings with a simulation study of an RL-based recommendation system focused on the dissemination of political content. Our study shows that a Q-learning algorithm consistently learns to exploit its opportunities to polarize simulated users with its early recommendations in order to have more consistent success with subsequent recommendations that align with this induced polarization. Our findings emphasize the necessity for developing safer RL-based recommendation systems and suggest that achieving such safety would require a fundamental shift in the design away from the approaches we have seen in the recent literature.
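The polarization dynamic the abstract describes can be illustrated with a rough, hypothetical sketch (not the authors' code): a tabular Q-learning agent recommends political content to a simulated user whose opinion drifts toward whatever it is shown, and whose engagement is highest when content already matches their opinion. The state discretization, user model, and all parameter values below are illustrative assumptions.

```python
import random

random.seed(0)

# Assumed action space: recommend left-leaning, neutral, or right-leaning content.
ACTIONS = [-1, 0, 1]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate

def bucket(opinion):
    """Discretize an opinion in [-1, 1] into one of 5 states."""
    return min(4, int((opinion + 1.0) / 0.4))

# Q-table: state bucket -> action index -> estimated value.
Q = [[0.0] * len(ACTIONS) for _ in range(5)]

def step(opinion, action):
    """Assumed user model: engagement (reward) is positive only when the
    user's opinion already aligns with the recommended content, and each
    recommendation nudges the opinion toward that content."""
    reward = max(0.0, opinion * action)
    opinion = max(-1.0, min(1.0, opinion + 0.1 * (action - opinion)))
    return opinion, reward

for episode in range(2000):
    opinion = random.uniform(-0.2, 0.2)   # simulated users start near-neutral
    for _ in range(20):
        s = bucket(opinion)
        if random.random() < EPS:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        opinion, r = step(opinion, ACTIONS[a])
        s2 = bucket(opinion)
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])

# Greedy action from a near-neutral state: the agent learns to push the user
# toward an extreme first, because polarized users engage more reliably later.
greedy = max(range(len(ACTIONS)), key=lambda i: Q[bucket(0.0)][i])
print("greedy action from neutral state:", ACTIONS[greedy])
```

Under these assumed dynamics, recommending neutral content to a neutral user yields no engagement, while polarizing recommendations sacrifice early reward to create a user whose engagement is consistently high afterward, which is exactly the incentive structure the paper identifies.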
Related papers
- Fisher-Weighted Merge of Contrastive Learning Models in Sequential Recommendation [0.0]
We are the first to apply the Fisher-Merging method to Sequential Recommendation, addressing and resolving practical challenges associated with it.
We demonstrate the effectiveness of our proposed methods, highlighting their potential to advance the state-of-the-art in sequential learning and recommendation systems.
arXiv Detail & Related papers (2023-07-05T05:58:56Z)
- Breaking Feedback Loops in Recommender Systems with Causal Inference [99.22185950608838]
Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior.
We propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference.
We show that CAFL improves recommendation quality when compared to prior correction methods.
arXiv Detail & Related papers (2022-07-04T17:58:39Z)
- A Review on Pushing the Limits of Baseline Recommendation Systems with the integration of Opinion Mining & Information Retrieval Techniques [0.0]
Recommendation Systems allow users to identify trending items among a community while being timely and relevant to the user's expectations.
Deep Learning methods have been brought forward to achieve better quality recommendations.
Researchers have tried to expand on the capabilities of standard recommendation systems to provide the most effective recommendations.
arXiv Detail & Related papers (2022-05-03T22:13:33Z)
- CausPref: Causal Preference Learning for Out-of-Distribution Recommendation [36.22965012642248]
The current recommender system is still vulnerable to the distribution shift of users and items in realistic scenarios.
We propose to incorporate the recommendation-specific DAG learner into a novel causal preference-based recommendation framework named CausPref.
Our approach surpasses the benchmark models significantly under various types of out-of-distribution settings.
arXiv Detail & Related papers (2022-02-08T16:42:03Z)
- Supervised Advantage Actor-Critic for Recommender Systems [76.7066594130961]
We propose a negative sampling strategy for training the RL component and combine it with supervised sequential learning.
Based on sampled (negative) actions (items), we can calculate the "advantage" of a positive action over the average case.
We instantiate SNQN and SA2C with four state-of-the-art sequential recommendation models and conduct experiments on two real-world datasets.
arXiv Detail & Related papers (2021-11-05T12:51:15Z)
- Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL [56.20835219296896]
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility.
We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions.
arXiv Detail & Related papers (2021-06-01T15:58:05Z)
- Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation [27.17948754183511]
Reinforcement learning has shown great promise in optimizing long-term user interest in recommender systems.
Existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy.
We propose a meta-level model-based reinforcement learning approach for fast user adaptation.
arXiv Detail & Related papers (2020-12-04T08:58:35Z)
- Knowledge Transfer via Pre-training for Recommendation: A Review and Prospect [89.91745908462417]
We show the benefits of pre-training to recommender systems through experiments.
We discuss several promising directions for future research for recommender systems with pre-training.
arXiv Detail & Related papers (2020-09-19T13:06:27Z)
- Reinforcement Learning for Strategic Recommendations [32.73903761398027]
Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business.
At Adobe research, we have been implementing such systems for various use-cases, including points of interest recommendations, tutorial recommendations, next step guidance in multi-media editing software, and ad recommendation for optimizing lifetime value.
There are many research challenges when building these systems, such as modeling the sequential behavior of users, deciding when to intervene and offer recommendations without annoying the user, evaluating policies offline with
arXiv Detail & Related papers (2020-09-15T20:45:48Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on such an approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Reward Constrained Interactive Recommendation with Natural Language Feedback [158.8095688415973]
We propose a novel constraint-augmented reinforcement learning (RL) framework to efficiently incorporate user preferences over time.
Specifically, we leverage a discriminator to detect recommendations violating user historical preference.
Our proposed framework is general and is further extended to the task of constrained text generation.
arXiv Detail & Related papers (2020-05-04T16:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.