Related papers: Reward Shaping for User Satisfaction in a REINFORCE Recommender

URL: http://arxiv.org/abs/2209.15166v1
Date: Fri, 30 Sep 2022 01:29:12 GMT
Title: Reward Shaping for User Satisfaction in a REINFORCE Recommender
Authors: Konstantina Christakopoulou, Can Xu, Sai Zhang, Sriraj Badam, Trevor Potter, Daniel Li, Hao Wan, Xinyang Yi, Ya Le, Chris Berg, Eric Bencomo Dixon, Ed H. Chi, Minmin Chen
Abstract summary: We propose to jointly learn a policy network and a satisfaction imputation network. The imputation network learns which actions are satisfying to the user, while the policy network, built on top of REINFORCE, decides which items to recommend.
Score: 24.65853598093849
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: How might we design Reinforcement Learning (RL)-based recommenders that encourage aligning user trajectories with the underlying user satisfaction? Three research questions are key: (1) measuring user satisfaction, (2) combatting sparsity of satisfaction signals, and (3) adapting the training of the recommender agent to maximize satisfaction. For measurement, it has been found that surveys explicitly asking users to rate their experience with consumed items can provide valuable orthogonal information to the engagement/interaction data, acting as a proxy for the underlying user satisfaction. For sparsity, i.e., only being able to observe how satisfied users are with a tiny fraction of user-item interactions, imputation models can be useful in predicting satisfaction level for all items users have consumed. For learning satisfying recommender policies, we postulate that reward shaping in RL recommender agents is powerful for driving satisfying user experiences. Putting everything together, we propose to jointly learn a policy network and a satisfaction imputation network: The role of the imputation network is to learn which actions are satisfying to the user; while the policy network, built on top of REINFORCE, decides which items to recommend, with the reward utilizing the imputed satisfaction. We use both offline analysis and live experiments in an industrial large-scale recommendation platform to demonstrate the promise of our approach for satisfying user experiences.
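
The idea of a REINFORCE policy trained on a reward that uses imputed satisfaction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names (softmax, shaped_reward, reinforce_step), the additive form of the shaped reward, the weight and lr parameters, and the numeric values are all hypothetical, and the imputation network is replaced by a fixed number standing in for its prediction.

```python
import math


def softmax(logits):
    """Turn item logits into recommendation probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shaped_reward(engagement, imputed_satisfaction, weight=1.0):
    # Hypothetical additive shaping; the abstract does not state the exact form.
    return engagement + weight * imputed_satisfaction


def reinforce_step(logits, action, reward, lr=0.1):
    """One REINFORCE update for a softmax policy over items."""
    probs = softmax(logits)
    # Gradient of log pi(action) w.r.t. logit k is 1[k == action] - pi(k).
    return [
        logit + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, logit in enumerate(logits)
    ]


# Two candidate items, initially equally likely to be recommended.
logits = [0.0, 0.0]
# Item 1 was recommended; 1.0 stands in for the imputation model's prediction.
reward = shaped_reward(engagement=0.8, imputed_satisfaction=1.0)
new_logits = reinforce_step(logits, action=1, reward=reward)
new_probs = softmax(new_logits)
print(new_probs)
```

In this toy setting, a positive shaped reward for the recommended item raises that item's probability under the policy. Setting weight to 0 recovers an engagement-only reward, which is one way to compare the two objectives.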

Related papers

LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue [5.070104802923903]
PRAISE is an interpretable framework for effective user satisfaction prediction. It operates through three key modules. It achieves state-of-the-art performance on three benchmarks for the User Satisfaction Estimation task.
arXiv Detail & Related papers (2025-03-06T18:12:33Z)
Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms [68.51708490104687]
We show that a purely relevance-driven policy with low exploration strength boosts short-term user satisfaction but undermines the long-term richness of the content pool. Our findings reveal a fundamental trade-off between immediate user satisfaction and overall content production on platforms.
arXiv Detail & Related papers (2024-10-31T07:19:22Z)
Proactive Recommendation in Social Networks: Steering User Interest via Neighbor Influence [54.13541697801396]
We propose a new task named Proactive Recommendation in Social Networks (PRSN). PRSN indirectly steers users' interest by utilizing the influence of social neighbors. We propose a Neighbor Interference Recommendation (NIRec) framework with two key modules.
arXiv Detail & Related papers (2024-09-13T15:53:40Z)
Modeling User Retention through Generative Flow Networks [34.74982897470852]
A flow-based modeling technique can back-propagate the retention reward towards each recommended item in the user session. We show that the flow, combined with traditional learning-to-rank objectives, eventually optimizes a non-discounted cumulative reward for both immediate user feedback and user retention.
arXiv Detail & Related papers (2024-06-10T06:22:18Z)
Interactive Garment Recommendation with User in the Loop [77.35411131350833]
We propose to build a user profile on the fly by integrating user reactions as we recommend complementary items to compose an outfit. We present a reinforcement learning agent capable of suggesting appropriate garments and ingesting user feedback to improve its recommendations.
arXiv Detail & Related papers (2024-02-18T16:01:28Z)
PIE: Personalized Interest Exploration for Large-Scale Recommender Systems [0.0]
We present a framework for exploration in large-scale recommender systems to address these challenges. Our methodology can be easily integrated into an existing large-scale recommender system with minimal modifications. Our work has been deployed in production on Facebook Watch, a popular video discovery and sharing platform serving billions of users.
arXiv Detail & Related papers (2023-04-13T22:25:09Z)
Editable User Profiles for Controllable Text Recommendation [66.00743968792275]
We propose LACE, a novel concept value bottleneck model for controllable text recommendations. LACE represents each user with a succinct set of human-readable concepts. It learns personalized representations of the concepts based on user documents.
arXiv Detail & Related papers (2023-04-09T14:52:18Z)
Recommending to Strategic Users [10.079698681921673]
We show that users strategically choose content to influence the types of content they get recommended in the future. We propose three interventions that may improve recommendation quality when taking into account strategic consumption.
arXiv Detail & Related papers (2023-02-13T17:57:30Z)
Personalizing Intervened Network for Long-tailed Sequential User Behavior Modeling [66.02953670238647]
Tail users suffer from significantly lower-quality recommendations than head users after joint training. A model trained on tail users separately still achieves inferior results due to limited data. We propose a novel approach that significantly improves the recommendation performance for tail users.
arXiv Detail & Related papers (2022-08-19T02:50:19Z)
Towards Content Provider Aware Recommender Systems: A Simulation Study on the Interplay between User and Provider Utilities [34.288256311920904]
We build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the provider associated with the recommended content. We offer a number of simulated experiments that shed light on both the benefits and the limitations of our approach.
arXiv Detail & Related papers (2021-05-06T00:02:58Z)
Generative Inverse Deep Reinforcement Learning for Online Recommendation [62.09946317831129]
We propose a novel inverse reinforcement learning approach, namely InvRec, for online recommendation. InvRec automatically extracts the reward function from users' behaviors.
arXiv Detail & Related papers (2020-11-04T12:12:25Z)
