AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term
User Engagement
- URL: http://arxiv.org/abs/2310.03984v1
- Date: Fri, 6 Oct 2023 02:45:21 GMT
- Authors: Zhenghai Xue, Qingpeng Cai, Tianyou Zuo, Bin Yang, Lantao Hu, Peng
Jiang, Kun Gai, Bo An
- Abstract summary: We introduce a novel paradigm called Adaptive Sequential Recommendation (AdaRec) to address the distribution shift that changing user behavior patterns cause for RL-based recommendation.
AdaRec proposes a new distance-based representation loss to extract latent information from users' interaction trajectories.
We conduct extensive empirical analyses in both simulator-based and live sequential recommendation tasks.
- Score: 25.18963930580529
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Growing attention has been paid to Reinforcement Learning (RL) algorithms
when optimizing long-term user engagement in sequential recommendation tasks.
One challenge in large-scale online recommendation systems is the constant and
complicated changes in users' behavior patterns, such as interaction rates and
retention tendencies. When formulated as a Markov Decision Process (MDP), the
dynamics and reward functions of the recommendation system are continuously
affected by these changes. Existing RL algorithms for recommendation systems
will suffer from distribution shift and struggle to adapt in such an MDP. In
this paper, we introduce a novel paradigm called Adaptive Sequential
Recommendation (AdaRec) to address this issue. AdaRec proposes a new
distance-based representation loss to extract latent information from users'
interaction trajectories. Such information reflects how well the RL policy fits
current user behavior patterns, and helps the policy identify subtle changes
in the recommendation system. To adapt rapidly to these changes, AdaRec
encourages exploration with the idea of optimism under uncertainty. The
exploration is further guarded by zero-order action optimization to ensure
stable recommendation quality in complicated environments. We conduct extensive
empirical analyses in both simulator-based and live sequential recommendation
tasks, where AdaRec exhibits superior long-term performance compared to all
baseline algorithms.
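The abstract names two mechanisms, optimism under uncertainty and zero-order action optimization, without giving details. The sketch below shows one common way these ideas can be combined; it is not the authors' implementation. The ensemble of critics, the `beta` bonus weight, the Gaussian perturbation scheme, and all function names are assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, List, Optional, Sequence

Action = List[float]
Critic = Callable[[Sequence[float]], float]  # hypothetical value estimate for an action


def optimistic_score(action: Sequence[float], critics: Sequence[Critic], beta: float) -> float:
    """Upper-confidence score: ensemble mean plus beta times ensemble spread."""
    values = [q(action) for q in critics]
    return statistics.fmean(values) + beta * statistics.pstdev(values)


def zero_order_select(
    base_action: Sequence[float],
    critics: Sequence[Critic],
    beta: float = 1.0,
    num_candidates: int = 32,
    noise_std: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Action:
    """Pick the best-scoring action among random perturbations of base_action.

    Only critic evaluations are used (no gradients). The unperturbed
    base_action is always a candidate, so the result never scores below it.
    """
    rng = rng or random.Random()
    best = list(base_action)
    best_score = optimistic_score(best, critics, beta)
    for _ in range(num_candidates):
        candidate = [a + rng.gauss(0.0, noise_std) for a in base_action]
        score = optimistic_score(candidate, critics, beta)
        if score > best_score:
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    # Toy critics that disagree about where the best one-dimensional action lies.
    toy_critics = [lambda a, c=c: -(a[0] - c) ** 2 for c in (0.4, 0.5, 0.6)]
    print(zero_order_select([0.0], toy_critics, rng=random.Random(0)))
```

Keeping the policy's own action in the candidate set is the "guard" in this sketch: exploration can only replace it with an action that the optimistic score rates at least as highly.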
Related papers
- Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation [51.06031200728449]
We propose a novel framework called mccHRL to provide different levels of temporal abstraction on listwise recommendation.
Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy.
Results show a significant performance improvement for our method compared with several well-known baselines.
arXiv Detail & Related papers (2024-09-11T17:01:06Z) - CSRec: Rethinking Sequential Recommendation from A Causal Perspective [25.69446083970207]
The essence of sequential recommender systems (RecSys) lies in understanding how users make decisions.
We propose a novel formulation of sequential recommendation, termed Causal Sequential Recommendation (CSRec)
CSRec aims to predict the probability of a recommended item's acceptance within a sequential context and backtrack how current decisions are made.
arXiv Detail & Related papers (2024-08-23T23:19:14Z) - Adversarial Batch Inverse Reinforcement Learning: Learn to Reward from
Imperfect Demonstration for Interactive Recommendation [23.048841953423846]
We focus on the problem of learning to reward, which is fundamental to reinforcement learning.
Some previous approaches introduce additional procedures for learning to reward, thereby increasing the complexity of optimization.
We propose a novel batch inverse reinforcement learning paradigm that achieves the desired properties.
arXiv Detail & Related papers (2023-10-30T13:43:20Z) - Fisher-Weighted Merge of Contrastive Learning Models in Sequential
Recommendation [0.0]
We are the first to apply the Fisher-Merging method to Sequential Recommendation, addressing and resolving practical challenges associated with it.
We demonstrate the effectiveness of our proposed methods, highlighting their potential to advance the state-of-the-art in sequential learning and recommendation systems.
arXiv Detail & Related papers (2023-07-05T05:58:56Z) - Generative Slate Recommendation with Reinforcement Learning [49.75985313698214]
Reinforcement learning (RL) algorithms can be used to optimize user engagement in recommender systems.
However, RL approaches are intractable in the slate recommendation scenario.
In that setting, an action corresponds to a slate that may contain any combination of items.
In this work we propose to encode slates in a continuous, low-dimensional latent space learned by a variational auto-encoder.
We are able to (i) relax assumptions required by previous work, and (ii) improve the quality of the action selection by modeling full slates.
arXiv Detail & Related papers (2023-01-20T15:28:09Z) - Breaking Feedback Loops in Recommender Systems with Causal Inference [99.22185950608838]
Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior.
We propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference.
We show that CAFL improves recommendation quality when compared to prior correction methods.
arXiv Detail & Related papers (2022-07-04T17:58:39Z) - D2RLIR : an improved and diversified ranking function in interactive
recommendation systems based on deep reinforcement learning [0.3058685580689604]
This paper proposes a deep reinforcement learning based recommendation system by utilizing Actor-Critic architecture.
The proposed model is able to generate a diverse yet relevant recommendation list based on the user's preferences.
arXiv Detail & Related papers (2021-10-28T13:11:29Z) - Improving Long-Term Metrics in Recommendation Systems using
Short-Horizon Offline RL [56.20835219296896]
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility.
We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions.
arXiv Detail & Related papers (2021-06-01T15:58:05Z) - Sequential Recommendation with Self-Attentive Multi-Adversarial Network [101.25533520688654]
We present a Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the effect of context information on sequential recommendation.
Our framework is flexible to incorporate multiple kinds of factor information, and is able to trace how each factor contributes to the recommendation decision over time.
arXiv Detail & Related papers (2020-05-21T12:28:59Z) - Reward Constrained Interactive Recommendation with Natural Language
Feedback [158.8095688415973]
We propose a novel constraint-augmented reinforcement learning (RL) framework to efficiently incorporate user preferences over time.
Specifically, we leverage a discriminator to detect recommendations violating user historical preference.
Our proposed framework is general and is further extended to the task of constrained text generation.
arXiv Detail & Related papers (2020-05-04T16:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.