Offline Meta-level Model-based Reinforcement Learning Approach for
Cold-Start Recommendation
- URL: http://arxiv.org/abs/2012.02476v1
- Date: Fri, 4 Dec 2020 08:58:35 GMT
- Title: Offline Meta-level Model-based Reinforcement Learning Approach for
Cold-Start Recommendation
- Authors: Yanan Wang, Yong Ge, Li Li, Rui Chen, Tong Xu
- Abstract summary: Reinforcement learning has shown great promise in optimizing long-term user interest in recommender systems.
Existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy.
We propose a meta-level model-based reinforcement learning approach for fast user adaptation.
- Score: 27.17948754183511
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has shown great promise in optimizing long-term
user interest in recommender systems. However, existing RL-based recommendation
methods need a large number of interactions for each user to learn a robust
recommendation policy. The challenge becomes more critical when recommending to
new users who have a limited number of interactions. To that end, in this
paper, we address the cold-start challenge in RL-based recommender systems
by proposing a meta-level model-based reinforcement learning approach for fast
user adaptation. In our approach, we learn to infer each user's preference with
a user context variable that enables recommendation systems to better adapt to
new users with few interactions. To improve adaptation efficiency, we learn to
recover the user policy and reward from only a few interactions via an inverse
reinforcement learning method to assist a meta-level recommendation agent.
Moreover, we model the interaction relationship between the user model and
recommendation agent from an information-theoretic perspective. Empirical
results show the effectiveness of the proposed method when adapting to new
users with only a single interaction sequence. We further provide a theoretical
analysis of the recommendation performance bound.
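As a rough illustration of adapting to a new user from only a few interactions, the Python sketch below builds a user context vector from one short interaction sequence and uses it to score unseen items. It is not the paper's method: the abstract describes a learned user context variable, inverse reinforcement learning, and a meta-level agent, whereas this sketch substitutes a simple feedback-weighted average of hand-made item vectors. The item names, item vectors, and the functions `infer_user_context` and `recommend` are all hypothetical and exist only for this example.

```python
# Toy stand-in for "user context variable" adaptation. Everything here is an
# assumption made for illustration; none of it comes from the paper.

# Hypothetical 2-d item embeddings.
ITEM_VECS = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.1, 0.9],
}


def infer_user_context(interactions, item_vecs):
    """Return a feedback-weighted average of the vectors of interacted items.

    `interactions` is a list of (item_id, feedback) pairs, where feedback is
    positive for liked items and negative for disliked ones.
    """
    dim = len(next(iter(item_vecs.values())))
    context = [0.0] * dim
    total_weight = 0.0
    for item_id, feedback in interactions:
        vec = item_vecs[item_id]
        for d in range(dim):
            context[d] += feedback * vec[d]
        total_weight += abs(feedback)
    if total_weight == 0.0:
        return context  # no signal yet: fall back to an all-zero context
    return [c / total_weight for c in context]


def recommend(context, item_vecs, seen):
    """Pick the unseen item whose vector has the highest dot product with the context."""
    best_item, best_score = None, float("-inf")
    for item_id, vec in item_vecs.items():
        if item_id in seen:
            continue
        score = sum(c * v for c, v in zip(context, vec))
        if score > best_score:
            best_item, best_score = item_id, score
    return best_item


# A single short interaction sequence for a new user: liked "a", disliked "c".
history = [("a", 1.0), ("c", -1.0)]
user_context = infer_user_context(history, ITEM_VECS)
next_item = recommend(user_context, ITEM_VECS, seen={item for item, _ in history})
print(user_context, next_item)
```

With this history the context points toward the first embedding dimension, so the unseen item closest to "a" is preferred over the one closest to "c". A learned approach would replace both the hand-made vectors and the averaging rule.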
Related papers
- A Model-based Multi-Agent Personalized Short-Video Recommender System [19.03089585214444]
We propose an RL-based industrial short-video recommender ranking framework.
Our proposed framework adopts a model-based learning approach to alleviate the sample selection bias.
Our proposed approach has been deployed in our real large-scale short-video sharing platform.
arXiv Detail & Related papers (2024-05-03T04:34:36Z)
- Fisher-Weighted Merge of Contrastive Learning Models in Sequential Recommendation [0.0]
We are the first to apply the Fisher-Merging method to Sequential Recommendation, addressing and resolving practical challenges associated with it.
We demonstrate the effectiveness of our proposed methods, highlighting their potential to advance the state-of-the-art in sequential learning and recommendation systems.
arXiv Detail & Related papers (2023-07-05T05:58:56Z)
- Editable User Profiles for Controllable Text Recommendation [66.00743968792275]
We propose LACE, a novel concept value bottleneck model for controllable text recommendations.
LACE represents each user with a succinct set of human-readable concepts.
It learns personalized representations of the concepts based on user documents.
arXiv Detail & Related papers (2023-04-09T14:52:18Z)
- Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform.
Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Meta-Learning for Online Update of Recommender Systems [29.69934307878855]
MeLON is a novel online recommender update strategy that supports two-directional flexibility.
MeLON learns how a recommender learns to generate the optimal learning rates for future updates.
arXiv Detail & Related papers (2022-03-19T16:27:30Z)
- Learning to Learn a Cold-start Sequential Recommender [70.5692886883067]
Cold-start recommendation is an urgent problem in contemporary online applications.
We propose a meta-learning based cold-start sequential recommendation framework called metaCSR.
metaCSR holds the ability to learn the common patterns from regular users' behaviors.
arXiv Detail & Related papers (2021-10-18T08:11:24Z)
- User Tampering in Reinforcement Learning Recommender Systems [2.28438857884398]
We highlight a unique safety concern prevalent in reinforcement learning (RL)-based recommendation algorithms: 'user tampering'.
User tampering is a situation where an RL-based recommender system may manipulate a media user's opinions through its suggestions as part of a policy to maximize long-term user engagement.
arXiv Detail & Related papers (2021-09-09T07:53:23Z)
- Generative Inverse Deep Reinforcement Learning for Online Recommendation [62.09946317831129]
We propose a novel inverse reinforcement learning approach, namely InvRec, for online recommendation.
InvRec automatically extracts the reward function from users' behaviors for online recommendation.
arXiv Detail & Related papers (2020-11-04T12:12:25Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Reward Constrained Interactive Recommendation with Natural Language Feedback [158.8095688415973]
We propose a novel constraint-augmented reinforcement learning (RL) framework to efficiently incorporate user preferences over time.
Specifically, we leverage a discriminator to detect recommendations violating user historical preference.
Our proposed framework is general and is further extended to the task of constrained text generation.
arXiv Detail & Related papers (2020-05-04T16:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.