Generative Inverse Deep Reinforcement Learning for Online Recommendation
- URL: http://arxiv.org/abs/2011.02248v1
- Date: Wed, 4 Nov 2020 12:12:25 GMT
- Title: Generative Inverse Deep Reinforcement Learning for Online Recommendation
- Authors: Xiaocong Chen, Lina Yao, Aixin Sun, Xianzhi Wang, Xiwei Xu, and Liming Zhu
- Abstract summary: We propose a novel generative inverse reinforcement learning approach, InvRec, for online recommendation.
InvRec automatically extracts the reward function from users' behaviors.
- Score: 62.09946317831129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning enables an agent to dynamically capture
users' interests through interactions with the environment, and it has
attracted great interest in recommendation research. Deep reinforcement
learning relies on a reward function to learn users' interests and to control
the learning process. However, most reward functions are designed manually;
they are either unrealistic or too imprecise to reflect the high variety,
dimensionality, and non-linearity of the recommendation problem. This makes it
difficult for the agent to learn an optimal policy that generates the most
satisfactory recommendations. To address this issue, we propose a novel
generative inverse reinforcement learning approach, InvRec, which automatically
extracts the reward function from users' behaviors for online recommendation.
We conduct experiments on an online platform, VirtualTB, and compare against
several state-of-the-art methods to demonstrate the feasibility and
effectiveness of the proposed approach.
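The abstract gives no implementation details, so the sketch below is only a minimal, hypothetical illustration of the general idea of extracting a reward from logged user behavior with a discriminator, in the spirit of generative adversarial imitation learning (GAIL). The class names, dimensions, and PyTorch framing are assumptions for illustration, not the authors' InvRec implementation.

```python
import torch
import torch.nn as nn

class RewardDiscriminator(nn.Module):
    """Scores (state, action) pairs; its output acts as a learned reward.

    Here 'state' would be a user representation and 'action' an item
    embedding. All dimensions are illustrative assumptions.
    """

    def __init__(self, state_dim: int = 64, action_dim: int = 32, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

    def reward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # A common GAIL-style surrogate reward: high when the discriminator
        # believes the pair came from real (logged) user behavior.
        with torch.no_grad():
            p = torch.sigmoid(self.forward(state, action))
        return -torch.log(1.0 - p + 1e-8)


def discriminator_step(disc, opt, expert_s, expert_a, policy_s, policy_a):
    """One update: classify logged user behavior vs. policy rollouts."""
    bce = nn.BCEWithLogitsLoss()
    expert_logits = disc(expert_s, expert_a)
    policy_logits = disc(policy_s, policy_a)
    loss = (bce(expert_logits, torch.ones_like(expert_logits))
            + bce(policy_logits, torch.zeros_like(policy_logits)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Under this framing, the recommendation policy would then be trained with any standard RL algorithm, using `disc.reward(...)` in place of a hand-designed reward.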
Related papers
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal, and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Deep Exploration for Recommendation Systems [14.937000494745861]
We develop deep exploration methods for recommendation systems.
In particular, we formulate recommendation as a sequential decision problem.
Our experiments are carried out with high-fidelity industrial-grade simulators.
arXiv Detail & Related papers (2021-09-26T06:54:26Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Interaction-Grounded Learning [24.472306647094253]
We propose Interaction-Grounded Learning, in which a learner's goal is to interact with the environment and optimize its policies without any grounding or explicit reward.
We show that in an Interaction-Grounded Learning setting, with certain natural assumptions, a learner can discover the latent reward and ground its policy for successful interaction.
arXiv Detail & Related papers (2021-06-09T08:13:29Z)
- Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference [71.11416263370823]
We propose a generative inverse reinforcement learning approach for user behavioral preference modelling.
Our model can automatically learn rewards from users' actions based on a discriminative actor-critic network and a Wasserstein GAN.
arXiv Detail & Related papers (2021-05-03T13:14:25Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on such an approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC); a minimal sketch of this two-head design appears after this list.
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Knowledge-guided Deep Reinforcement Learning for Interactive Recommendation [49.32287384774351]
Interactive recommendation aims to learn from dynamic interactions between items and users to achieve responsiveness and accuracy.
We propose Knowledge-Guided Deep Reinforcement Learning to harness the advantages of both reinforcement learning and knowledge graphs for interactive recommendation.
arXiv Detail & Related papers (2020-04-17T05:26:47Z)
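As noted in the Self-Supervised Reinforcement Learning entry above, the SQN/SAC frameworks attach two output layers to a standard sequential recommendation model. The sketch below is a minimal, hypothetical PyTorch illustration of that two-head idea; the GRU encoder, dimensions, and names are assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class TwoHeadRecommender(nn.Module):
    """A sequential recommender with two output layers: one head for the
    self-supervised next-item objective, and one head producing Q-values
    for RL. Encoder choice and all sizes are illustrative assumptions."""

    def __init__(self, num_items: int, embed_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(num_items, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden, batch_first=True)
        self.supervised_head = nn.Linear(hidden, num_items)  # next-item logits
        self.q_head = nn.Linear(hidden, num_items)           # Q-value per item

    def forward(self, item_seq: torch.Tensor):
        # item_seq: (batch, seq_len) tensor of item ids
        _, h = self.encoder(self.embed(item_seq))  # h: (1, batch, hidden)
        h = h.squeeze(0)
        return self.supervised_head(h), self.q_head(h)

# The supervised head would be trained with cross-entropy on the next item,
# the Q head with a temporal-difference loss; both share encoder gradients.
logits, q_values = TwoHeadRecommender(num_items=1000)(torch.randint(0, 1000, (8, 20)))
```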