Deep Exploration for Recommendation Systems
- URL: http://arxiv.org/abs/2109.12509v4
- Date: Sun, 30 Jul 2023 08:39:53 GMT
- Title: Deep Exploration for Recommendation Systems
- Authors: Zheqing Zhu, Benjamin Van Roy
- Abstract summary: We develop deep exploration methods for recommendation systems.
In particular, we formulate recommendation as a sequential decision problem.
Our experiments are carried out with high-fidelity industrial-grade simulators.
- Score: 14.937000494745861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern recommendation systems ought to benefit by probing for and learning
from delayed feedback. Research has tended to focus on learning from a user's
response to a single recommendation. Such work, which leverages methods of
supervised and bandit learning, forgoes learning from the user's subsequent
behavior. Where past work has aimed to learn from subsequent behavior, there
has been a lack of effective methods for probing to elicit informative delayed
feedback. Effective exploration through probing for delayed feedback becomes
particularly challenging when rewards are sparse. To address this, we develop
deep exploration methods for recommendation systems. In particular, we
formulate recommendation as a sequential decision problem and demonstrate
benefits of deep exploration over single-step exploration. Our experiments are
carried out with high-fidelity industrial-grade simulators and establish large
improvements over existing algorithms.
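For intuition, here is a minimal Python sketch of deep exploration in the bootstrapped-ensemble style: an ensemble of Q-networks is maintained, and each user session commits to one randomly sampled member, which produces multi-step (rather than single-step) exploration. The class names, architecture, and hyperparameters are illustrative assumptions, not details from the paper.
```python
# Minimal sketch of deep exploration via an ensemble of Q-networks
# (bootstrapped-DQN style). Architecture and hyperparameters are
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim, num_items, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_items),
        )

    def forward(self, state):
        return self.net(state)  # one Q-value per candidate item

class EnsembleRecommender:
    """At the start of each user session, sample one Q-network from the
    ensemble and act greedily with it for the whole session. Committing
    to one hypothesis across a session is what makes the exploration
    'deep' (multi-step) rather than single-step dithering."""

    def __init__(self, state_dim, num_items, ensemble_size=10):
        self.members = [QNet(state_dim, num_items) for _ in range(ensemble_size)]
        self.active = None  # set by begin_session() before recommending

    def begin_session(self):
        idx = torch.randint(len(self.members), (1,)).item()
        self.active = self.members[idx]

    def recommend(self, state):
        with torch.no_grad():
            q = self.active(torch.as_tensor(state, dtype=torch.float32))
        return int(q.argmax())
```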
Related papers
- Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback [22.89046164459011]
We present a technique called Human Guided Exploration (HuGE), which uses low-quality feedback from non-expert users.
HuGE guides exploration for reinforcement learning not only in simulation but also in the real world, all without meticulous reward specification.
arXiv Detail & Related papers (2023-07-20T17:30:37Z)
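A hedged sketch of the core HuGE ingredient described above: scoring candidate exploration goals from cheap, noisy pairwise human labels. The Bradley-Terry scorer and the toy data are illustrative assumptions, not the authors' exact method.
```python
# Sketch: score candidate exploration goals from noisy pairwise human
# labels ("which state looks closer to the goal?"). Illustrative only.
import numpy as np

def bradley_terry_scores(num_goals, comparisons, lr=0.1, steps=200):
    """comparisons: list of (winner, loser) goal-index pairs from humans.
    Fits one score per goal so that P(i beats j) = sigmoid(s_i - s_j)."""
    s = np.zeros(num_goals)
    for _ in range(steps):
        for i, j in comparisons:
            p = 1.0 / (1.0 + np.exp(-(s[i] - s[j])))
            s[i] += lr * (1 - p)  # push winner's score up
            s[j] -= lr * (1 - p)  # and loser's score down
    return s

# Direct goal-conditioned exploration toward the frontier state that
# the noisy human feedback ranks as most promising.
scores = bradley_terry_scores(4, [(2, 0), (2, 1), (3, 2), (3, 1)])
next_exploration_goal = int(np.argmax(scores))
```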
- Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation [7.489793155793319]
Reinforcement learning has emerged as a strong alternative for solving optimization tasks efficiently.
The effectiveness of these algorithms depends heavily on the feedback signals the environment provides to indicate how good (or bad) the agent's decisions are.
In this work, intrinsic motivation encourages the agent to explore the environment out of curiosity, while self-imitation learning replays the most promising experiences to accelerate learning.
arXiv Detail & Related papers (2022-11-30T09:18:59Z)
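A minimal sketch of the two ingredients this summary names: a curiosity bonus from a learned forward model's prediction error, and a self-imitation buffer that keeps only the highest-return episodes for replay. The linear predictor and buffer policy are illustrative assumptions.
```python
# Sketch: curiosity bonus (forward-model prediction error) plus
# self-imitation (replay of the best episodes). Illustrative only.
import numpy as np

class CuriosityBonus:
    """Intrinsic reward = squared error of a learned next-state
    predictor; states the model predicts poorly are 'novel'."""
    def __init__(self, state_dim, lr=1e-2):
        self.W = np.zeros((state_dim, state_dim))
        self.lr = lr

    def bonus(self, s, s_next):
        err = s_next - self.W @ s
        self.W += self.lr * np.outer(err, s)  # online predictor update
        return float(err @ err)

class SelfImitationBuffer:
    """Keep only the best episodes by return for the agent to re-train on."""
    def __init__(self, capacity=10):
        self.episodes = []  # list of (return, trajectory)
        self.capacity = capacity

    def add(self, ep_return, trajectory):
        self.episodes.append((ep_return, trajectory))
        self.episodes.sort(key=lambda e: e[0], reverse=True)
        del self.episodes[self.capacity:]
```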
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning [88.34958680436552]
We present an exploration method designed specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward that measures novelty through uncertainty in the learned reward.
Our experiments show that an exploration bonus derived from uncertainty in the learned reward improves both the feedback-efficiency and sample-efficiency of preference-based RL algorithms.
arXiv Detail & Related papers (2022-05-24T23:22:10Z)
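A short sketch of the stated idea: an exploration bonus from disagreement among an ensemble of learned reward models, added on top of the mean reward estimate. The linear models and bonus weight are illustrative assumptions.
```python
# Sketch: exploration bonus = disagreement (std) across an ensemble of
# reward models learned from preferences. Illustrative only.
import numpy as np

def uncertainty_bonus(state_action, reward_models, beta=1.0):
    """reward_models: callables mapping a feature vector to a scalar
    reward estimate. High disagreement = uncertain learned reward."""
    preds = np.array([m(state_action) for m in reward_models])
    return beta * preds.std()

# Example with a small ensemble of random linear reward estimates.
rng = np.random.default_rng(0)
models = [(lambda w: (lambda x: float(w @ x)))(rng.normal(size=4))
          for _ in range(5)]
x = np.ones(4)
total_reward = np.mean([m(x) for m in models]) + uncertainty_bonus(x, models)
```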
- Real-Time Learning from An Expert in Deep Recommendation Systems with Marginal Distance Probability Distribution [1.3535770763481902]
We develop a recommendation system that suggests daily exercise activities to users based on their history, profiles, and similar users.
The developed recommendation system uses a deep recurrent neural network with user-profile attention and temporal attention mechanisms.
We propose a real-time, expert-in-the-loop active learning procedure.
arXiv Detail & Related papers (2021-10-12T19:20:18Z)
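A hedged sketch of the architecture the previous summary describes: a recurrent network over activity history with temporal attention, combined with attention over user-profile features. All dimensions and layer choices are illustrative assumptions.
```python
# Sketch: recurrent recommender with temporal attention over history
# and attention over profile features. Dimensions are illustrative.
import torch
import torch.nn as nn

class AttentiveRecommender(nn.Module):
    def __init__(self, feat_dim, profile_dim, hidden, num_activities):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.temporal_attn = nn.Linear(hidden, 1)  # scores each time step
        self.profile_attn = nn.Linear(profile_dim, profile_dim)
        self.out = nn.Linear(hidden + profile_dim, num_activities)

    def forward(self, history, profile):
        h, _ = self.gru(history)                         # (B, T, hidden)
        w = torch.softmax(self.temporal_attn(h), dim=1)  # (B, T, 1)
        context = (w * h).sum(dim=1)                     # weighted history
        p = torch.softmax(self.profile_attn(profile), dim=-1) * profile
        return self.out(torch.cat([context, p], dim=-1))  # activity scores

model = AttentiveRecommender(feat_dim=8, profile_dim=5, hidden=16,
                             num_activities=10)
scores = model(torch.randn(2, 7, 8), torch.randn(2, 5))  # 2 users, 7 steps
```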
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
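A minimal sketch of the relabeling trick PEBBLE's summary refers to: when the preference-learned reward model is updated, rewards stored in the replay buffer are recomputed so off-policy learning stays consistent. The buffer layout and stand-in reward model are illustrative assumptions.
```python
# Sketch: relabel stored transitions with the current learned reward
# model so off-policy updates stay consistent. Illustrative only.
import numpy as np

def relabel_replay_buffer(buffer, reward_model):
    """buffer: list of dicts with 'state' and 'action' arrays.
    Overwrites each stored reward with the current model's estimate."""
    for transition in buffer:
        x = np.concatenate([transition["state"], transition["action"]])
        transition["reward"] = float(reward_model(x))
    return buffer

buffer = [{"state": np.ones(3), "action": np.zeros(2), "reward": 0.0}]
relabel_replay_buffer(buffer, lambda x: x.sum())  # stand-in reward model
```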
- Generative Inverse Deep Reinforcement Learning for Online Recommendation [62.09946317831129]
We propose a novel inverse reinforcement learning approach, namely InvRec, for online recommendation.
InvRec automatically extracts the reward function from users' behaviors for online recommendation.
arXiv Detail & Related papers (2020-11-04T12:12:25Z)
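A hedged sketch in the spirit of reward extraction from user behavior: a discriminator is trained to separate logged user transitions from the policy's, and its output serves as a learned reward. This is a generic adversarial-IRL recipe, not necessarily InvRec's exact formulation.
```python
# Sketch: discriminator-based reward extraction from logged behavior
# (generic adversarial-IRL recipe). Feature sizes are illustrative.
import torch
import torch.nn as nn

disc = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(user_batch, policy_batch):
    """user_batch / policy_batch: (B, 6) state-action features."""
    logits = torch.cat([disc(user_batch), disc(policy_batch)])
    labels = torch.cat([torch.ones(len(user_batch), 1),
                        torch.zeros(len(policy_batch), 1)])
    loss = bce(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()

def learned_reward(state_action):
    # Higher when the transition looks like real user behavior.
    with torch.no_grad():
        return torch.sigmoid(disc(state_action)).item()

train_step(torch.randn(8, 6), torch.randn(8, 6))  # one adversarial step
```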
- Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing that encourages exploration only when it is needed.
We perform an illustrative case study showing that it has the potential to both accelerate training and yield a better policy.
arXiv Detail & Related papers (2020-09-29T20:40:00Z)
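A minimal sketch of reannealing as summarized above: epsilon decays as usual but is reset upward when a heuristic detects stalled progress. The moving-best heuristic and constants are illustrative assumptions.
```python
# Sketch: epsilon decays normally but is "reannealed" (reset upward)
# when a heuristic signals stalled learning. Constants are illustrative.
class ReannealedEpsilon:
    def __init__(self, eps=1.0, decay=0.995, floor=0.05,
                 reheat=0.5, patience=50):
        self.eps, self.decay, self.floor = eps, decay, floor
        self.reheat, self.patience = reheat, patience
        self.best, self.stale = float("-inf"), 0

    def step(self, episode_return):
        self.eps = max(self.floor, self.eps * self.decay)  # usual decay
        if episode_return > self.best:
            self.best, self.stale = episode_return, 0
        else:
            self.stale += 1
        if self.stale >= self.patience:  # heuristic: progress stalled
            self.eps = max(self.eps, self.reheat)  # reanneal exploration
            self.stale = 0
        return self.eps
```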
- Knowledge Transfer via Pre-training for Recommendation: A Review and Prospect [89.91745908462417]
We demonstrate the benefits of pre-training for recommender systems through experiments.
We discuss several promising directions for future research on recommender systems with pre-training.
arXiv Detail & Related papers (2020-09-19T13:06:27Z)
- A Survey on Knowledge Graph-Based Recommender Systems [65.50486149662564]
We conduct a systematic survey of knowledge graph-based recommender systems.
We focus on how the papers utilize the knowledge graph for accurate and explainable recommendation.
We also introduce the datasets used in these works.
arXiv Detail & Related papers (2020-02-28T02:26:30Z)