Sim2Rec: A Simulator-based Decision-making Approach to Optimize
Real-World Long-term User Engagement in Sequential Recommender Systems
- URL: http://arxiv.org/abs/2305.04832v1
- Date: Wed, 3 May 2023 19:21:25 GMT
- Title: Sim2Rec: A Simulator-based Decision-making Approach to Optimize
Real-World Long-term User Engagement in Sequential Recommender Systems
- Authors: Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie
Shang, Jieping Ye, Chen Ma
- Abstract summary: Long-term user engagement (LTE) optimization in sequential recommender systems (SRS) is well suited to reinforcement learning (RL).
However, RL has its shortcomings, particularly requiring a large number of online samples for exploration.
We present a simulator-based recommender policy training approach, Simulation-to-Recommendation (Sim2Rec).
- Score: 43.31078296862647
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Long-term user engagement (LTE) optimization in sequential recommender
systems (SRS) has been shown to be well suited to reinforcement learning (RL), which finds
a policy to maximize long-term rewards. However, RL has its shortcomings,
particularly requiring a large number of online samples for exploration, which
is risky in real-world applications. One of the appealing ways to avoid the
risk is to build a simulator and learn the optimal recommendation policy in the
simulator. In LTE optimization, the simulator must simulate multiple users'
daily feedback for given recommendations. However, building a user simulator
with no reality gap, i.e., one that predicts users' feedback exactly, is unrealistic
because users' reaction patterns are complex and historical logs for each
user are limited; the resulting gap might mislead the simulator-based recommendation
policy. In this paper, we present a practical simulator-based recommender
policy training approach, Simulation-to-Recommendation (Sim2Rec) to handle the
reality-gap problem for LTE optimization. Specifically, Sim2Rec introduces a
simulator set to generate various possibilities of user behavior patterns, then
trains an environment-parameter extractor to recognize users' behavior patterns
in the simulators. Finally, a context-aware policy is trained to make the
optimal decisions on all of the variants of the users based on the inferred
environment-parameters. The policy is transferable to unseen environments
(e.g., the real world) directly, as it has learned to recognize the various user
behavior patterns and to make the correct decisions based on the inferred
environment-parameters. Experiments are conducted in synthetic environments and
a real-world large-scale ride-hailing platform, DidiChuxing. The results show
that Sim2Rec achieves significant performance improvement, and produces robust
recommendations in unseen environments.
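The three steps described in the abstract (a simulator set, an environment-parameter extractor, and a context-aware policy) can be illustrated with a toy sketch. This is not the authors' implementation: the user model, the names `ToyUserSimulator`, `extract_env_param`, `context_aware_policy`, and `rollout`, and the hidden parameter `theta` are all hypothetical, and the extractor and policy here are hand-written rules standing in for the learned components the paper describes.

```python
import random
from typing import List, Tuple

History = List[Tuple[float, float]]  # (action, feedback) pairs


class ToyUserSimulator:
    """Hypothetical user model (an assumption, not from the paper).

    Feedback peaks when the action matches a hidden preference `theta`
    in [0, 1], which plays the role of the environment parameter.
    """

    def __init__(self, theta: float, noise: float = 0.05) -> None:
        self.theta = theta
        self.noise = noise

    def step(self, action: float, rng: random.Random) -> float:
        return 1.0 - abs(action - self.theta) + rng.gauss(0.0, self.noise)


def extract_env_param(history: History) -> float:
    """Hand-written stand-in for a learned extractor.

    For probes at actions 0 and 1, E[f1 - f0] = 2 * theta - 1,
    so theta can be estimated from the mean probe feedback.
    """
    f0 = [f for a, f in history if a == 0.0]
    f1 = [f for a, f in history if a == 1.0]
    if not f0 or not f1:
        return 0.5  # uninformed guess before both probes are seen
    est = (sum(f1) / len(f1) - sum(f0) / len(f0) + 1.0) / 2.0
    return min(1.0, max(0.0, est))


def context_aware_policy(history: History, n_probe: int = 4) -> float:
    """Probe first, then act on the inferred environment parameter."""
    if len(history) < n_probe:
        return float(len(history) % 2)  # alternate probes 0.0, 1.0
    return extract_env_param(history)


def rollout(sim: ToyUserSimulator, horizon: int, seed: int = 0) -> float:
    """Return the average feedback over one episode."""
    rng = random.Random(seed)
    history: History = []
    total = 0.0
    for _ in range(horizon):
        action = context_aware_policy(history)
        feedback = sim.step(action, rng)
        history.append((action, feedback))
        total += feedback
    return total / horizon


# A simulator set covering several behavior patterns, plus one user
# whose parameter does not appear in the set.
simulator_set = [ToyUserSimulator(theta=t) for t in (0.1, 0.3, 0.7, 0.9)]
unseen_user = ToyUserSimulator(theta=0.55)

if __name__ == "__main__":
    for sim in simulator_set + [unseen_user]:
        print(f"theta={sim.theta:.2f} avg_feedback={rollout(sim, horizon=50):.3f}")
```

Because the policy conditions only on the inferred parameter rather than on which simulator produced the data, the same rule applies unchanged to `unseen_user`. That is the transfer property the abstract claims, shown here only for this simplified setting.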
Related papers
- A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems [14.646529557978512]
A Conversational Recommender System (CRS) leverages real-time feedback from users to dynamically model their preferences.
Large Language Models (LLMs) have marked the onset of a new epoch in computational capabilities.
We introduce a Controllable, scalable, and human-Involved (CSHI) simulator framework that manages the behavior of user simulators.
arXiv Detail & Related papers (2024-05-13T03:02:56Z)
- How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation [14.646529557978512]
We analyze the limitations of using Large Language Models in constructing user simulators for Conversational Recommender Systems.
Data leakage, which occurs in conversational history and the user simulator's replies, results in inflated evaluation results.
We propose SimpleUserSim, employing a straightforward strategy to guide the topic toward the target items.
arXiv Detail & Related papers (2024-03-25T04:21:06Z)
- USimAgent: Large Language Models for Simulating Search Users [33.17004578463697]
Large Language Models (LLMs) have demonstrated remarkable potential in simulating human-level intelligence.
In this paper, we introduce an LLM-based user search behavior simulator, USimAgent.
The proposed simulator can simulate users' querying, clicking, and stopping behaviors during search, and thus, is capable of generating complete search sessions.
arXiv Detail & Related papers (2024-03-14T07:40:54Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- Metaphorical User Simulators for Evaluating Task-oriented Dialogue Systems [80.77917437785773]
Task-oriented dialogue systems (TDSs) are assessed mainly in an offline setting or through human evaluation.
We propose a metaphorical user simulator for end-to-end TDS evaluation, where we define a simulator to be metaphorical if it simulates users' analogical thinking in interactions with systems.
We also propose a tester-based evaluation framework to generate variants, i.e., dialogue systems with different capabilities.
arXiv Detail & Related papers (2022-04-02T05:11:03Z)
- Off Environment Evaluation Using Convex Risk Minimization [0.0]
We propose a convex risk minimization algorithm to estimate the model mismatch between the simulator and the target domain.
We show that this estimator can be used along with the simulator to evaluate the performance of RL agents in the target domain.
arXiv Detail & Related papers (2021-12-21T21:31:54Z)
- Auto-Tuned Sim-to-Real Transfer [143.44593793640814]
Policies trained in simulation often fail when transferred to the real world.
Current approaches to tackle this problem, such as domain randomization, require prior knowledge and engineering.
We propose a method for automatically tuning simulator system parameters to match the real world.
arXiv Detail & Related papers (2021-04-15T17:59:55Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
- A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for studying algorithms that aim to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well-known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)