Lusifer: LLM-based User SImulated Feedback Environment for online Recommender systems
- URL: http://arxiv.org/abs/2405.13362v1
- Date: Wed, 22 May 2024 05:43:15 GMT
- Title: Lusifer: LLM-based User SImulated Feedback Environment for online Recommender systems
- Authors: Danial Ebrat, Luis Rueda,
- Abstract summary: Lusifer is a novel environment that generates simulated user feedback.
It synthesizes user profiles and interaction histories to simulate responses and behaviors toward recommended items.
Using the MovieLens100K dataset as proof of concept, Lusifer demonstrates accurate emulation of user behavior and preferences.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training reinforcement learning-based recommender systems are often hindered by the lack of dynamic and realistic user interactions. Lusifer, a novel environment leveraging Large Language Models (LLMs), addresses this limitation by generating simulated user feedback. It synthesizes user profiles and interaction histories to simulate responses and behaviors toward recommended items. In addition, user profiles are updated after each rating to reflect evolving user characteristics. Using the MovieLens100K dataset as proof of concept, Lusifer demonstrates accurate emulation of user behavior and preferences. This paper presents Lusifer's operational pipeline, including prompt generation and iterative user profile updates. While validating Lusifer's ability to produce realistic dynamic feedback, future research could utilize this environment to train reinforcement learning systems, offering a scalable and adjustable framework for user simulation in online recommender systems.
Related papers
- WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback [28.317315761271804]
We introduce WildFeedback, a novel framework that leverages real-time, in-situ user interactions to create preference datasets that more accurately reflect authentic human values.
We apply this framework to a large corpus of user-LLM conversations, resulting in a rich preference dataset that reflects genuine user preferences.
Our experiments demonstrate that LLMs fine-tuned on WildFeedback exhibit significantly improved alignment with user preferences.
arXiv Detail & Related papers (2024-08-28T05:53:46Z) - A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems [14.646529557978512]
Conversational Recommender System (CRS) leverages real-time feedback from users to dynamically model their preferences.
Large Language Models (LLMs) has marked the onset of a new epoch in computational capabilities.
We introduce a Controllable, scalable, and human-Involved (CSHI) simulator framework that manages the behavior of user simulators.
arXiv Detail & Related papers (2024-05-13T03:02:56Z) - Learning Social Graph for Inactive User Recommendation [50.090904659803854]
LSIR learns an optimal social graph structure for social recommendation, especially for inactive users.
Experiments on real-world datasets demonstrate that LSIR achieves significant improvements of up to 129.58% on NDCG in inactive user recommendation.
arXiv Detail & Related papers (2024-05-08T03:40:36Z) - Rethinking the Evaluation for Conversational Recommendation in the Era
of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs)
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose an interactive Evaluation approach based on LLMs named iEvaLM that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z) - Sim2Rec: A Simulator-based Decision-making Approach to Optimize
Real-World Long-term User Engagement in Sequential Recommender Systems [43.31078296862647]
Long-term user engagement (LTE) optimization in sequential recommender systems (SRS) is suited by reinforcement learning (RL)
RL has its shortcomings, particularly requiring a large number of online samples for exploration.
We present a simulator-based recommender policy training approach, Simulation-to-Recommendation (Sim2Rec)
arXiv Detail & Related papers (2023-05-03T19:21:25Z) - PUNR: Pre-training with User Behavior Modeling for News Recommendation [26.349183393252115]
News recommendation aims to predict click behaviors based on user behaviors.
How to effectively model the user representations is the key to recommending preferred news.
We propose an unsupervised pre-training paradigm with two tasks, i.e. user behavior masking and user behavior generation.
arXiv Detail & Related papers (2023-04-25T08:03:52Z) - Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform.
Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z) - Simulating Bandit Learning from User Feedback for Extractive Question
Answering [51.97943858898579]
We study learning from user feedback for extractive question answering by simulating feedback using supervised data.
We show that systems initially trained on a small number of examples can dramatically improve given feedback from users on model-predicted answers.
arXiv Detail & Related papers (2022-03-18T17:47:58Z) - Improving Conversational Question Answering Systems after Deployment
using Feedback-Weighted Learning [69.42679922160684]
We propose feedback-weighted learning based on importance sampling to improve upon an initial supervised system using binary user feedback.
Our work opens the prospect to exploit interactions with real users and improve conversational systems after deployment.
arXiv Detail & Related papers (2020-11-01T19:50:34Z) - User Memory Reasoning for Conversational Recommendation [68.34475157544246]
We study a conversational recommendation model which dynamically manages users' past (offline) preferences and current (online) requests.
MGConvRex captures human-level reasoning over user memory and has disjoint training/testing sets of users for zero-shot (cold-start) reasoning for recommendation.
arXiv Detail & Related papers (2020-05-30T05:29:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.