Related papers: UOEP: User-Oriented Exploration Policy for Enhancing Long-Term User Experiences in Recommender Systems

URL: http://arxiv.org/abs/2401.09034v2
Date: Wed, 22 May 2024 01:01:11 GMT
Title: UOEP: User-Oriented Exploration Policy for Enhancing Long-Term User Experiences in Recommender Systems
Authors: Changshuo Zhang, Sirui Chen, Xiao Zhang, Sunhao Dai, Weijie Yu, Jun Xu
Abstract summary: Reinforcement learning (RL) has gained traction for enhancing user long-term experiences in recommender systems. Modern recommender systems exhibit distinct user behavioral patterns among tens of millions of items, which increases the difficulty of exploration. We propose User-Oriented Exploration Policy (UOEP), a novel approach facilitating fine-grained exploration among user groups.
Score: 7.635117537731915
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement learning (RL) has gained traction for enhancing user long-term experiences in recommender systems by effectively exploring users' interests. However, modern recommender systems exhibit distinct user behavioral patterns among tens of millions of items, which increases the difficulty of exploration. For example, user behaviors with different activity levels require varying intensity of exploration, while previous studies often overlook this aspect and apply a uniform exploration strategy to all users, which ultimately hurts user experiences in the long run. To address these challenges, we propose User-Oriented Exploration Policy (UOEP), a novel approach facilitating fine-grained exploration among user groups. We first construct a distributional critic which allows policy optimization under varying quantile levels of cumulative reward feedback from users, representing user groups with varying activity levels. Guided by this critic, we devise a population of distinct actors, each aimed at effective and fine-grained exploration within its respective user group. To simultaneously enhance diversity and stability during the exploration process, we further introduce a population-level diversity regularization term and a supervision module. Experimental results on public recommendation datasets demonstrate that our approach outperforms all other baselines in terms of long-term performance, validating its user-oriented exploration effectiveness. Meanwhile, further analyses reveal our approach's benefits of improved performance for low-activity users as well as increased fairness among users.
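Illustration: the abstract's idea of optimizing each actor under a different quantile level of cumulative reward can be sketched roughly as below. This is not the authors' implementation. The equal-width bands, the midpoint quantile levels, and the names `band_objective`, `population_objectives`, and `critic_quantiles` are all assumptions made for illustration, and the numbers are made up.

```python
from typing import List, Sequence


def band_objective(quantiles: Sequence[float], lower: float, upper: float) -> float:
    """Average the return quantiles whose levels fall in (lower, upper]."""
    n = len(quantiles)
    ordered = sorted(quantiles)
    # Assumed midpoint quantile levels tau_i = (i + 0.5) / n for i = 0..n-1.
    selected = [q for i, q in enumerate(ordered) if lower < (i + 0.5) / n <= upper]
    if not selected:
        raise ValueError("no quantile level falls inside the band")
    return sum(selected) / len(selected)


def population_objectives(quantiles: Sequence[float], num_actors: int) -> List[float]:
    """Split [0, 1] into equal bands (an assumption), one per actor, and score each."""
    return [
        band_objective(quantiles, k / num_actors, (k + 1) / num_actors)
        for k in range(num_actors)
    ]


# Hypothetical critic output: 8 quantile estimates of cumulative reward.
critic_quantiles = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
objectives = population_objectives(critic_quantiles, num_actors=4)
print(objectives)
```

In this toy setup the first actor's objective covers the lowest quantiles, which would correspond to low-activity users if low cumulative reward is taken as a proxy for low activity. The last actor covers the highest quantiles.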

Related papers

RecGPT Technical Report [57.84251629878726]
We propose RecGPT, a next-generation framework that places user intent at the center of the recommendation pipeline. RecGPT integrates large language models into key stages of user interest mining, item retrieval, and explanation generation. Online experiments demonstrate that RecGPT achieves consistent performance gains across stakeholders.
arXiv Detail & Related papers (2025-07-30T17:55:06Z)
Exploration on Demand: From Algorithmic Control to User Empowerment [0.0]
This paper introduces an adaptive clustering framework with user-controlled exploration that effectively balances personalization and diversity in movie recommendations. We propose a novel exploration mechanism that empowers users to control recommendation diversity by strategically sampling from less-engaged clusters. Our Large Language Model-based A/B testing methodology, conducted with 300 simulated users, reveals that 72.7% of long-term users prefer exploratory recommendations over purely exploitative ones.
arXiv Detail & Related papers (2025-07-29T14:57:26Z)
Thought-Augmented Planning for LLM-Powered Interactive Recommender Agent [56.61028117645315]
We propose a novel thought-augmented interactive recommender agent system (TAIRA) that addresses complex user intents through distilled thought patterns. Specifically, TAIRA is designed as an LLM-powered multi-agent system featuring a manager agent that orchestrates recommendation tasks by decomposing user needs and planning subtasks. Through comprehensive experiments conducted across multiple datasets, TAIRA exhibits significantly enhanced performance compared to existing methods.
arXiv Detail & Related papers (2025-06-30T03:15:50Z)
Compositions of Variant Experts for Integrating Short-Term and Long-Term Preferences [22.769456275892477]
We propose a framework that combines short- and long-term preferences to enhance recommendation performance. This novel framework dynamically integrates short- and long-term preferences through the use of different specialized recommendation models.
arXiv Detail & Related papers (2025-06-29T10:09:33Z)
Multi-agents based User Values Mining for Recommendation [52.26100802380767]
We propose a zero-shot multi-LLM collaborative framework for effective and accurate user value extraction. We apply text summarization techniques to condense item content while preserving essential meaning. To mitigate hallucinations, we introduce two specialized agent roles: evaluators and supervisors.
arXiv Detail & Related papers (2025-05-02T04:01:31Z)
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems [2.0378554336804013]
Stratified Expert Cloning (SEC) is a novel imitation learning framework that effectively leverages logged data from high-retention users to learn robust recommendation policies. SEC introduces three key innovations: 1) a multi-level expert stratification strategy that captures the nuances in expert user behaviors at different retention levels; 2) an adaptive expert selection mechanism that dynamically assigns users to the most suitable policy based on their current state and historical retention level; and 3) an action entropy regularization technique that promotes recommendation diversity and mitigates the risk of policy collapse.
arXiv Detail & Related papers (2025-04-08T03:10:42Z)
Interactive Visualization Recommendation with Hier-SUCB [52.11209329270573]
We propose an interactive personalized visualization recommendation (PVisRec) system that learns on user feedback from previous interactions. For more interactive and accurate recommendations, we propose Hier-SUCB, a contextual semi-bandit in the PVisRec setting.
arXiv Detail & Related papers (2025-02-05T17:14:45Z)
Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms [68.51708490104687]
We show that a purely relevance-driven policy with low exploration strength boosts short-term user satisfaction but undermines the long-term richness of the content pool. Our findings reveal a fundamental trade-off between immediate user satisfaction and overall content production on platforms.
arXiv Detail & Related papers (2024-10-31T07:19:22Z)
Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems. We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z)
Negative Sampling in Recommendation: A Survey and Future Directions [43.11318243903388]
Negative sampling is proficient in revealing the genuine negative aspect inherent in user behaviors. We conduct an extensive literature review on the existing negative sampling strategies in recommendation. We detail the insights of the tailored negative sampling strategies in diverse recommendation scenarios.
arXiv Detail & Related papers (2024-09-11T12:48:52Z)
Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content [66.71102704873185]
We test for user strategization by conducting a lab experiment and survey. We find strong evidence of strategization across outcome metrics, including participants' dwell time and use of "likes." Our findings suggest that platforms cannot ignore the effect of their algorithms on user behavior.
arXiv Detail & Related papers (2024-05-09T07:36:08Z)
PIE: Personalized Interest Exploration for Large-Scale Recommender Systems [0.0]
We present a framework for exploration in large-scale recommender systems to address these challenges. Our methodology can be easily integrated into an existing large-scale recommender system with minimal modifications. Our work has been deployed in production on Facebook Watch, a popular video discovery and sharing platform serving billions of users.
arXiv Detail & Related papers (2023-04-13T22:25:09Z)
PARSRec: Explainable Personalized Attention-fused Recurrent Sequential Recommendation Using Session Partial Actions [0.5801044612920815]
We propose an architecture that relies on common patterns as well as individual behaviors to tailor its recommendations for each person. Our empirical results on the Nielsen Consumer Panel dataset indicate that the proposed approach achieves up to 27.9% performance improvement.
arXiv Detail & Related papers (2022-09-16T12:07:43Z)
Personalizing Intervened Network for Long-tailed Sequential User Behavior Modeling [66.02953670238647]
Tail users suffer from significantly lower-quality recommendations than head users after joint training. A model trained on tail users separately still achieves inferior results due to limited data. We propose a novel approach that significantly improves the recommendation performance of the tail users.
arXiv Detail & Related papers (2022-08-19T02:50:19Z)
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation. In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor. Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks.
arXiv Detail & Related papers (2022-03-18T16:50:38Z)
An Empirical analysis on Transparent Algorithmic Exploration in Recommender Systems [17.91522677924348]
We propose a new approach for feedback elicitation without any deception and compare our approach to the conventional mix-in approach for evaluation. Our results indicated that users left significantly more feedback on items chosen for exploration with our interface.
arXiv Detail & Related papers (2021-07-31T05:08:29Z)
Exploration-Exploitation Motivated Variational Auto-Encoder for Recommender Systems [1.52292571922932]
We introduce an exploitation-exploration motivated variational auto-encoder (XploVAE) to collaborative filtering. To facilitate personalized recommendations, we construct user-specific subgraphs, which contain the first-order proximity capturing observed user-item interactions. A hierarchical latent space model is utilized to learn the personalized item embedding for a given user, along with the population distribution of all user subgraphs.
arXiv Detail & Related papers (2020-06-05T17:37:46Z)
Empowering Active Learning to Jointly Optimize System and User Demands [70.66168547821019]
We propose a new active learning approach that jointly optimizes the active learning system (training efficiently) and the user (receiving useful instances). We study our approach in an educational application, which particularly benefits from this technique as the system needs to rapidly learn to predict the appropriateness of an exercise to a particular user. We evaluate multiple learning strategies and user types with data from real users and find that our joint approach better satisfies both objectives when alternative methods lead to many unsuitable exercises for end users.
arXiv Detail & Related papers (2020-05-09T16:02:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.