Related papers: Where to Explore: A Reach and Cost-Aware Approach for Unbiased Data Collection in Recommender Systems

URL: http://arxiv.org/abs/2512.14733v1
Date: Thu, 11 Dec 2025 04:03:44 GMT
Title: Where to Explore: A Reach and Cost-Aware Approach for Unbiased Data Collection in Recommender Systems
Authors: Qiang Chen, Venkatesh Ganapati Hegde,
Abstract summary: This paper introduces an approach for delivering content-level exploration safely and efficiently by optimizing its placement based on reach and opportunity cost. Deployed on a large-scale streaming platform with over 100 million monthly active users, our approach identifies scroll-depth regions with lower engagement. Our method complements existing intra-row diversification and bandit-based exploration techniques by introducing a deployable, behaviorally informed mechanism for surfacing exploratory content at scale.
Score: 4.013613514540094
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Exploration is essential to improve long-term recommendation quality, but it often degrades short-term business performance, especially in remote-first TV environments where users engage passively, expect instant relevance, and offer few chances for correction. This paper introduces an approach for delivering content-level exploration safely and efficiently by optimizing its placement based on reach and opportunity cost. Deployed on a large-scale streaming platform with over 100 million monthly active users, our approach identifies scroll-depth regions with lower engagement and strategically introduces a dedicated container, the "Something Completely Different" row containing randomized content. Rather than enforcing exploration uniformly across the user interface (UI), we condition its appearance on empirically low-cost, high-reach positions to ensure minimal tradeoff against platform-level watch time goals. Extensive A/B testing shows that this strategy preserves business metrics while collecting unbiased interaction data. Our method complements existing intra-row diversification and bandit-based exploration techniques by introducing a deployable, behaviorally informed mechanism for surfacing exploratory content at scale. Moreover, we demonstrate that the collected unbiased data, integrated into downstream candidate generation, significantly improves user engagement, validating its value for recommender systems.
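
The placement idea in the abstract (put the exploration row where reach is high and opportunity cost is low) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `RowStats` fields, the `pick_exploration_row` helper, the definition of opportunity cost as reach times watch time per impression, and all numbers are hypothetical assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RowStats:
    """Hypothetical per-position statistics; not taken from the paper."""
    position: int      # 0-based row index (scroll depth)
    reach: float       # assumed: fraction of sessions that scroll to this row
    engagement: float  # assumed: watch time per impression at this row


def pick_exploration_row(rows: Sequence[RowStats], max_cost: float) -> Optional[int]:
    """Return the highest-reach position whose assumed opportunity cost
    (reach * engagement) does not exceed max_cost, or None if none qualifies."""
    best: Optional[RowStats] = None
    for row in rows:
        cost = row.reach * row.engagement  # assumed cost proxy
        if cost > max_cost:
            continue  # too expensive to displace the existing row
        if best is None or row.reach > best.reach:
            best = row  # ties keep the shallower position
    return None if best is None else best.position


# Made-up illustrative numbers: reach and engagement both fall with scroll depth.
rows = [
    RowStats(position=0, reach=1.00, engagement=12.0),
    RowStats(position=1, reach=0.80, engagement=6.0),
    RowStats(position=2, reach=0.55, engagement=1.5),
    RowStats(position=3, reach=0.30, engagement=1.0),
    RowStats(position=4, reach=0.10, engagement=0.8),
]

print(pick_exploration_row(rows, max_cost=1.0))
```

With these made-up numbers the top rows are excluded by the cost cap, and the sketch selects the shallowest remaining row because it still reaches the most sessions. A real system would presumably estimate both quantities from logged behavior and validate the choice with A/B tests, as the abstract describes.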

Related papers

Tree of Preferences for Diversified Recommendation [54.183647833064136]
We study diversified recommendation from a data-bias perspective. Inspired by the outstanding performance of large language models (LLMs) in zero-shot inference leveraging world knowledge, we propose a novel approach.
arXiv Detail & Related papers (2025-12-24T04:13:17Z)
Retentive Relevance: Capturing Long-Term User Value in Recommendation Systems [29.596401271139797]
We introduce Retentive Relevance, a novel content-level survey-based feedback measure. Retentive Relevance directly assesses users' intent to return to the platform for similar content. We show that Retentive Relevance significantly outperforms both engagement signals and other survey measures in predicting next-day retention.
arXiv Detail & Related papers (2025-10-08T23:38:57Z)
RecGPT Technical Report [57.84251629878726]
We propose RecGPT, a next-generation framework that places user intent at the center of the recommendation pipeline. RecGPT integrates large language models into key stages of user interest mining, item retrieval, and explanation generation. Online experiments demonstrate that RecGPT achieves consistent performance gains across stakeholders.
arXiv Detail & Related papers (2025-07-30T17:55:06Z)
Exploration on Demand: From Algorithmic Control to User Empowerment [0.0]
This paper introduces an adaptive clustering framework with user-controlled exploration that effectively balances personalization and diversity in movie recommendations. We propose a novel exploration mechanism that empowers users to control recommendation diversity by strategically sampling from less-engaged clusters. Our Large Language Model-based A/B testing methodology, conducted with 300 simulated users, reveals that 72.7% of long-term users prefer exploratory recommendations over purely exploitative ones.
arXiv Detail & Related papers (2025-07-29T14:57:26Z)
Optimization of Epsilon-Greedy Exploration [35.9674180611893]
We show that variations in the batch size across periods significantly influence the optimal exploration strategy. Our methods automatically calibrate exploration to the specific problem setting, consistently matching or outperforming the best for each setting.
arXiv Detail & Related papers (2025-06-03T19:14:53Z)
Item Level Exploration Traffic Allocation in Large-scale Recommendation Systems [7.207863744953401]
This paper contributes to addressing the item cold start problem in large-scale recommender systems. We propose an exploration system designed to efficiently allocate impressions to fresh items.
arXiv Detail & Related papers (2025-05-14T00:05:04Z)
Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms [68.51708490104687]
We show that a purely relevance-driven policy with low exploration strength boosts short-term user satisfaction but undermines the long-term richness of the content pool. Our findings reveal a fundamental trade-off between immediate user satisfaction and overall content production on platforms.
arXiv Detail & Related papers (2024-10-31T07:19:22Z)
Robust Recommender System: A Survey and Future Directions [58.87305602959857]
We first present a taxonomy to organize current techniques for withstanding malicious attacks and natural noise. We then explore state-of-the-art methods in each category, including fraudster detection, adversarial training, and certifiable robust training for defending against malicious attacks. We discuss robustness across varying recommendation scenarios and its interplay with other properties like accuracy, interpretability, privacy, and fairness.
arXiv Detail & Related papers (2023-09-05T08:58:46Z)
PIE: Personalized Interest Exploration for Large-Scale Recommender Systems [0.0]
We present a framework for exploration in large-scale recommender systems to address these challenges. Our methodology can be easily integrated into an existing large-scale recommender system with minimal modifications. Our work has been deployed in production on Facebook Watch, a popular video discovery and sharing platform serving billions of users.
arXiv Detail & Related papers (2023-04-13T22:25:09Z)
Personalizing Intervened Network for Long-tailed Sequential User Behavior Modeling [66.02953670238647]
Tail users suffer from significantly lower-quality recommendations than head users after joint training. A model trained on tail users separately still achieves inferior results due to limited data. We propose a novel approach that significantly improves the recommendation performance of the tail users.
arXiv Detail & Related papers (2022-08-19T02:50:19Z)
Incentivizing Exploration in Linear Bandits under Information Gap [50.220743323750035]
We study the problem of incentivizing exploration for myopic users in linear bandits. In order to maximize the long-term reward, the system offers compensation to incentivize the users to pull the exploratory arms.
arXiv Detail & Related papers (2021-04-08T16:01:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.