Exploration in two-stage recommender systems
- URL: http://arxiv.org/abs/2009.08956v1
- Date: Tue, 1 Sep 2020 16:52:51 GMT
- Title: Exploration in two-stage recommender systems
- Authors: Jiri Hron and Karl Krauth and Michael I. Jordan and Niki Kilbertus
- Abstract summary: Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
- Score: 79.50534282841618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-stage recommender systems are widely adopted in industry due to their
scalability and maintainability. These systems produce recommendations in two
steps: (i) multiple nominators preselect a small number of items from a large
pool using cheap-to-compute item embeddings; (ii) with a richer set of
features, a ranker rearranges the nominated items and serves them to the user.
A key challenge of this setup is that optimal performance of each stage in
isolation does not imply optimal global performance. In response to this issue,
Ma et al. (2020) proposed a nominator training objective importance weighted by
the ranker's probability of recommending each item. In this work, we focus on
the complementary issue of exploration. Modeling the problem as a contextual
bandit, we find that LinUCB (a near-optimal exploration strategy for
single-stage systems) may lead to linear regret when deployed in two-stage
recommenders. We therefore
propose a method of synchronising the exploration strategies between the ranker
and the nominators. Our algorithm only relies on quantities already computed by
standard LinUCB at each stage and can be implemented in three lines of
additional code. We end by demonstrating the effectiveness of our algorithm
experimentally.
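To make the setup in the abstract concrete, the following is a minimal sketch of standard LinUCB run independently at each stage of a two-stage recommender. It is not the authors' synchronised algorithm, which the abstract does not specify. The class `LinUCB`, the function `two_stage_step`, the parameters `alpha` and `reg`, the assumption that each nominator nominates exactly one item, and the toy data in the demo are all hypothetical choices made for illustration.

```python
import math
import random


class LinUCB:
    """Standard LinUCB for a linear reward model (illustrative sketch)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration strength (assumed hyperparameter)
        # Inverse of A = reg * I + sum of x x^T, kept up to date incrementally.
        self.A_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * x

    def _matvec(self, v):
        return [sum(row[j] * v[j] for j in range(self.dim)) for row in self.A_inv]

    def ucb(self, x):
        """Estimated mean reward plus confidence width for feature vector x."""
        theta = self._matvec(self.b)  # ridge-regression estimate
        mean = sum(t * xi for t, xi in zip(theta, x))
        a_x = self._matvec(x)
        width = math.sqrt(max(sum(xi * ai for xi, ai in zip(x, a_x)), 0.0))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A_inv, then update b."""
        a_x = self._matvec(x)
        denom = 1.0 + sum(xi * ai for xi, ai in zip(x, a_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= a_x[i] * a_x[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def two_stage_step(nominators, pools, ranker, nom_feats, rank_feats):
    """One round: each nominator proposes one item, the ranker serves one."""
    # Stage (i): each nominator picks the highest-UCB item in its own pool,
    # using the cheap nominator features.
    candidates = [max(pool, key=lambda i: nom.ucb(nom_feats[i]))
                  for nom, pool in zip(nominators, pools)]
    # Stage (ii): the ranker picks among the candidates using richer features.
    chosen = max(candidates, key=lambda i: ranker.ucb(rank_feats[i]))
    return chosen, candidates


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data (assumed): 6 items, 2 cheap features, 1 extra ranker feature.
    nom_feats = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(6)]
    rank_feats = [f + [rng.gauss(0, 1)] for f in nom_feats]
    true_theta = [0.5, -0.2, 0.8]
    pools = [[0, 1, 2], [3, 4, 5]]
    nominators = [LinUCB(2), LinUCB(2)]
    ranker = LinUCB(3)
    for _ in range(200):
        chosen, _ = two_stage_step(nominators, pools, ranker, nom_feats, rank_feats)
        reward = sum(t * x for t, x in zip(true_theta, rank_feats[chosen]))
        reward += rng.gauss(0, 0.1)
        # Each stage learns on its own, with no synchronisation between stages.
        ranker.update(rank_feats[chosen], reward)
        for nom, pool in zip(nominators, pools):
            if chosen in pool:
                nom.update(nom_feats[chosen], reward)
    print("last served item:", chosen)
```

In this sketch the nominators and the ranker each maintain their own confidence widths and never share them, which is the kind of unsynchronised exploration the abstract says can lead to linear regret.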
Related papers
- Unleashing the Potential of Multi-Channel Fusion in Retrieval for Personalized Recommendations [33.79863762538225]
A key challenge in recommender systems (RS) is efficiently processing vast item pools to deliver highly personalized recommendations under strict latency constraints.
In this paper, we explore advanced channel fusion strategies by assigning systematically optimized weights to each channel.
Our methods enhance both personalization and flexibility, achieving significant performance improvements across multiple datasets and yielding substantial gains in real-world deployments.
arXiv Detail & Related papers (2024-10-21T14:58:38Z) - A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems [67.52782366565658]
State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs (LERSs), evaluation protocols vary widely.
This study investigates the performance, efficiency, and cross-task transferability of various LERSs via a thorough benchmarking process.
arXiv Detail & Related papers (2024-06-25T07:45:00Z) - Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z) - Fairness in the First Stage of Two-Stage Recommender Systems [28.537935838669423]
We investigate how to ensure fairness to the items in large-scale recommender systems.
Existing first-stage recommenders might select an irrecoverably unfair set of candidates.
We propose two threshold-policy selection rules that find near-optimal sets of candidates.
arXiv Detail & Related papers (2022-05-30T21:21:38Z) - Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning [68.45370492516531]
We introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the Recommender Systems (RS) setting.
The SMORL agent augments standard recommendation models with additional RL layers that force it to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations.
Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, reduced repetitiveness of recommendations, and demonstrate the importance of reinforcing diversity and novelty as complementary objectives.
arXiv Detail & Related papers (2021-10-28T13:22:45Z) - On component interactions in two-stage recommender systems [82.38014314502861]
Two-stage recommenders are used by many online platforms, including YouTube, LinkedIn, and Pinterest.
We show that interactions between the ranker and the nominators substantially affect the overall performance.
In particular, using a Mixture-of-Experts approach, we train the nominators to specialize on different subsets of the item pool.
arXiv Detail & Related papers (2021-06-28T20:53:23Z) - Sparse Reward Exploration via Novelty Search and Emitters [55.41644538483948]
We introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm.
SERENE separates the search space exploration and reward exploitation into two alternating processes.
A meta-scheduler allocates a global computational budget by alternating between the two processes.
arXiv Detail & Related papers (2021-02-05T12:34:54Z) - Sample-Rank: Weak Multi-Objective Recommendations Using Rejection Sampling [0.5156484100374059]
We introduce a method involving multi-goal sampling followed by ranking for user-relevance (Sample-Rank) to nudge recommendations towards multi-objective goals of the marketplace.
The proposed method's novelty is that it reduces the multi-objective (MO) recommendation problem to sampling from a desired multi-goal distribution, then uses the samples to build a production-friendly learning-to-rank model.
arXiv Detail & Related papers (2020-08-24T09:17:18Z) - Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations [21.68175843347951]
We present Deep Retrieval (DR), which learns a retrievable structure directly from user-item interaction data.
DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.
arXiv Detail & Related papers (2020-07-12T06:23:51Z)