Regret in Online Recommendation Systems
- URL: http://arxiv.org/abs/2010.12363v1
- Date: Fri, 23 Oct 2020 12:48:35 GMT
- Title: Regret in Online Recommendation Systems
- Authors: Kaito Ariu, Narae Ryu, Se-Young Yun, Alexandre Proutière
- Abstract summary: This paper proposes a theoretical analysis of recommendation systems in an online setting, where items are sequentially recommended to users over time.
In each round, a user, randomly picked from a population of $m$ users, requests a recommendation. The decision-maker observes the user and selects an item from a catalogue of $n$ items.
The performance of the recommendation algorithm is captured through its regret, measured against an Oracle algorithm that knows the probabilities that users like each item.
- Score: 73.58127515175127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a theoretical analysis of recommendation systems in an
online setting, where items are sequentially recommended to users over time. In
each round, a user, randomly picked from a population of $m$ users, requests a
recommendation. The decision-maker observes the user and selects an item from a
catalogue of $n$ items. Importantly, an item cannot be recommended twice to the
same user. The probabilities that a user likes each item are unknown. The
performance of the recommendation algorithm is captured through its regret,
considering as a reference an Oracle algorithm aware of these probabilities. We
investigate various structural assumptions on these probabilities: for each
structure, we derive regret lower bounds and devise algorithms achieving these
limits. Interestingly, our analysis reveals the relative weights of the
different components of regret: the component due to the constraint of not
presenting the same item twice to the same user, that due to learning the
chances users like items, and finally that arising when learning the underlying
structure.
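To make the protocol concrete, here is a minimal simulation sketch of the setting above. It assumes the simplest structure, in which all users share a single vector of like-probabilities, and pairs the Oracle with a generic UCB-style learner; the learner, the sizes, and all variable names are illustrative assumptions, not the algorithms devised in the paper.

```python
import numpy as np

# Minimal sketch of the protocol in the abstract, NOT the paper's algorithm.
# Illustrative assumptions: all users share one probability vector q (the
# simplest structural assumption), and the learner is a generic UCB-style
# index policy with statistics pooled across users.

rng = np.random.default_rng(0)
m, n, T = 50, 100, 1000    # users, items, rounds; sized so no user
                           # exhausts the catalogue under the no-repeat rule
q = rng.uniform(size=n)    # unknown P(a user likes item i)

clicks = np.zeros(n)       # observed likes per item (pooled across users)
plays = np.zeros(n)        # times each item was recommended
shown = [set() for _ in range(m)]         # learner: items already shown per user
shown_oracle = [set() for _ in range(m)]  # the Oracle faces the same constraint

regret = 0.0
for t in range(1, T + 1):
    u = int(rng.integers(m))  # a user arrives uniformly at random

    # Oracle: best item not yet recommended to u, under the known q.
    i_star = max((i for i in range(n) if i not in shown_oracle[u]),
                 key=lambda i: q[i])
    shown_oracle[u].add(i_star)

    # Learner: optimistic index over the items not yet shown to u.
    def index(i):
        if plays[i] == 0:
            return float("inf")  # force one look at every item
        return clicks[i] / plays[i] + np.sqrt(2.0 * np.log(t) / plays[i])

    i_t = max((i for i in range(n) if i not in shown[u]), key=index)
    shown[u].add(i_t)

    plays[i_t] += 1
    clicks[i_t] += rng.random() < q[i_t]  # Bernoulli like/dislike feedback
    regret += q[i_star] - q[i_t]          # expected-regret increment

print(f"cumulative expected regret after {T} rounds: {regret:.1f}")
```

The sketch only illustrates the interaction protocol and the Oracle benchmark; the paper's decomposition of regret into its no-repeat, probability-learning, and structure-learning components comes from its formal analysis, not from this simulation.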
Related papers
- Misalignment, Learning, and Ranking: Harnessing Users Limited Attention [16.74322664734553]
We study the design of online algorithms that obtain vanishing regret against optimal benchmarks.
We show how standard algorithms for adversarial online linear optimization can be used to obtain a payoff-time algorithm with $O(\sqrt{T})$ regret.
arXiv Detail & Related papers (2024-02-21T18:52:20Z) - Recommendation Systems with Distribution-Free Reliability Guarantees [83.80644194980042]
We show how to return a set of items rigorously guaranteed to contain mostly good items.
Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate.
We evaluate our methods on the Yahoo! Learning to Rank and MSMarco datasets.
arXiv Detail & Related papers (2022-07-04T17:49:25Z) - Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system.
Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations.
While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z) - Modeling Attrition in Recommender Systems with Departing Bandits [84.85560764274399]
We propose a novel multi-armed bandit setup that captures policy-dependent horizons.
We first address the case where all users share the same type, demonstrating that a recent UCB-based algorithm is optimal.
We then move on to the more challenging case, where users are divided between two types.
arXiv Detail & Related papers (2022-03-25T02:30:54Z) - FINN.no Slates Dataset: A new Sequential Dataset Logging Interactions,
all Viewed Items and Click Responses/No-Click for Recommender Systems Research [4.792216056979392]
We present a novel recommender systems dataset that records the sequential interactions between users and an online marketplace.
The dataset includes the presented slates at each round, whether the user clicked on any of these items and which item the user clicked on.
arXiv Detail & Related papers (2021-11-05T09:21:58Z) - Dynamic-K Recommendation with Personalized Decision Boundary [41.70842736417849]
We develop a dynamic-K recommendation task as a joint learning problem with both ranking and classification objectives.
We extend two state-of-the-art ranking-based recommendation methods, i.e., BPRMF and HRM, to the corresponding dynamic-K versions.
Our experimental results on two datasets show that the dynamic-K models are more effective than the original fixed-N recommendation methods.
arXiv Detail & Related papers (2020-12-25T13:02:57Z) - Learning over no-Preferred and Preferred Sequence of items for Robust
Recommendation [66.8722561224499]
We propose a theoretically founded sequential strategy for training large-scale Recommender Systems (RS) over implicit feedback.
We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach.
arXiv Detail & Related papers (2020-12-12T22:10:15Z) - Contextual User Browsing Bandits for Large-Scale Online Mobile
Recommendation [24.810164687987243]
For a given item, a higher position in the list leads to more clicks.
Only a few recommended items are visible at first glance, and users need to slide the screen to browse the others.
Items ranked lower may therefore never be viewed, so it is not appropriate to treat them as negative samples.
arXiv Detail & Related papers (2020-08-21T08:22:30Z) - Controllable Multi-Interest Framework for Recommendation [64.30030600415654]
We formalize the recommender system as a sequential recommendation problem.
We propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec.
Our framework has been successfully deployed on the offline Alibaba distributed cloud platform.
arXiv Detail & Related papers (2020-05-19T10:18:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.