Bandits Under The Influence (Extended Version)
- URL: http://arxiv.org/abs/2009.10135v1
- Date: Mon, 21 Sep 2020 19:02:00 GMT
- Title: Bandits Under The Influence (Extended Version)
- Authors: Silviu Maniu, Stratis Ioannidis, Bogdan Cautis
- Abstract summary: We present online recommendation algorithms rooted in the linear multi-armed bandit literature.
Our bandit algorithms are tailored precisely to recommendation scenarios where user interests evolve under social influence.
- Score: 14.829802725813868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recommender systems should adapt to user interests as the latter evolve. A
prevalent cause for the evolution of user interests is the influence of their
social circle. In general, when the interests are not known, online algorithms
that explore the recommendation space while also exploiting observed
preferences are preferable. We present online recommendation algorithms rooted
in the linear multi-armed bandit literature. Our bandit algorithms are tailored
precisely to recommendation scenarios where user interests evolve under social
influence. In particular, we show that our adaptations of the classic LinREL
and Thompson Sampling algorithms maintain the same asymptotic regret bounds as
in the non-social case. We validate our approach experimentally using both
synthetic and real datasets.
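The abstract does not spell out the update rules, but the following minimal Python sketch illustrates the setting it describes: a per-user linear bandit run with Thompson Sampling while the hidden user interest vectors drift under a social-influence matrix. The DeGroot-style averaging dynamics, all dimensions, and all parameter names below are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes and parameters (assumed, not from the paper).
n_users, d, n_items, T = 5, 8, 50, 200
sigma, lam = 0.1, 1.0  # reward noise level and ridge/prior parameter

# Item feature vectors: the "arms" of the linear bandit.
items = rng.normal(size=(n_items, d))
items /= np.linalg.norm(items, axis=1, keepdims=True)

# Hidden user interest profiles and a row-stochastic influence matrix W.
# Here interests evolve by DeGroot-style averaging, U_t = W @ U_{t-1};
# the actual social-influence dynamics in the paper may differ.
U = rng.normal(size=(n_users, d))
W = rng.random((n_users, n_users))
W /= W.sum(axis=1, keepdims=True)

# Per-user Gaussian posterior for Thompson Sampling: N(A^{-1} b, sigma^2 A^{-1}).
A = np.stack([lam * np.eye(d) for _ in range(n_users)])  # precision matrices
b = np.zeros((n_users, d))                                # response vectors

for t in range(T):
    U = W @ U  # social-influence drift of the (unobserved) interests
    for u in range(n_users):
        # Sample a plausible interest vector from the posterior, then
        # recommend the item that maximizes the sampled reward.
        A_inv = np.linalg.inv(A[u])
        theta = rng.multivariate_normal(A_inv @ b[u], sigma**2 * A_inv)
        arm = int(np.argmax(items @ theta))
        # Observe a noisy linear reward and update the posterior.
        reward = items[arm] @ U[u] + sigma * rng.normal()
        A[u] += np.outer(items[arm], items[arm])
        b[u] += reward * items[arm]
```

A LinREL/LinUCB-style variant would replace the posterior sample with an optimistic upper-confidence estimate built from the same A and b statistics; the paper's contribution is showing that such adaptations keep the standard asymptotic regret bounds despite the socially driven drift.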
Related papers
- When Online Algorithms Influence the Environment: A Dynamical Systems Analysis of the Unintended Consequences [5.4209739979186295]
We analyze the effect that online algorithms have on the environment they learn from.
We show that when the recommendation algorithm is able to learn the population preferences in the presence of this mismatch, the algorithm induces similarity in the preferences of the user population.
arXiv Detail & Related papers (2024-11-21T06:47:53Z) - Algorithmic Drift: A Simulation Framework to Study the Effects of Recommender Systems on User Preferences [7.552217586057245]
We propose a simulation framework that mimics user-recommender system interactions in a long-term scenario.
We introduce two novel metrics for quantifying the algorithm's impact on user preferences, specifically in terms of drift over time.
arXiv Detail & Related papers (2024-09-24T21:54:22Z) - Neural Dueling Bandits [58.90189511247936]
We use a neural network to estimate the reward function using preference feedback for the previously selected arms.
We then extend our theoretical results to contextual bandit problems with binary feedback, which is in itself a non-trivial contribution.
arXiv Detail & Related papers (2024-07-24T09:23:22Z) - Neural Contextual Bandits for Personalized Recommendation [49.85090929163639]
This tutorial investigates contextual bandits as a powerful framework for personalized recommendations.
We focus on the exploration perspective of contextual bandits to alleviate the "Matthew Effect" in recommender systems.
In addition to conventional linear contextual bandits, we also cover neural contextual bandits.
arXiv Detail & Related papers (2023-12-21T17:03:26Z) - Incentive-Aware Recommender Systems in Two-Sided Markets [49.692453629365204]
We propose a novel recommender system that aligns with agents' incentives while achieving myopically optimal performance.
Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets.
Both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation.
arXiv Detail & Related papers (2022-11-23T22:20:12Z) - Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system.
Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations.
While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z) - Bayesian Non-stationary Linear Bandits for Large-Scale Recommender Systems [6.009759445555003]
We build upon the linear contextual multi-armed bandit framework to address this problem.
We develop a decision-making policy for a linear bandit problem with high-dimensional feature vectors.
Our proposed recommender system employs this policy to learn the users' item preferences online while minimizing runtime.
arXiv Detail & Related papers (2022-02-07T13:51:19Z) - Control Variates for Slate Off-Policy Evaluation [112.35528337130118]
We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions.
We obtain new estimators with risk improvement guarantees over both the PI and self-normalized PI estimators.
arXiv Detail & Related papers (2021-06-15T06:59:53Z) - Learning User Preferences in Non-Stationary Environments [42.785926822853746]
We introduce a novel model for online non-stationary recommendation systems.
We show that our algorithm outperforms other static algorithms even when preferences do not change over time.
arXiv Detail & Related papers (2021-01-29T10:26:16Z) - Non-Stationary Latent Bandits [68.21614490603758]
We propose a practical approach for fast personalization to non-stationary users.
The key idea is to frame this problem as a latent bandit, where prototypical models of user behavior are learned offline and the latent state of the user is inferred online.
We propose Thompson sampling algorithms for regret minimization in non-stationary latent bandits, analyze them, and evaluate them on a real-world dataset (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-12-01T10:31:57Z) - A Soft Recommender System for Social Networks [1.8275108630751844]
Recent social recommender systems use the friendship graph to make more accurate recommendations.
We go a step further and identify true friends in order to make even more realistic recommendations.
We compute the similarity between users, as well as the dependency between a user and an item.
arXiv Detail & Related papers (2020-01-08T13:38:09Z)
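The Non-Stationary Latent Bandits entry above gives the most concrete recipe in this list: prototypical user models are learned offline, the user's current latent state is inferred online, and Thompson sampling over that state drives the recommendations. The Python sketch below illustrates that recipe under simplifying assumptions (Gaussian rewards, known prototypes, abrupt state switches, no posterior forgetting); it is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Offline-learned prototypical user models: K latent states, each giving a
# mean reward per arm (assumed known from offline data).
K, n_arms, T, sigma = 3, 10, 500, 0.1
prototypes = rng.random((K, n_arms))

# Ground truth: the user's latent state switches occasionally (non-stationary).
true_state = 0

# Thompson Sampling over the latent state: maintain a posterior over which
# prototype currently explains the observed rewards (log scale, uniform prior).
log_post = np.zeros(K)

for t in range(T):
    if t % 100 == 0:
        true_state = rng.integers(K)  # abrupt change of the latent state
    # Sample a latent state from the posterior and act greedily w.r.t. it.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    k = rng.choice(K, p=post)
    arm = int(np.argmax(prototypes[k]))
    reward = prototypes[true_state, arm] + sigma * rng.normal()
    # Gaussian likelihood update of the state posterior; a practical
    # non-stationary variant would also forget or reset this posterior.
    log_post += -0.5 * ((reward - prototypes[:, arm]) / sigma) ** 2
```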