Leveraging heterogeneous spillover effects in maximizing contextual
bandit rewards
- URL: http://arxiv.org/abs/2310.10259v1
- Date: Mon, 16 Oct 2023 10:34:41 GMT
- Title: Leveraging heterogeneous spillover effects in maximizing contextual
bandit rewards
- Authors: Ahmed Sayeed Faruk, Elena Zheleva
- Abstract summary: We propose a framework that allows contextual multi-armed bandits to account for such heterogeneous spillovers.
Our proposed method leads to significantly higher rewards than existing solutions that ignore spillover.
- Score: 12.533920403498453
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recommender systems relying on contextual multi-armed bandits continuously
improve relevant item recommendations by taking into account the contextual
information. The objective of these bandit algorithms is to learn the best arm
(i.e., best item to recommend) for each user and thus maximize the cumulative
rewards from user engagement with the recommendations. However, current
approaches ignore potential spillover between interacting users, where the
action of one user can impact the actions and rewards of other users. Moreover,
spillover may vary for different people based on their preferences and the
closeness of ties to other users. This leads to heterogeneity in the spillover
effects, i.e., the extent to which the action of one user can impact the action
of another. Here, we propose a framework that allows contextual multi-armed
bandits to account for such heterogeneous spillovers when choosing the best arm
for each user. By experimenting on several real-world datasets using prominent
linear and non-linear contextual bandit algorithms, we observe that our
proposed method leads to significantly higher rewards than existing solutions
that ignore spillover.
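The abstract describes the mechanism only in general terms; as a rough illustration of the idea (not the authors' algorithm), the sketch below shows how a greedy contextual bandit could fold estimated neighbor spillover into its per-arm scores. The tie matrix, the additive spillover model, and all function names here are hypothetical assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_arms, dim = 4, 3, 5
theta = rng.normal(size=(n_arms, dim))       # hidden per-arm reward weights (toy)
contexts = rng.normal(size=(n_users, dim))   # per-user context vectors
# Hypothetical weighted social network: ties[i, j] = strength of j's influence on i.
ties = np.array([[0.0, 0.5, 0.0, 0.0],
                 [0.5, 0.0, 0.2, 0.0],
                 [0.0, 0.2, 0.0, 0.7],
                 [0.0, 0.0, 0.7, 0.0]])

def pull(user, arm, chosen):
    """Realized reward = direct effect + heterogeneous spillover from
    neighbors currently assigned the same arm (toy additive model)."""
    direct = contexts[user] @ theta[arm]
    spill = sum(ties[user, j] for j in range(n_users)
                if j != user and chosen[j] == arm)
    return direct + spill

def choose(user, chosen, estimates, use_spillover):
    """Score each arm by its estimated direct reward; a spillover-aware
    policy additionally credits arms that neighbors are already playing,
    while a spillover-blind baseline ignores the network entirely."""
    scores = estimates[user].copy()
    if use_spillover:
        for a in range(n_arms):
            scores[a] += sum(ties[user, j] for j in range(n_users)
                             if j != user and chosen[j] == a)
    return int(np.argmax(scores))
```

With identical direct-reward estimates, the two policies differ only on users with active ties: the aware policy shifts toward arms already adopted by strongly connected neighbors, which is the kind of heterogeneous spillover the paper argues existing bandits leave on the table.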
Related papers
- Neural Dueling Bandits [58.90189511247936]
We use a neural network to estimate the reward function using preference feedback for the previously selected arms.
We then extend our theoretical results to contextual bandit problems with binary feedback, which is in itself a non-trivial contribution.
arXiv Detail & Related papers (2024-07-24T09:23:22Z)
- Contrastive Learning Method for Sequential Recommendation based on Multi-Intention Disentanglement [5.734747179463411]
We propose a Contrastive Learning sequential recommendation method based on Multi-Intention Disentanglement (MIDCL)
In our work, intentions are recognized as dynamic and diverse, and user behaviors are often driven by current multi-intentions.
We propose two types of contrastive learning paradigms for identifying the user's most relevant interaction intentions and maximizing the mutual information of positive sample pairs.
arXiv Detail & Related papers (2024-04-28T15:13:36Z)
- Incentive-Aware Recommender Systems in Two-Sided Markets [49.692453629365204]
We propose a novel recommender system that aligns with agents' incentives while achieving myopically optimal performance.
Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets.
Both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation.
arXiv Detail & Related papers (2022-11-23T22:20:12Z)
- Recommendation with User Active Disclosing Willingness [20.306413327597603]
We study a novel recommendation paradigm, where the users are allowed to indicate their "willingness" on disclosing different behaviors.
We conduct extensive experiments to demonstrate the effectiveness of our model on balancing the recommendation quality and user disclosing willingness.
arXiv Detail & Related papers (2022-10-25T04:43:40Z)
- Selectively Contextual Bandits [11.438194383787604]
We propose a new online learning algorithm that preserves benefits of personalization while increasing the commonality in treatments across users.
Our approach selectively interpolates between a contextual bandit algorithm and a context-free multi-armed bandit.
We evaluate our approach in a classification setting using public datasets and show the benefits of the hybrid policy.
arXiv Detail & Related papers (2022-05-09T19:47:46Z)
- Modeling Attrition in Recommender Systems with Departing Bandits [84.85560764274399]
We propose a novel multi-armed bandit setup that captures policy-dependent horizons.
We first address the case where all users share the same type, demonstrating that a recent UCB-based algorithm is optimal.
We then move forward to the more challenging case, where users are divided among two types.
arXiv Detail & Related papers (2022-03-25T02:30:54Z)
- BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System [0.0]
Multi-armed bandits (MAB) provide a principled online learning approach to attain the balance between exploration and exploitation.
Collaborative filtering (CF) is arguably the earliest and most influential method in recommender systems.
BanditMF is designed to address two challenges in the multi-armed bandits algorithm and collaborative filtering.
arXiv Detail & Related papers (2021-06-21T07:35:39Z)
- User-oriented Fairness in Recommendation [21.651482297198687]
We address the unfairness problem in recommender systems from the user perspective.
We group users into advantaged and disadvantaged groups according to their level of activity.
Our approach can not only improve group fairness of users in recommender systems, but also achieve better overall recommendation performance.
arXiv Detail & Related papers (2021-04-21T17:50:31Z)
- Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback [62.997667081978825]
We present a novel approach for considering user feedback and evaluate it using three distinct strategies.
Despite limited feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-16T07:32:51Z)
- Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
- Reward Constrained Interactive Recommendation with Natural Language Feedback [158.8095688415973]
We propose a novel constraint-augmented reinforcement learning (RL) framework to efficiently incorporate user preferences over time.
Specifically, we leverage a discriminator to detect recommendations violating user historical preference.
Our proposed framework is general and is further extended to the task of constrained text generation.
arXiv Detail & Related papers (2020-05-04T16:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.