Selectively Contextual Bandits
- URL: http://arxiv.org/abs/2205.04528v1
- Date: Mon, 9 May 2022 19:47:46 GMT
- Title: Selectively Contextual Bandits
- Authors: Claudia Roberts, Maria Dimakopoulou, Qifeng Qiao, Ashok Chandrashekhar, and Tony Jebara
- Abstract summary: We propose a new online learning algorithm that preserves benefits of personalization while increasing the commonality in treatments across users.
Our approach selectively interpolates between a contextual bandit algorithm and a context-free multi-arm bandit.
We evaluate our approach in a classification setting using public datasets and show the benefits of the hybrid policy.
- Score: 11.438194383787604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextual bandits are widely used in industrial personalization systems.
These online learning frameworks learn a treatment assignment policy in the
presence of treatment effects that vary with the observed contextual features
of the users. While personalization creates a rich user experience that reflects
individual interests, there are benefits of a shared experience across a
community that enable participation in the zeitgeist. Such benefits are
emergent through network effects and are not captured in regret metrics
typically employed in evaluating bandits. To balance these needs, we propose a
new online learning algorithm that preserves benefits of personalization while
increasing the commonality in treatments across users. Our approach selectively
interpolates between a contextual bandit algorithm and a context-free multi-arm
bandit and leverages the contextual information for a treatment decision only
if it promises significant gains. Apart from helping users of personalization
systems balance their experience between the individualized and shared,
simplifying the treatment assignment policy by making it selectively reliant on
the context can help improve the rate of learning in some cases. We evaluate
our approach in a classification setting using public datasets and show the
benefits of the hybrid policy.
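
The selective use of context described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the gating rule (use the context only when the predicted gain over the shared arm exceeds a threshold), the per-arm ridge regression, the epsilon-greedy exploration, and every name below (`SelectiveBandit`, `gain_threshold`, `epsilon`, `ridge`) are assumptions made for illustration.

```python
import numpy as np


class SelectiveBandit:
    """Illustrative hybrid of a context-free and a contextual bandit.

    Assumption-laden sketch, not the paper's method: the gating rule,
    the ridge models, and all parameter names are hypothetical.
    """

    def __init__(self, n_arms, dim, gain_threshold=0.1, epsilon=0.1,
                 ridge=1.0, seed=0):
        self.n_arms = n_arms
        self.gain_threshold = gain_threshold
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        # Context-free statistics: pull count and mean reward per arm.
        self.counts = np.zeros(n_arms)
        self.means = np.zeros(n_arms)
        # Contextual statistics: one ridge regression per arm,
        # with A = ridge * I + sum(x x^T) and b = sum(reward * x).
        self.A = np.stack([ridge * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def predict(self, x):
        """Predicted reward of every arm for context x."""
        # Solve A[a] theta[a] = b[a] for all arms at once.
        theta = np.linalg.solve(self.A, self.b[..., None])[..., 0]
        return theta @ x

    def select(self, x):
        """Return (arm, used_context) for context x."""
        x = np.asarray(x, dtype=float)
        # Epsilon-greedy exploration (an assumed exploration scheme).
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_arms)), False
        # Shared choice: the arm with the best context-free mean.
        shared_arm = int(np.argmax(self.means))
        # Personalized choice: the arm the contextual model prefers.
        preds = self.predict(x)
        personal_arm = int(np.argmax(preds))
        # Personalize only if the predicted gain is large enough.
        gain = preds[personal_arm] - preds[shared_arm]
        if gain > self.gain_threshold:
            return personal_arm, True
        return shared_arm, False

    def update(self, x, arm, reward):
        """Update both the context-free and contextual statistics."""
        x = np.asarray(x, dtype=float)
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

In this sketch, raising `gain_threshold` pushes more users toward the shared arm, while setting it to zero recovers a purely contextual greedy choice; how the paper itself trades off the two is not specified in this abstract.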
Related papers
- Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems.
We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z) - Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z) - Neural Contextual Bandits for Personalized Recommendation [49.85090929163639]
This tutorial investigates the contextual bandits as a powerful framework for personalized recommendations.
We focus on the exploration perspective of contextual bandits to alleviate the "Matthew Effect" in recommender systems.
In addition to conventional linear contextual bandits, we also devote attention to neural contextual bandits.
arXiv Detail & Related papers (2023-12-21T17:03:26Z) - Kernelized Offline Contextual Dueling Bandits [15.646879026749168]
In this work, we take advantage of the fact that often the agent can choose contexts at which to obtain human feedback.
We give an upper-confidence-bound style algorithm for this setting and prove a regret bound.
arXiv Detail & Related papers (2023-07-21T01:17:31Z) - Multi-View Interactive Collaborative Filtering [0.0]
We propose a novel partially online latent factor recommender algorithm that incorporates both rating and contextual information.
MV-ICTR significantly increases performance on datasets with high percentages of cold-start users and items.
arXiv Detail & Related papers (2023-05-14T20:31:37Z) - Joint Multisided Exposure Fairness for Recommendation [76.75990595228666]
This paper formalizes a family of exposure fairness metrics that model the problem jointly from the perspective of both the consumers and producers.
Specifically, we consider group attributes for both types of stakeholders to identify and mitigate fairness concerns that go beyond individual users and items towards more systemic biases in recommendation.
arXiv Detail & Related papers (2022-04-29T19:13:23Z) - Top-K Ranking Deep Contextual Bandits for Information Selection Systems [0.0]
We propose a novel approach to top-K rankings under the contextual multi-armed bandit framework.
We model the reward function with a neural network to allow non-linear approximation to learn the relationship between rewards and contexts.
arXiv Detail & Related papers (2022-01-28T15:10:44Z) - Local Clustering in Contextual Multi-Armed Bandits [44.11480686973274]
We study the identification of user clusters in contextual multi-armed bandits (MAB).
We propose a bandit algorithm, LOCB, embedded with a local clustering procedure.
We evaluate the proposed algorithm from various aspects and find that it outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2021-02-26T21:59:29Z) - Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback [62.997667081978825]
We present a novel approach for considering user feedback and evaluate it using three distinct strategies.
Despite the limited amount of feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-16T07:32:51Z) - Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z) - A Robust Reputation-based Group Ranking System and its Resistance to Bribery [8.300507994596416]
We propose a new reputation-based ranking system, utilizing multipartite rating subnetworks.
We study its resistance to bribery and how to design optimal bribing strategies.
arXiv Detail & Related papers (2020-04-13T22:28:29Z)