Related papers: Selectively Contextual Bandits

URL: http://arxiv.org/abs/2205.04528v1
Date: Mon, 9 May 2022 19:47:46 GMT
Title: Selectively Contextual Bandits
Authors: Claudia Roberts and Maria Dimakopoulou and Qifeng Qiao and Ashok Chandrashekhar and Tony Jebara
Abstract summary: We propose a new online learning algorithm that preserves benefits of personalization while increasing the commonality in treatments across users. Our approach selectively interpolates between a contextual bandit algorithm and a context-free multi-arm bandit. We evaluate our approach in a classification setting using public datasets and show the benefits of the hybrid policy.
Score: 11.438194383787604
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Contextual bandits are widely used in industrial personalization systems. These online learning frameworks learn a treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the users. While personalization creates a rich user experience that reflects individual interests, there are benefits of a shared experience across a community that enable participation in the zeitgeist. Such benefits are emergent through network effects and are not captured in regret metrics typically employed in evaluating bandits. To balance these needs, we propose a new online learning algorithm that preserves benefits of personalization while increasing the commonality in treatments across users. Our approach selectively interpolates between a contextual bandit algorithm and a context-free multi-arm bandit and leverages the contextual information for a treatment decision only if it promises significant gains. Apart from helping users of personalization systems balance their experience between the individualized and shared, simplifying the treatment assignment policy by making it selectively reliant on the context can help improve the rate of learning in some cases. We evaluate our approach in a classification setting using public datasets and show the benefits of the hybrid policy.
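
Illustrative sketch (not from the paper): the abstract describes using the context for a treatment decision only if it promises significant gains. One simple way to picture that idea is a threshold rule over two sets of reward estimates. The function name `select_arm`, the parameter `min_gain`, and the threshold rule itself are assumptions made for illustration; the authors' actual algorithm may decide differently.

```python
from typing import Sequence, Tuple


def select_arm(contextual_means: Sequence[float],
               global_means: Sequence[float],
               min_gain: float) -> Tuple[int, bool]:
    """Return (arm, used_context) under a hypothetical threshold rule.

    contextual_means: per-arm reward estimates for this user's context.
    global_means: per-arm reward estimates that ignore context.
    min_gain: assumed minimum estimated gain that justifies personalizing.
    """
    arms = range(len(global_means))
    # Arm a context-free bandit would show to every user.
    shared_arm = max(arms, key=lambda a: global_means[a])
    # Arm a contextual bandit would show to this particular user.
    personal_arm = max(arms, key=lambda a: contextual_means[a])
    # Estimated benefit of personalizing, judged by the contextual estimates.
    gain = contextual_means[personal_arm] - contextual_means[shared_arm]
    if gain > min_gain:
        return personal_arm, True
    return shared_arm, False


# Small estimated gain: fall back to the shared treatment.
print(select_arm([0.50, 0.52, 0.10], [0.40, 0.30, 0.20], min_gain=0.05))
# Large estimated gain: use the personalized treatment.
print(select_arm([0.20, 0.70, 0.10], [0.40, 0.30, 0.20], min_gain=0.05))
```

In this sketch a larger `min_gain` pushes more users toward the shared arm, while `min_gain = 0` gives the fully contextual choice; exploration and the updating of the estimates are deliberately left out.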

Related papers

Interactive Visualization Recommendation with Hier-SUCB [52.11209329270573]
We propose an interactive personalized visualization recommendation (PVisRec) system that learns on user feedback from previous interactions. For more interactive and accurate recommendations, we propose Hier-SUCB, a contextual semi-bandit in the PVisRec setting.
arXiv Detail & Related papers (2025-02-05T17:14:45Z)
Online Clustering of Dueling Bandits [59.09590979404303]
We introduce the first "clustering of dueling bandit algorithms" to enable collaborative decision-making based on preference feedback. We propose two novel algorithms: (1) Clustering of Linear Dueling Bandits (COLDB) which models the user reward functions as linear functions of the context vectors, and (2) Clustering of Neural Dueling Bandits (CONDB) which uses a neural network to model complex, non-linear user reward functions.
arXiv Detail & Related papers (2025-02-04T07:55:41Z)
Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems. We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z)
Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback. The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied. We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
Neural Contextual Bandits for Personalized Recommendation [49.85090929163639]
This tutorial investigates contextual bandits as a powerful framework for personalized recommendations. We focus on the exploration perspective of contextual bandits to alleviate the "Matthew Effect" in recommender systems. In addition to the conventional linear contextual bandits, we will also dedicate attention to neural contextual bandits.
arXiv Detail & Related papers (2023-12-21T17:03:26Z)
Leveraging heterogeneous spillover in maximizing contextual bandit rewards [10.609670658904562]
We present a framework that allows contextual multi-armed bandits to account for heterogeneous spillovers. Our framework leads to significantly higher rewards than existing state-of-the-art solutions.
arXiv Detail & Related papers (2023-10-16T10:34:41Z)
Kernelized Offline Contextual Dueling Bandits [15.646879026749168]
In this work, we take advantage of the fact that often the agent can choose contexts at which to obtain human feedback. We give an upper-confidence-bound style algorithm for this setting and prove a regret bound.
arXiv Detail & Related papers (2023-07-21T01:17:31Z)
Multi-View Interactive Collaborative Filtering [0.0]
We propose a novel partially online latent factor recommender algorithm that incorporates both rating and contextual information. MV-ICTR significantly increases performance on datasets with high percentages of cold-start users and items.
arXiv Detail & Related papers (2023-05-14T20:31:37Z)
Joint Multisided Exposure Fairness for Recommendation [76.75990595228666]
This paper formalizes a family of exposure fairness metrics that model the problem jointly from the perspective of both the consumers and producers. Specifically, we consider group attributes for both types of stakeholders to identify and mitigate fairness concerns that go beyond individual users and items towards more systemic biases in recommendation.
arXiv Detail & Related papers (2022-04-29T19:13:23Z)
Top-K Ranking Deep Contextual Bandits for Information Selection Systems [0.0]
We propose a novel approach to top-K rankings under the contextual multi-armed bandit framework. We model the reward function with a neural network to allow non-linear approximation to learn the relationship between rewards and contexts.
arXiv Detail & Related papers (2022-01-28T15:10:44Z)
Local Clustering in Contextual Multi-Armed Bandits [44.11480686973274]
We study identifying user clusters in contextual multi-armed bandits (MAB). We propose a bandit algorithm, LOCB, embedded with a local clustering procedure. We evaluate the proposed algorithm from various aspects and find that it outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2021-02-26T21:59:29Z)
Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback [62.997667081978825]
We present a novel approach for considering user feedback and evaluate it using three distinct strategies. Despite the limited amount of feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-16T07:32:51Z)
Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups. We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users. We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
A Robust Reputation-based Group Ranking System and its Resistance to Bribery [8.300507994596416]
We propose a new reputation-based ranking system, utilizing multipartite rating subnetworks. We study its resistance to bribery and how to design optimal bribing strategies.
arXiv Detail & Related papers (2020-04-13T22:28:29Z)