Incentivizing Exploration in Linear Bandits under Information Gap
- URL: http://arxiv.org/abs/2104.03860v1
- Date: Thu, 8 Apr 2021 16:01:56 GMT
- Title: Incentivizing Exploration in Linear Bandits under Information Gap
- Authors: Huazheng Wang, Haifeng Xu, Chuanhao Li, Zhiyuan Liu, Hongning Wang
- Abstract summary: We study the problem of incentivizing exploration for myopic users in linear bandits.
In order to maximize the long-term reward, the system offers compensation to incentivize the users to pull the exploratory arms.
- Score: 50.220743323750035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of incentivizing exploration for myopic users in linear
bandits, where the users tend to exploit the arm with the highest predicted reward
instead of exploring. In order to maximize the long-term reward, the system
offers compensation to incentivize the users to pull the exploratory arms, with
the goal of balancing the trade-off among exploitation, exploration and
compensation. We consider a new and practically motivated setting where the
context features observed by the user are more informative than those used by
the system, e.g., features based on users' private information are not
accessible by the system. We propose a new method to incentivize exploration
under such an information gap, and prove that the method achieves both sublinear
regret and sublinear compensation. We theoretically and empirically analyze the
added compensation due to the information gap, compared with the case that the
system has access to the same context features as the user, i.e., without
information gap. We also provide a compensation lower bound of our problem.
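The compensation idea in the abstract can be illustrated with a minimal sketch. It covers only the simpler baseline without an information gap, where the system and the user share the same features and the same reward estimate, and it is not the paper's method. The function names `ridge_estimate` and `incentivized_choice`, the UCB-style exploration bonus, and the parameters `lam` and `alpha` are all assumptions made for illustration. The compensation paid is taken to be the estimated reward the myopic user gives up by pulling the exploratory arm instead of the greedy one.

```python
import numpy as np

def ridge_estimate(X, y, lam=1.0):
    """Ridge-regression estimate of the linear reward parameter.

    X: (n, d) array of past arm features, y: (n,) array of observed rewards.
    Returns the estimate theta and the regularized Gram matrix A.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    theta = np.linalg.solve(A, X.T @ y)
    return theta, A

def incentivized_choice(arms, theta, A, alpha=1.0):
    """Pick the user's greedy arm, the system's exploratory arm, and the payment.

    arms: (k, d) array of candidate arm features.
    The exploratory arm maximizes estimate + alpha * confidence width
    (a UCB-style rule, assumed here for illustration).
    """
    est = arms @ theta
    A_inv = np.linalg.inv(A)
    # Confidence width sqrt(x^T A^{-1} x) for every arm x.
    width = np.sqrt(np.einsum("ij,jk,ik->i", arms, A_inv, arms))
    greedy = int(np.argmax(est))                    # what a myopic user would pull
    explore = int(np.argmax(est + alpha * width))   # what the system wants pulled
    # Pay the user the estimated reward lost by deviating from the greedy arm.
    compensation = float(est[greedy] - est[explore])
    return greedy, explore, compensation

# Hypothetical data: only the first feature direction has been observed so far.
X = np.array([[1.0, 0.0]] * 5)
y = np.ones(5)
theta, A = ridge_estimate(X, y)
arms = np.array([[1.0, 0.0], [0.0, 1.0]])
greedy, explore, pay = incentivized_choice(arms, theta, A, alpha=2.0)
```

In this sketch the payment is zero whenever the exploratory arm coincides with the greedy arm, so compensation is only spent on rounds where the system actually asks the user to deviate.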
Related papers
- Exploiting Correlated Auxiliary Feedback in Parameterized Bandits [56.84649080789685]
We study a novel variant of the parameterized bandits problem in which the learner can observe additional auxiliary feedback that is correlated with the observed reward.
The auxiliary feedback is readily available in many real-life applications, e.g., an online platform that wants to recommend the best-rated services to its users can observe the user's rating of a service (rewards) and collect additional information such as service delivery time (auxiliary feedback).
arXiv Detail & Related papers (2023-11-05T17:27:06Z)
- Explainable Active Learning for Preference Elicitation [0.0]
We employ Active Learning (AL) to solve the addressed problem with the objective of maximizing information acquisition with minimal user effort.
AL operates for selecting informative data from a large unlabeled set to inquire an oracle to label them.
It harvests user feedback (given for the system's explanations on the presented items) over informative samples to update an underlying machine learning (ML) model.
arXiv Detail & Related papers (2023-09-01T09:22:33Z)
- Consumer-side Fairness in Recommender Systems: A Systematic Survey of Methods and Evaluation [1.4123323039043334]
Growing awareness of discrimination in machine learning methods motivated both academia and industry to research how fairness can be ensured in recommender systems.
For recommender systems, such issues are well exemplified by occupation recommendation, where biases in historical data may lead to recommender systems relating one gender to lower wages or to the propagation of stereotypes.
This survey serves as a systematic overview and discussion of the current research on consumer-side fairness in recommender systems.
arXiv Detail & Related papers (2023-05-16T10:07:41Z)
- PIE: Personalized Interest Exploration for Large-Scale Recommender Systems [0.0]
We present a framework for exploration in large-scale recommender systems to address these challenges.
Our methodology can be easily integrated into an existing large-scale recommender system with minimal modifications.
Our work has been deployed in production on Facebook Watch, a popular video discovery and sharing platform serving billions of users.
arXiv Detail & Related papers (2023-04-13T22:25:09Z)
- Incentive-Aware Recommender Systems in Two-Sided Markets [49.692453629365204]
We propose a novel recommender system that aligns with agents' incentives while achieving myopically optimal performance.
Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets.
Both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation.
arXiv Detail & Related papers (2022-11-23T22:20:12Z)
- FedGRec: Federated Graph Recommender System with Lazy Update of Latent Embeddings [108.77460689459247]
We propose a Federated Graph Recommender System (FedGRec) to mitigate privacy concerns.
In our system, users and the server explicitly store latent embeddings for users and items, where the latent embeddings summarize different orders of indirect user-item interactions.
We perform extensive empirical evaluations to verify the efficacy of using latent embeddings as a proxy of missing interaction graph.
arXiv Detail & Related papers (2022-10-25T01:08:20Z)
- Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
- Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation [49.69139684065241]
Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems.
In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.
arXiv Detail & Related papers (2020-04-02T17:04:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.