Regret Analysis of Repeated Delegated Choice
- URL: http://arxiv.org/abs/2310.04884v3
- Date: Wed, 14 Feb 2024 00:07:57 GMT
- Title: Regret Analysis of Repeated Delegated Choice
- Authors: MohammadTaghi Hajiaghayi, Mohammad Mahdavi, Keivan Rezaei, Suho Shin
- Abstract summary: We present a study on a repeated delegated choice problem, which is the first to consider an online learning variant of Kleinberg and Kleinberg, EC'18.
We explore two dimensions of the problem setup, whether the agent behaves myopically or strategizes across the rounds, and whether the solutions yield deterministic or stochastic utility.
- Score: 8.384985977301174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a study on a repeated delegated choice problem, which is the first
to consider an online learning variant of Kleinberg and Kleinberg, EC'18. In
this model, a principal interacts repeatedly with an agent who possesses an
exogenous set of solutions to search for efficient ones. Each solution can
yield varying utility for both the principal and the agent, and the agent may
propose a solution to maximize its own utility in a selfish manner. To mitigate
this behavior, the principal announces an eligible set which screens out a
certain set of solutions. The principal, however, does not have any information
on the distribution of solutions in advance. Therefore, the principal
dynamically announces various eligible sets to efficiently learn the
distribution. The principal's objective is to minimize cumulative regret
compared to the optimal eligible set in hindsight. We explore two dimensions of
the problem setup, whether the agent behaves myopically or strategizes across
the rounds, and whether the solutions yield deterministic or stochastic
utility. Our analysis characterizes the regimes under which the principal can
achieve sublinear regret, thereby shedding light on the rise and fall of the
repeated delegation procedure across these settings.
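As a rough illustration of the model above, the sketch below simulates one round of delegation under a threshold-style eligible set and a hindsight search over thresholds. The single-threshold form, the function names, and the uniform utility draws are assumptions for illustration, not the paper's exact formulation.

```python
import random

def delegate_round(solutions, threshold):
    """Agent proposes its favorite solution among those the principal
    deems eligible (principal utility >= threshold); the principal
    receives that solution's utility, or 0.0 if nothing is eligible."""
    eligible = [(p, a) for (p, a) in solutions if p >= threshold]
    if not eligible:
        return 0.0
    # a myopic agent maximizes its own utility over the eligible set
    p_util, _ = max(eligible, key=lambda s: s[1])
    return p_util

random.seed(0)
# each solution: (principal utility, agent utility), drawn i.i.d.
pool = [(random.random(), random.random()) for _ in range(20)]

# the best eligible set in hindsight, over a grid of thresholds
best = max(delegate_round(pool, t / 10) for t in range(11))
loose = delegate_round(pool, 0.0)   # fully permissive eligible set
```

Since the permissive set is one of the candidates, `best >= loose` always holds; the gap between the two is the benefit of screening, and the online problem is to approach `best` without knowing the distribution.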
Related papers
- Active Learning for Fair and Stable Online Allocations [6.23798328186465]
We consider feedback from a select subset of agents at each epoch of the online resource allocation process.
Our algorithms provide regret bounds that are sub-linear in the number of time periods for various measures.
We show that efficient decision-making does not require extensive feedback and produces efficient outcomes for a variety of problem classes.
arXiv Detail & Related papers (2024-06-20T23:23:23Z) - Incentivized Learning in Principal-Agent Bandit Games [62.41639598376539]
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.
The principal can influence the agent's decisions by offering incentives that are added to the agent's rewards.
We present nearly optimal learning algorithms for the principal's regret in both multi-armed and linear contextual settings.
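The incentive mechanism this entry describes can be sketched in a toy two-arm setting: the principal adds a bonus to the agent's rewards to steer its choice, then pays that bonus out of her own reward. All names and numbers here are illustrative assumptions, not the paper's construction.

```python
def agent_choice(agent_rewards, incentives):
    """A greedy agent picks the arm maximizing its reward plus the bonus."""
    totals = [r + b for r, b in zip(agent_rewards, incentives)]
    return max(range(len(totals)), key=totals.__getitem__)

principal_rewards = [1.0, 0.2]
agent_rewards     = [0.5, 0.6]

# without incentives the agent picks arm 1, which is bad for the principal
baseline = agent_choice(agent_rewards, [0.0, 0.0])

# a bonus just above the agent's utility gap flips the choice to arm 0
bonus = [0.11, 0.0]
arm = agent_choice(agent_rewards, bonus)
net = principal_rewards[arm] - bonus[arm]   # principal pays the bonus
```

The learning difficulty in the paper's setting comes from the principal not knowing `agent_rewards`, so the minimal steering bonus must itself be learned from the agent's observed choices.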
arXiv Detail & Related papers (2024-03-06T16:00:46Z) - Principal-Agent Reward Shaping in MDPs [50.914110302917756]
Principal-agent problems arise when one party acts on behalf of another, leading to conflicts of interest.
We study a two-player Stackelberg game where the principal and the agent have different reward functions, and the agent chooses an MDP policy for both players.
Our results cover trees and deterministic decision processes with a finite horizon.
arXiv Detail & Related papers (2023-12-30T18:30:44Z) - Learning Fair Policies for Multi-stage Selection Problems from Observational Data [4.282745020665833]
We consider the problem of learning fair policies for multi-stage selection problems from observational data.
This problem arises in several high-stakes domains such as company hiring, loan approval, or bail decisions where outcomes are only observed for those selected.
We propose a multi-stage framework that can be augmented with various fairness constraints, such as demographic parity or equal opportunity.
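Demographic parity, one of the fairness constraints this entry mentions, is simple to state: selection rates should match across groups. A minimal check (variable names and data are illustrative assumptions) looks like this:

```python
def selection_rate(decisions, groups, g):
    """Fraction of members of group g who receive a positive decision."""
    sel = [d for d, gr in zip(decisions, groups) if gr == g]
    return sum(sel) / len(sel)

decisions = [1, 0, 1, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]

# demographic parity asks this gap to be (near) zero; here it is 0.5
gap = abs(selection_rate(decisions, groups, "A") -
          selection_rate(decisions, groups, "B"))
```

In the multi-stage setting the constraint can be imposed per stage or on the final selection, which is part of what makes learning such policies from selectively observed outcomes hard.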
arXiv Detail & Related papers (2023-12-20T16:33:15Z) - Causal Strategic Learning with Competitive Selection [10.237954203296187]
We study the problem of agent selection in causal strategic learning under multiple decision makers.
We show that the optimal selection rule is a trade-off between selecting the best agents and providing incentives to maximise the agents' improvement.
We provide a cooperative protocol which all decision makers must collectively adopt to recover the true causal parameters.
arXiv Detail & Related papers (2023-08-30T18:43:11Z) - Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
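For reference, here is the generic UCB1 template that such algorithms build on; the paper's tailored estimation procedure and conservative correction scheme are not reproduced here, so this is only the standard bandit skeleton.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Standard UCB1: play each arm once, then pick the arm with the
    highest empirical mean plus exploration bonus. Returns pull counts."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                       # initialization round
        else:
            arm = max(range(n_arms), key=lambda a:
                      sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts

random.seed(1)
means = [0.3, 0.7]
counts = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0, 2, 2000)
# the better arm should dominate the pull counts
```

The tailoring in the paper replaces the empirical-mean term with an estimate of the principal's profit and shrinks the payments conservatively so that the agent's desired actions remain incentivized.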
arXiv Detail & Related papers (2023-03-15T13:40:16Z) - Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization [8.836422771217084]
Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences.
We introduce a novel algorithm that uses Generalized Policy Improvement (GPI) to define principled, formally-derived prioritization schemes.
We empirically show that our method outperforms state-of-the-art MORL algorithms in challenging multi-objective tasks.
arXiv Detail & Related papers (2023-01-18T20:54:40Z) - A Mutual Information Maximization Approach for the Spurious Solution
Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals.
There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance.
We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z) - End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z) - Learning What to Defer for Maximum Independent Sets [84.00112106334655]
We propose a novel DRL scheme, coined learning what to defer (LwD), where the agent adaptively shrinks or stretches the number of stages by learning to distribute the element-wise decisions of the solution at each stage.
We apply the proposed framework to the maximum independent set (MIS) problem, and demonstrate its significant improvement over the current state-of-the-art DRL scheme.
arXiv Detail & Related papers (2020-06-17T02:19:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.