Observational Learning with a Budget
- URL: http://arxiv.org/abs/2504.19396v1
- Date: Mon, 28 Apr 2025 00:12:30 GMT
- Title: Observational Learning with a Budget
- Authors: Shuo Wu, Pawan Poojary, Randall Berry
- Abstract summary: We consider a model of observational learning in which a sequence of agents receives a private signal about an underlying binary state of the world. A central planner seeks to improve the accuracy of these signals by allocating a limited budget to enhance signal quality across agents.
- Score: 0.3499870393443268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a model of Bayesian observational learning in which a sequence of agents receives a private signal about an underlying binary state of the world. Each agent makes a decision based on its own signal and its observations of previous agents. A central planner seeks to improve the accuracy of these signals by allocating a limited budget to enhance signal quality across agents. We formulate and analyze the budget allocation problem and propose two optimal allocation strategies. At least one of these strategies is shown to maximize the probability of achieving a correct information cascade.
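The dynamics described in the abstract can be illustrated with a short simulation. The sketch below (plain Python, not from the paper) assumes binary signals whose qualities are common knowledge, a uniform prior, and ties broken in favor of the private signal; the budget model (a baseline quality of 0.55 that each budget unit upgrades to 0.75) and the two allocation heuristics compared here are hypothetical stand-ins, not the paper's optimal strategies.

```python
import numpy as np

def simulate_public_belief(qualities, theta, rng):
    """One run of sequential Bayesian observational learning.

    Agent i privately sees a binary signal that matches theta with
    probability qualities[i] > 0.5, observes all previous actions, and
    picks the state with the higher posterior (ties go to the own signal).
    Returns the final public log-odds in favor of state 1.
    """
    L = 0.0  # public log-odds for theta = 1 under a uniform prior
    for q in qualities:
        llr = np.log(q / (1.0 - q))
        signal = theta if rng.random() < q else 1 - theta
        if abs(L) > llr:
            # Cascade: the private signal cannot flip this agent's action,
            # so the action is uninformative and the public belief freezes.
            continue
        # Otherwise the agent's action reveals its signal.
        L += llr if signal == 1 else -llr
    return L

def correct_belief_rate(qualities, n_runs=20000, seed=0):
    """Monte-Carlo estimate of P(final public belief favors the true state)."""
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(n_runs):
        theta = int(rng.integers(2))
        L = simulate_public_belief(qualities, theta, rng)
        correct += (L > 0) == (theta == 1)
    return correct / n_runs

# Hypothetical budget model: baseline quality 0.55; each of B budget units
# upgrades one chosen agent's signal quality to 0.75.
n_agents, base_q, boosted_q, B = 20, 0.55, 0.75, 5
front_loaded = [boosted_q] * B + [base_q] * (n_agents - B)
spread_out = [boosted_q if i % (n_agents // B) == 0 else base_q
              for i in range(n_agents)]
print("front-loaded:", correct_belief_rate(front_loaded))
print("spread out:  ", correct_belief_rate(spread_out))
```

Comparing allocations this way only gives Monte-Carlo intuition; the paper itself characterizes the optimal allocation strategies analytically.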
Related papers
- Bayesian Persuasion with Externalities: Exploiting Agent Types [21.508431216175143]
We study a Bayesian persuasion problem with externalities. In this model, a principal sends signals to inform multiple agents about the state of the world. We study the problem of computing optimal signaling strategies for the principal.
arXiv Detail & Related papers (2024-12-17T12:41:17Z) - Distributionally Robust Inverse Reinforcement Learning for Identifying Multi-Agent Coordinated Sensing [13.440621354486906]
We derive a minimax distributionally robust inverse reinforcement learning (IRL) algorithm to reconstruct the utility functions of a multi-agent sensing system.
We prove the equivalence between this robust estimation and a semi-infinite optimization reformulation, and we propose a consistent algorithm to compute solutions.
arXiv Detail & Related papers (2024-09-22T17:44:32Z) - Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques [65.55451717632317]
We study Preference-Based Multi-Agent Reinforcement Learning (PbMARL).
We identify the Nash equilibrium from a preference-only offline dataset in general-sum games.
Our findings underscore the multifaceted approach required for PbMARL.
arXiv Detail & Related papers (2024-09-01T13:14:41Z) - Persuasion, Delegation, and Private Information in Algorithm-Assisted Decisions [0.0]
A principal designs an algorithm that generates a publicly observable prediction of a binary state.
She must decide whether to act directly based on the prediction or to delegate the decision to an agent with private information but potential misalignment.
We study the optimal design of the prediction algorithm and the delegation rule in such environments.
arXiv Detail & Related papers (2024-02-14T18:32:30Z) - Bandit Pareto Set Identification: the Fixed Budget Setting [10.967572582187014]
We study a pure exploration problem in a multi-armed bandit model.
The goal is to identify the distributions whose mean is not uniformly worse than that of another distribution.
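As a point of reference for this fixed-budget objective, here is a naive baseline sketch, not the paper's algorithm: spend the budget uniformly across arms, then report the arms whose empirical mean vectors are not dominated. The two-objective Gaussian instance and the example means are illustrative assumptions.

```python
import numpy as np

def empirical_pareto_set(true_means, budget, seed=0):
    """Naive fixed-budget baseline: pull every arm equally often, then keep
    the arms whose empirical mean vector is not dominated by another arm's
    (dominated = some arm is >= in every coordinate and > in at least one)."""
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means)   # shape (K, d), unknown to the learner
    K, d = true_means.shape
    pulls = budget // K
    est = np.array([rng.normal(mu, 1.0, size=(pulls, d)).mean(axis=0)
                    for mu in true_means])
    pareto = []
    for i in range(K):
        dominated = any(np.all(est[j] >= est[i]) and np.any(est[j] > est[i])
                        for j in range(K) if j != i)
        if not dominated:
            pareto.append(i)
    return pareto

# Two-objective toy instance: arm 3 is dominated by every other arm.
means = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5], [0.2, 0.2]]
print(empirical_pareto_set(means, budget=4000))   # typically [0, 1, 2]
```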
arXiv Detail & Related papers (2023-11-07T13:43:18Z) - Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential-decision-making framework, where, at each interaction step, the learner selects an arm and observes a reward.
We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to a specific, possibly unknown, policy.
We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner.
arXiv Detail & Related papers (2023-08-29T18:18:21Z) - Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
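The summary names UCB as the building block being tailored; the sketch below is plain UCB1 on a toy bandit, included only to illustrate that primitive, not the paper's incentive-aware algorithm or its correction scheme.

```python
import numpy as np

def ucb1(pull, n_arms, horizon):
    """Plain UCB1: pull each arm once, then repeatedly pick the arm with the
    largest empirical mean + sqrt(2 ln t / n_pulls). `pull(a)` returns a
    reward in [0, 1]."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for a in range(n_arms):
        sums[a] += pull(a)
        counts[a] += 1
    for t in range(n_arms, horizon):
        bonus = np.sqrt(2.0 * np.log(t + 1) / counts)
        a = int(np.argmax(sums / counts + bonus))
        sums[a] += pull(a)
        counts[a] += 1
    return counts, sums / counts

rng = np.random.default_rng(1)
means = [0.3, 0.5, 0.7]
counts, estimates = ucb1(lambda a: rng.binomial(1, means[a]), n_arms=3, horizon=5000)
print(counts)     # the best arm (index 2) should receive most of the pulls
print(estimates)
```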
arXiv Detail & Related papers (2023-03-15T13:40:16Z) - Quantization for decentralized learning under subspace constraints [61.59416703323886]
We consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints.
We propose and study an adaptive decentralized strategy where the agents employ differential randomized quantizers to compress their estimates.
The analysis shows that, under some general conditions on the quantization noise, the strategy is stable both in terms of mean-square error and average bit rate.
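A minimal sketch of the differential randomized quantization idea, under assumed details: an unbiased probabilistic uniform quantizer applied to the innovation between the current estimate and the last reconstructed value. The step size and toy tracking loop are illustrative and not the exact scheme analyzed in the paper.

```python
import numpy as np

def randomized_quantize(x, step, rng):
    """Unbiased probabilistic uniform quantizer: round each entry of x to one
    of the two nearest grid points with probabilities chosen so that
    E[Q(x)] = x (zero-mean quantization noise)."""
    low = np.floor(x / step) * step
    p_up = (x - low) / step
    return low + step * (rng.random(x.shape) < p_up)

class DifferentialEncoder:
    """Differential coding: quantize only the innovation between the current
    estimate and the previously reconstructed value; sender and receiver both
    apply the same update, so they stay synchronized."""
    def __init__(self, dim, step=0.05, seed=0):
        self.recon = np.zeros(dim)
        self.step = step
        self.rng = np.random.default_rng(seed)

    def transmit(self, estimate):
        innovation = estimate - self.recon
        self.recon = self.recon + randomized_quantize(innovation, self.step, self.rng)
        return self.recon            # what the neighbor now holds

# Toy check: track a slowly drifting local estimate with coarse messages.
rng = np.random.default_rng(1)
enc = DifferentialEncoder(dim=3)
w = np.zeros(3)
for _ in range(200):
    w = w + 0.01 * rng.normal(size=3)
    received = enc.transmit(w)
print("tracking error:", np.linalg.norm(w - received))
```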
arXiv Detail & Related papers (2022-09-16T09:38:38Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
An $\alpha$-fraction of the agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
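For intuition about tolerating an $\alpha$-fraction of arbitrarily corrupted reports, here is a coordinate-wise trimmed-mean aggregator, a standard Byzantine-robust primitive; it is a generic illustration and is not claimed to be the paper's method.

```python
import numpy as np

def trimmed_mean(reports, alpha):
    """Coordinate-wise trimmed mean: in every coordinate, drop the largest and
    smallest ceil(alpha * m) of the m reports before averaging. Robust to an
    alpha fraction of arbitrarily corrupted reporters."""
    reports = np.asarray(reports)            # shape (m, d)
    m = reports.shape[0]
    k = int(np.ceil(alpha * m))
    ordered = np.sort(reports, axis=0)       # sorts each coordinate separately
    return ordered[k:m - k].mean(axis=0) if k > 0 else ordered.mean(axis=0)

# 8 honest agents report noisy copies of the true value; 2 adversaries lie.
rng = np.random.default_rng(0)
true_value = np.array([1.0, -2.0, 0.5])
honest = true_value + 0.1 * rng.normal(size=(8, 3))
adversarial = np.full((2, 3), 100.0)
reports = np.vstack([honest, adversarial])
print("plain mean:  ", reports.mean(axis=0))     # dragged far off by the liars
print("trimmed mean:", trimmed_mean(reports, alpha=0.2))
```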
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Informational Design of Dynamic Multi-Agent System [32.37168850559519]
We study how crafting payoff-relevant environmental signals alone can influence the behaviors of intelligent agents.
An obedient principle is established, which states that it is without loss of generality to focus on direct information design.
A framework is proposed based on an approach we refer to as fixed-point alignment, which incentivizes the agents to choose the signal sent by the principal.
arXiv Detail & Related papers (2021-05-07T03:46:14Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)