Socially-Optimal Mechanism Design for Incentivized Online Learning
- URL: http://arxiv.org/abs/2112.14338v1
- Date: Wed, 29 Dec 2021 00:21:40 GMT
- Title: Socially-Optimal Mechanism Design for Incentivized Online Learning
- Authors: Zhiyuan Wang and Lin Gao and Jianwei Huang
- Abstract summary: Multi-arm bandit (MAB) is a classic online learning framework that studies sequential decision-making in an uncertain environment.
The scenario where the decision-maker cannot take actions directly is practically important in many applications such as spectrum sharing, crowdsensing, and edge computing.
This paper establishes the incentivized online learning (IOL) framework for this scenario.
- Score: 32.55657244414989
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-arm bandit (MAB) is a classic online learning framework that studies
sequential decision-making in an uncertain environment. The MAB framework,
however, overlooks the scenario where the decision-maker cannot take actions
(e.g., pulling arms) directly. It is a practically important scenario in many
applications such as spectrum sharing, crowdsensing, and edge computing. In
these applications, the decision-maker would incentivize other selfish agents
to carry out desired actions (i.e., pulling arms on the decision-maker's
behalf). This paper establishes the incentivized online learning (IOL)
framework for this scenario. The key challenge in designing the IOL framework lies
in the tight coupling of the unknown environment learning and asymmetric
information revelation. To address this, we construct a special Lagrangian
function based on which we propose a socially-optimal mechanism for the IOL
framework. Our mechanism satisfies various desirable properties such as agent
fairness, incentive compatibility, and voluntary participation. It achieves the
same asymptotic performance as the state-of-the-art benchmark that requires extra
information. Our analysis also unveils the power of the crowd in the IOL framework:
a larger agent crowd enables our mechanism to approach more closely the
theoretical upper bound of social performance. Numerical results demonstrate
the advantages of our mechanism in large-scale edge computing.
 
      
Related papers
- Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition [86.21199607040147]
 Self-Improving cognition (SIcog) is a self-learning framework for constructing next-generation foundation language models.
We introduce Chain-of-Description, a step-by-step visual understanding method, and integrate structured chain-of-thought (CoT) reasoning to support in-depth multimodal reasoning.
Extensive experiments demonstrate that SIcog produces next-generation foundation MLLMs with substantially improved multimodal cognition.
 arXiv  Detail & Related papers  (2025-03-16T00:25:13Z)
- Large Language Models for Multi-Facility Location Mechanism Design [16.88708405619343]
 Deep learning models have been proposed as alternatives to strategyproof mechanisms for multi-facility location.
We introduce a novel approach, named LLMMech, that addresses these limitations by incorporating large language models into an evolutionary framework.
Our experimental results, evaluated on various problem settings, demonstrate that the LLM-generated mechanisms generally outperform existing handcrafted baselines and deep learning models.
 arXiv  Detail & Related papers  (2025-03-12T16:49:56Z)
- Vintix: Action Model via In-Context Reinforcement Learning [72.65703565352769]
 We present the first steps toward scaling ICRL by introducing a fixed, cross-domain model capable of learning behaviors through in-context reinforcement learning.
Our results demonstrate that Algorithm Distillation, a framework designed to facilitate ICRL, offers a compelling and competitive alternative to expert distillation to construct versatile action models.
 arXiv  Detail & Related papers  (2025-01-31T18:57:08Z)
- Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts [20.8288955218712]
 We propose a framework where a principal guides an agent in a Markov Decision Process (MDP) using a series of contracts.
We present and analyze a meta-algorithm that iteratively optimizes the policies of the principal and agent.
We then scale our algorithm with deep Q-learning and analyze its convergence in the presence of approximation error.
 arXiv  Detail & Related papers  (2024-07-25T14:28:58Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
 We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
 arXiv  Detail & Related papers  (2024-05-23T11:10:32Z)
- Refined Mechanism Design for Approximately Structured Priors via Active Regression [50.71772232237571]
 We consider the problem of a revenue-maximizing seller with a large number of items for sale to $n$ strategic bidders.
It is well-known that optimal and even approximately-optimal mechanisms for this setting are notoriously difficult to characterize or compute.
 arXiv  Detail & Related papers  (2023-10-11T20:34:17Z)
- A Novel Multiagent Flexibility Aggregation Framework [1.7132914341329848]
 We propose a novel DER aggregation framework, encompassing a multiagent architecture and various types of mechanisms for the effective management and efficient integration of DERs in the Grid.
One critical component of our architecture is the Local Flexibility Estimator (LFE) agents, which are key for offloading serious or resource-intensive responsibilities from the Aggregator.
 arXiv  Detail & Related papers  (2023-07-17T11:36:15Z)
- ComplAI: Theory of A Unified Framework for Multi-factor Assessment of Black-Box Supervised Machine Learning Models [6.279863832853343]
 ComplAI is a unique framework to enable, observe, analyze and quantify explainability, robustness, performance, fairness, and model behavior.
It evaluates different supervised Machine Learning models not just on their ability to make correct predictions but from an overall responsibility perspective.
 arXiv  Detail & Related papers  (2022-12-30T08:48:19Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
 We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
 arXiv  Detail & Related papers  (2022-05-05T05:44:26Z)
- Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach [130.9259586568977]
 We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Groves (VCG) mechanism over multiple rounds of interaction.
A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
 arXiv  Detail & Related papers  (2022-02-25T16:17:23Z)
- AutonoML: Towards an Integrated Framework for Autonomous Machine Learning [9.356870107137095]
 This review seeks to motivate a more expansive perspective on what constitutes an automated/autonomous ML system.
In doing so, we survey developments in the following research areas.
We develop a conceptual framework throughout the review, augmented by each topic, to illustrate one possible way of fusing high-level mechanisms into an autonomous ML system.
 arXiv  Detail & Related papers  (2020-12-23T11:01:10Z)
- Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [80.49176924360499]
 We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
 arXiv  Detail & Related papers  (2020-07-05T16:41:09Z)
- Incentive Mechanism Design for Resource Sharing in Collaborative Edge Learning [106.51930957941433]
 In 5G and Beyond networks, Artificial Intelligence applications are expected to be increasingly ubiquitous.
This necessitates a paradigm shift from the current cloud-centric model training approach to the Edge Computing based collaborative learning scheme known as edge learning.
 arXiv  Detail & Related papers  (2020-05-31T12:45:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     