MESOB: Balancing Equilibria & Social Optimality
- URL: http://arxiv.org/abs/2307.07911v1
- Date: Sun, 16 Jul 2023 00:43:54 GMT
- Title: MESOB: Balancing Equilibria & Social Optimality
- Authors: Xin Guo, Lihong Li, Sareh Nabi, Rabih Salhab, Junzi Zhang
- Abstract summary: Motivated by bid recommendation in online ad auctions, this paper considers a class of multi-level and multi-agent games.
We propose a novel and tractable bi-objective optimization formulation with mean-field approximation.
MESOB-OMO yields approximately Pareto efficient solutions with respect to the dual objectives of competition and cooperation.
- Score: 12.702156510015628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by bid recommendation in online ad auctions, this paper considers a
general class of multi-level and multi-agent games, with two major
characteristics: one is a large number of anonymous agents, and the other is
the intricate interplay between competition and cooperation. To model such
complex systems, we propose a novel and tractable bi-objective optimization
formulation with mean-field approximation, called MESOB (Mean-field Equilibria
& Social Optimality Balancing), as well as an associated occupation measure
optimization (OMO) method called MESOB-OMO to solve it. MESOB-OMO enables
obtaining approximately Pareto efficient solutions in terms of the dual
objectives of competition and cooperation in MESOB, and in particular allows
for Nash equilibrium selection and social equalization in an asymptotic manner.
We apply MESOB-OMO to bid recommendation in a simulated pay-per-click ad
auction. Experiments demonstrate its efficacy in balancing the interests of
different parties and in handling the competitive nature of bidders, as well as
its advantages over baselines that only consider either the competitive or the
cooperative aspects.
Related papers
- Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games [40.05960121330012]
Multi-agent reinforcement learning (MARL) lies at the heart of a plethora of applications involving the interaction of a group of agents in a shared unknown environment.
We propose a novel model-based algorithm, called VMG, that incentivizes exploration via biasing the empirical estimate of the model parameters.
arXiv Detail & Related papers (2025-02-13T21:28:51Z)
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for the general-sum Markov Games (MGs) under the general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm achieves sublinear regret comparable to that of existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
- Interpolating Item and User Fairness in Multi-Sided Recommendations [13.635310806431198]
We introduce a novel fair recommendation framework, Problem (FAIR).
We propose a low-regret algorithm FORM that concurrently performs real-time learning and fair recommendations, two tasks that are often at odds.
We demonstrate the efficacy of our framework and method in maintaining platform revenue while ensuring desired levels of fairness for both items and users.
arXiv Detail & Related papers (2023-06-12T15:00:58Z)
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
- A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising [53.636153252400945]
We propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies.
Our approach outperforms several baseline methods in terms of social welfare and guarantees the ad platform's revenue.
arXiv Detail & Related papers (2021-06-11T08:07:14Z)
- Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search [26.117969395228503]
We propose a novel multi-objective cooperative bid optimization formulation called Multi-Agent Cooperative bidding Games (MACG).
A global objective to maximize the overall profit of all advertisements is added in order to encourage better cooperation and also to protect self-bidding advertisers.
Offline experiments and online A/B tests conducted on the Taobao platform indicate that both the single advertiser's objective and the global profit have been significantly improved.
arXiv Detail & Related papers (2021-06-08T03:18:28Z)
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts [52.844741540236285]
This paper investigates model-based methods in multi-agent reinforcement learning (MARL).
We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy (AORPO).
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
- Balancing Rational and Other-Regarding Preferences in Cooperative-Competitive Environments [4.705291741591329]
Mixed environments are notorious for the conflicts of selfish and social interests.
We propose BAROCCO to balance individual and social incentives.
Our meta-algorithm is compatible with both Q-learning and Actor-Critic frameworks.
arXiv Detail & Related papers (2021-02-24T14:35:32Z)