Optimal coordination in Minority Game: A solution from reinforcement
learning
- URL: http://arxiv.org/abs/2312.14970v1
- Date: Wed, 20 Dec 2023 00:47:45 GMT
- Title: Optimal coordination in Minority Game: A solution from reinforcement
learning
- Authors: Guozhong Zheng, Weiran Cai, Guanxiao Qi, Jiqiang Zhang, and Li Chen
- Abstract summary: The Minority Game is perhaps the simplest model that provides insights into how humans coordinate to maximize resource utilization.
Here, we turn to the paradigm of reinforcement learning, where individuals' strategies evolve by evaluating both past experience and future rewards.
We reveal that the population is able to reach the optimal allocation when individuals weigh both past experience and future rewards.
- Score: 6.0413802011767705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient allocation is important in nature and human society, where
individuals often compete for finite resources. The Minority Game is perhaps
the simplest model that provides deep insights into how humans coordinate to
maximize resource utilization. However, this model assumes static strategies
that are provided a priori, failing to capture their adaptive nature. Here, we
turn to the paradigm of reinforcement learning, where individuals' strategies
evolve by evaluating both past experience and future rewards. Specifically, we
adopt the Q-learning algorithm: each player is endowed with a Q-table that
guides their decision-making. We reveal that the population is able to reach
the optimal allocation when individuals weigh both past experience and future
rewards, and when they balance the exploitation of their Q-tables with
exploration through random actions. The optimal allocation is ruined when
individuals resort to either exploitation-only or exploration-only behavior,
where only partial coordination or even anti-coordination is observed.
Mechanism analysis reveals that a moderate level of exploration allows the
population to escape the local minima of metastable periodic states and reach
optimal coordination as the global minimum. Interestingly, the optimal
coordination is underpinned by a symmetry-breaking of action preferences,
where nearly half of the population chooses one side while the other half
prefers the other. The emergence of optimal coordination is robust to the
population size and other game parameters. Our work therefore provides a
natural solution to the Minority Game and sheds light on the resource
allocation problem in general. It also demonstrates the potential of the
proposed reinforcement-learning paradigm for deciphering many puzzles in the
socio-economic context.
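The setup described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the state encoding (here, the previous round's winning side), the reward scheme (1 for the minority side, 0 otherwise), and all parameter values are assumptions chosen for clarity. Exploration is implemented epsilon-greedily, one common way to realize the exploitation/exploration balance the paper discusses.

```python
import random

def run_minority_game(n_agents=101, rounds=2000, alpha=0.1,
                      gamma=0.9, epsilon=0.1, seed=0):
    """Q-learning agents repeatedly choosing one of two sides;
    whoever is on the minority side wins that round."""
    rng = random.Random(seed)
    # Q[agent][state][action]: state is the previous winning side (0 or 1)
    Q = [[[0.0, 0.0] for _ in range(2)] for _ in range(n_agents)]
    state = 0  # arbitrary initial state
    attendance = []  # number of agents choosing side 1 each round
    for _ in range(rounds):
        actions = []
        for q in Q:
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))      # exploration
            else:                                      # exploitation
                actions.append(0 if q[state][0] >= q[state][1] else 1)
        n_ones = sum(actions)
        minority = 1 if n_ones < n_agents - n_ones else 0
        next_state = minority
        # Standard Q-learning update for every agent
        for q, a in zip(Q, actions):
            reward = 1.0 if a == minority else 0.0
            q[state][a] += alpha * (reward + gamma * max(q[next_state])
                                    - q[state][a])
        state = next_state
        attendance.append(n_ones)
    return attendance

att = run_minority_game()
# Average attendance over the late rounds; optimal coordination
# would keep this near n_agents / 2
print(sum(att[-500:]) / 500)
```

Under this sketch, the symmetry-breaking the paper reports would show up as roughly half of the Q-tables settling on each side, keeping the attendance close to half the population rather than oscillating between extremes.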
Related papers
- Learning to Assist Humans without Inferring Rewards [65.28156318196397]
We build upon prior work that studies assistance through the lens of empowerment.
An assistive agent aims to maximize the influence of the human's actions.
We prove that these representations estimate a similar notion of empowerment to that studied by prior work.
arXiv Detail & Related papers (2024-11-04T21:31:04Z) - Learning in Multi-Objective Public Goods Games with Non-Linear Utilities [8.243788683895376]
We study learning in a novel multi-objective version of the Public Goods Game where agents have different risk preferences.
We study the interplay between such preference modelling and environmental uncertainty on the incentive alignment level in the game.
arXiv Detail & Related papers (2024-08-01T16:24:37Z) - MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with
Diverse Human Preferences [101.57443597426374]
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data.
We learn a mixture of preference distributions via an expectation-maximization algorithm to better represent diverse human preferences.
Our algorithm achieves an average improvement of more than 16% in win-rates over conventional RLHF algorithms.
arXiv Detail & Related papers (2024-02-14T03:56:27Z) - A Minimaximalist Approach to Reinforcement Learning from Human Feedback [49.45285664482369]
We present Self-Play Preference Optimization (SPO), an algorithm for reinforcement learning from human feedback.
Our approach is minimalist in that it does not require training a reward model nor unstable adversarial training.
We demonstrate that on a suite of continuous control tasks, we are able to learn significantly more efficiently than reward-model based approaches.
arXiv Detail & Related papers (2024-01-08T17:55:02Z) - Unsupervised Resource Allocation with Graph Neural Networks [0.0]
We present an approach for maximizing a global utility function by learning how to allocate resources in an unsupervised way.
We propose to learn the reward structure for near-optimal allocation policies with a GNN.
arXiv Detail & Related papers (2021-06-17T18:44:04Z) - Learning Strategies in Decentralized Matching Markets under Uncertain
Preferences [91.3755431537592]
We study the problem of decision-making in the setting of a scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z) - On Information Asymmetry in Competitive Multi-Agent Reinforcement
Learning: Convergence and Optimality [78.76529463321374]
We study the system of interacting non-cooperative two Q-learning agents.
We show that this information asymmetry can lead to a stable outcome of population learning.
arXiv Detail & Related papers (2020-10-21T11:19:53Z) - Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.