Relational Reasoning via Set Transformers: Provable Efficiency and
Applications to MARL
- URL: http://arxiv.org/abs/2209.09845v1
- Date: Tue, 20 Sep 2022 16:42:59 GMT
- Title: Relational Reasoning via Set Transformers: Provable Efficiency and
Applications to MARL
- Authors: Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran
Yang, Zhaoran Wang
- Abstract summary: The cooperative Multi-Agent Reinforcement Learning (MARL) with permutation-invariant agents framework has achieved tremendous empirical success in real-world applications.
Unfortunately, the theoretical understanding of this MARL problem is lacking due to the curse of many agents and the limited exploration of relational reasoning in existing works.
We prove that the suboptimality gaps of the model-free and model-based algorithms are independent of and logarithmic in the number of agents respectively, which mitigates the curse of many agents.
- Score: 154.13105285663656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cooperative Multi-Agent Reinforcement Learning (MARL) with permutation
invariant agents framework has achieved tremendous empirical successes in
real-world applications. Unfortunately, the theoretical understanding of this
MARL problem is lacking due to the curse of many agents and the limited
exploration of relational reasoning in existing works. In this paper, we
verify that the transformer implements complex relational reasoning, and we
propose and analyze model-free and model-based offline MARL algorithms with the
transformer approximators. We prove that the suboptimality gaps of the
model-free and model-based algorithms are independent of and logarithmic in the
number of agents respectively, which mitigates the curse of many agents. These
results are consequences of a novel generalization error bound of the
transformer and a novel analysis of the Maximum Likelihood Estimate (MLE) of
the system dynamics with the transformer. Our model-based algorithm is the
first provably efficient MARL algorithm that explicitly exploits the
permutation invariance of the agents.
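As a toy illustration of the permutation invariance the paper's transformer approximators exploit, here is a minimal NumPy sketch (all weights and names are hypothetical, not taken from the paper): a single attention-pooling step with a learned seed query attends over all agents, so its output does not change when the agents are reordered.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(agent_feats, W_q, W_k, W_v, seed_query):
    # agent_feats: (n_agents, d). A fixed seed query attends over all
    # agents; since attention sums over the agent axis, the result is
    # invariant to the ordering of the agents.
    q = seed_query @ W_q                               # (1, d)
    k = agent_feats @ W_k                              # (n, d)
    v = agent_feats @ W_v                              # (n, d)
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # (1, n)
    return scores @ v                                  # (1, d)

rng = np.random.default_rng(0)
d, n = 4, 5
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
seed = rng.standard_normal((1, d))
x = rng.standard_normal((n, d))       # one feature vector per agent

out = attention_pool(x, W_q, W_k, W_v, seed)
perm = rng.permutation(n)
out_perm = attention_pool(x[perm], W_q, W_k, W_v, seed)
assert np.allclose(out, out_perm)     # permutation invariance holds exactly
```

Stacking such blocks yields a set-transformer-style architecture; the key point is only that pooling happens through attention over the agent set rather than through any order-dependent operation.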
Related papers
- Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization [15.898378661128334]
Reinforcement Learning (RL) algorithms are known to suffer from the curse of dimensionality.
We propose overcoming the curse of dimensionality by approximately factorizing the original Markov decision processes (MDPs) into smaller, independently evolving MDPs.
We provide improved sample complexity guarantees for both proposed algorithms.
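To see why factorization helps, consider a hypothetical best-case sketch (a construction for intuition, not the paper's algorithm): when an MDP is an exact product of independently evolving components with additive rewards, the optimal value of the joint MDP equals the sum of the components' optimal values, so each small factor can be solved separately instead of solving the exponentially larger joint problem.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=500):
    # P: (S, A, S) transition probabilities, R: (S, A) rewards.
    V = np.zeros(R.shape[0])
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)
    return V

rng = np.random.default_rng(1)

def random_mdp(S, A):
    P = rng.random((S, A, S))
    P /= P.sum(-1, keepdims=True)
    return P, rng.random((S, A))

P1, R1 = random_mdp(2, 2)
P2, R2 = random_mdp(3, 2)
S1, A1 = R1.shape
S2, A2 = R2.shape

# Joint MDP: components transition independently, rewards add.
P = np.einsum('iaj,kbl->ikabjl', P1, P2).reshape(S1 * S2, A1 * A2, S1 * S2)
R = (R1[:, None, :, None] + R2[None, :, None, :]).reshape(S1 * S2, A1 * A2)

V_joint = value_iteration(P, R)
V_factored = (value_iteration(P1, R1)[:, None]
              + value_iteration(P2, R2)[None, :]).ravel()
assert np.allclose(V_joint, V_factored, atol=1e-6)
```

The paper's contribution concerns the harder case where the factorization is only approximate; this sketch shows the exact limit that approximate factorization tries to get close to.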
arXiv Detail & Related papers (2024-11-12T07:08:00Z)
- Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis [30.713243690224207]
In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes.
This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with strong convergence and performance guarantees.
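For context on the quantile objective (a generic illustration, not the paper's algorithm): the Value-at-Risk at level tau is simply the tau-quantile of the return distribution, i.e. the threshold below which roughly a tau fraction of outcomes fall.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical return distribution for a fixed policy.
returns = rng.normal(loc=1.0, scale=0.5, size=10_000)

tau = 0.05
var_tau = np.quantile(returns, tau)  # Value-at-Risk at level tau

# By construction, about a tau fraction of outcomes lie below the VaR.
assert abs((returns < var_tau).mean() - tau) < 0.01
```

A quantile-optimizing agent prefers the policy whose tau-quantile of returns is largest, rather than the policy with the largest mean return.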
arXiv Detail & Related papers (2024-10-31T16:53:20Z)
- Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers [12.986126243018452]
We introduce the Softmax-Linked Additive Log-Odds Model (SLALOM), a novel surrogate model specifically designed to align with the transformer framework.
SLALOM demonstrates the capacity to deliver a range of faithful and insightful explanations across both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-05-22T11:14:00Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for the general-sum Markov Games (MGs) under the general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm provides comparable sublinear regret to the existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
- Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
- Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach [128.62787284435007]
We propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence.
In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors.
arXiv Detail & Related papers (2021-05-18T04:35:41Z)
- A variational inference framework for inverse problems [0.39373541926236766]
A framework is presented for fitting inverse problem models via variational Bayes approximations.
This methodology guarantees flexibility to statistical model specification for a broad range of applications.
An image processing application and a simulation exercise motivated by biomedical problems reveal the computational advantage offered by variational Bayes.
arXiv Detail & Related papers (2021-03-10T07:37:20Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage stepsize needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.