ESP: Exploiting Symmetry Prior for Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2307.16186v2
- Date: Wed, 9 Aug 2023 09:06:22 GMT
- Title: ESP: Exploiting Symmetry Prior for Multi-Agent Reinforcement Learning
- Authors: Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Jie Luo, Wenjun Wu
- Abstract summary: Multi-agent reinforcement learning (MARL) has achieved promising results in recent years.
This paper proposes a framework for exploiting prior knowledge by integrating data augmentation and a well-designed consistency loss.
- Score: 22.733348449818838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent reinforcement learning (MARL) has achieved promising results in
recent years. However, most existing reinforcement learning methods require a
large amount of data for model training. In addition, data-efficient
reinforcement learning requires the construction of strong inductive biases,
which are ignored in the current MARL approaches. Inspired by the symmetry
phenomenon in multi-agent systems, this paper proposes a framework for
exploiting prior knowledge by integrating data augmentation and a well-designed
consistency loss into the existing MARL methods. In addition, the proposed
framework is model-agnostic and can be applied to most of the current MARL
algorithms. Experimental tests on multiple challenging tasks demonstrate the
effectiveness of the proposed framework. Moreover, the proposed framework is
applied to a physical multi-robot testbed to show its superiority.
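The abstract names two ingredients, symmetry-based data augmentation and a consistency loss, without giving formulas. The sketch below is one minimal illustration, not the authors' implementation. It assumes the symmetry is a 2D rotation, that observations are lists of 2D vectors, that actions are 2D vectors, and that the reward is rotation-invariant. All function names (`rotate`, `augment_transition`, `consistency_loss`) and both toy policies are hypothetical.

```python
import math


def rotate(vec, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def augment_transition(obs, action, reward, next_obs, theta):
    """Return a rotated copy of one transition (assumed symmetry: 2D rotation).

    obs / next_obs: lists of 2D vectors (hypothetical layout).
    action: a single 2D vector.
    The reward is assumed to be unchanged by the rotation.
    """
    return (
        [rotate(v, theta) for v in obs],
        rotate(action, theta),
        reward,
        [rotate(v, theta) for v in next_obs],
    )


def consistency_loss(policy, obs, theta):
    """Squared distance between policy(rotated obs) and rotated policy(obs).

    The value is zero when the policy is equivariant to the rotation.
    """
    act_on_rotated = policy([rotate(v, theta) for v in obs])
    rotated_act = rotate(policy(obs), theta)
    return sum((p - q) ** 2 for p, q in zip(act_on_rotated, rotated_act))


def equivariant_policy(obs):
    """Toy policy: move toward the mean of the observed vectors."""
    n = len(obs)
    return (sum(v[0] for v in obs) / n, sum(v[1] for v in obs) / n)


def biased_policy(obs):
    """Toy policy: always push along +x, ignoring the observation."""
    return (1.0, 0.0)


if __name__ == "__main__":
    obs = [(1.0, 0.0), (0.0, 2.0)]
    theta = math.pi / 2
    print(augment_transition(obs, (1.0, 0.0), 0.5, obs, theta))
    print(consistency_loss(equivariant_policy, obs, theta))
    print(consistency_loss(biased_policy, obs, theta))
```

In a training loop, the augmented transitions would be added to the replay data and the consistency term added to the base MARL loss with some weight. How the paper combines and weights these terms is not stated in the abstract.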
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct [148.39859547619156]
We propose MMEvol, a novel multimodal instruction data evolution framework.
MMEvol iteratively improves data quality through a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution.
Our approach reaches state-of-the-art (SOTA) performance in nine tasks using significantly less data compared to state-of-the-art models.
arXiv Detail & Related papers (2024-09-09T17:44:00Z)
- Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques [65.55451717632317]
We study Multi-Agent Reinforcement Learning from Human Feedback (MARLHF), exploring both theoretical foundations and empirical validations.
We define the task as identifying Nash equilibrium from a preference-only offline dataset in general-sum games.
Our findings underscore the multifaceted approach required for MARLHF, paving the way for effective preference-based multi-agent systems.
arXiv Detail & Related papers (2024-09-01T13:14:41Z)
- Representation Learning For Efficient Deep Multi-Agent Reinforcement Learning [10.186029242664931]
We present MAPO-LSO which applies a form of comprehensive representation learning devised to supplement MARL training.
Specifically, MAPO-LSO proposes a multi-agent extension of transition dynamics reconstruction and self-predictive learning.
Empirical results demonstrate that MAPO-LSO achieves notable improvements in sample efficiency and learning performance compared to its vanilla MARL counterpart.
arXiv Detail & Related papers (2024-06-05T03:11:44Z)
- Demonstration Guided Multi-Objective Reinforcement Learning [2.9845592719739127]
We introduce demonstration-guided multi-objective reinforcement learning (DG-MORL).
This novel approach utilizes prior demonstrations, aligns them with user preferences via corner weight support, and incorporates a self-evolving mechanism to refine suboptimal demonstrations.
Our empirical studies demonstrate DG-MORL's superiority over existing MORL algorithms, establishing its robustness and efficacy.
arXiv Detail & Related papers (2024-04-05T10:19:04Z)
- Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement [69.51496713076253]
In this paper, we focus on the aforementioned efficiency aspects of existing MTL methods.
We first carry out large-scale experiments of the methods with smaller backbones and on the MetaGraspNet dataset as a new test ground.
We also propose Feature Disentanglement measure as a novel and efficient identifier of the challenges in MTL.
arXiv Detail & Related papers (2024-02-05T22:15:55Z)
- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages learning representation to be both temporal and agent-level predictive by reconstructing the masked agent observation in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
- Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects [23.347535672670688]
Multi-Agent Reinforcement Learning (MARL) tackles sequential decision-making problems involving multiple participants.
MARL requires a tremendous number of samples for effective training.
Model-based methods have been shown to achieve provable advantages of sample efficiency.
arXiv Detail & Related papers (2022-03-20T17:24:47Z)
- MM-KTD: Multiple Model Kalman Temporal Differences for Reinforcement Learning [36.14516028564416]
This paper proposes an innovative Multiple Model Kalman Temporal Difference (MM-KTD) framework to learn optimal control policies.
An active learning method is proposed to enhance the sampling efficiency of the system.
Experimental results show the superiority of the MM-KTD framework over its state-of-the-art counterparts.
arXiv Detail & Related papers (2020-05-30T06:39:55Z)
- Dynamic Knowledge embedding and tracing [18.717482292051788]
We propose a novel approach to knowledge tracing that combines techniques from matrix factorization with recent progress in recurrent neural networks (RNNs).
The proposed DynEmb framework enables the tracking of student knowledge even without the concept/skill tag information.
arXiv Detail & Related papers (2020-05-18T21:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.