Related papers: Symmetry-Guided Multi-Agent Inverse Reinforcement Learning

Symmetry-Guided Multi-Agent Inverse Reinforcement Learning

URL: http://arxiv.org/abs/2509.08257v2
Date: Thu, 11 Sep 2025 02:34:59 GMT
Title: Symmetry-Guided Multi-Agent Inverse Reinforcement Learning
Authors: Yongkai Tian, Yirong Qi, Xin Yu, Wenjun Wu, Jie Luo,
Abstract summary: Inverse Reinforcement Learning (IRL) addresses the problem by inferring implicit reward functions from expert demonstrations.<n>Existing methods rely heavily on large amounts of expert demonstrations to accurately recover the reward function.<n>We propose a universal framework that integrates symmetry into existing multi-agent adversarial IRL algorithms.
Score: 13.021166929646718
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In robotic systems, the performance of reinforcement learning depends on the rationality of predefined reward functions. However, manually designed reward functions often lead to policy failures due to inaccuracies. Inverse Reinforcement Learning (IRL) addresses this problem by inferring implicit reward functions from expert demonstrations. Nevertheless, existing methods rely heavily on large amounts of expert demonstrations to accurately recover the reward function. The high cost of collecting expert demonstrations in robotic applications, particularly in multi-robot systems, severely hinders the practical deployment of IRL. Consequently, improving sample efficiency has emerged as a critical challenge in multi-agent inverse reinforcement learning (MIRL). Inspired by the symmetry inherent in multi-agent systems, this work theoretically demonstrates that leveraging symmetry enables the recovery of more accurate reward functions. Building upon this insight, we propose a universal framework that integrates symmetry into existing multi-agent adversarial IRL algorithms, thereby significantly enhancing sample efficiency. Experimental results from multiple challenging tasks have demonstrated the effectiveness of this framework. Further validation in physical multi-robot systems has shown the practicality of our method.

Related papers

Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization [52.74762030521324]
We propose a novel algorithm to learn reward functions from observed actions.<n>We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm.
arXiv Detail & Related papers (2026-01-19T04:12:51Z)
Application of LLM Guided Reinforcement Learning in Formation Control with Collision Avoidance [1.1718316049475228]
Multi-Agent Systems (MAS) excel at accomplishing complex objectives through the collaborative efforts of individual agents.<n>In this paper, we introduce a novel framework that aims to overcome the challenge of designing an effective reward function.<n>By giving large language models (LLMs) on the prioritization of tasks, our framework generates reward functions that can be dynamically adjusted online.
arXiv Detail & Related papers (2025-07-22T09:26:00Z)
Next-Future: Sample-Efficient Policy Learning for Robotic-Arm Tasks [6.991281327290525]
We introduce a novel replay strategy, "Next-Future", which focuses on rewarding single-step transitions.<n>This approach significantly enhances sample efficiency and accuracy in learning multi-goal Markov decision processes.
arXiv Detail & Related papers (2025-04-15T14:45:51Z)
Attention-Augmented Inverse Reinforcement Learning with Graph Convolutions for Multi-Agent Task Allocation [0.29998889086656577]
Multi-agent task allocation (MATA) plays a vital role in cooperative multi-agent systems.<n>Inverse reinforcement learning (IRL)-based framework is proposed to enhance reward function learning and task execution efficiency.<n>Experiments validate the superiority of the proposed method over widely used multi-agent reinforcement learning (MARL) algorithms.
arXiv Detail & Related papers (2025-04-07T13:14:45Z)
Affordance-Guided Reinforcement Learning via Visual Prompting [51.361977466993345]
Keypoint-based Affordance Guidance for Improvements (KAGI) is a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL.<n>On diverse real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 30K online fine-tuning steps.
arXiv Detail & Related papers (2024-07-14T21:41:29Z)
RILe: Reinforced Imitation Learning [60.63173816209543]
RILe (Reinforced Learning) is a framework that combines the strengths of imitation learning and inverse reinforcement learning to learn a dense reward function efficiently.<n>Our framework produces high-performing policies in high-dimensional tasks where direct imitation fails to replicate complex behaviors.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity [114.88145406445483]
Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications. In practice the choice of reward function can be crucial for good results.
arXiv Detail & Related papers (2022-10-18T04:21:25Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments. Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations. The solution based on our algorithm beats all the solutions for the famous MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
Learning Sparse Rewarded Tasks from Sub-Optimal Demonstrations [78.94386823185724]
Imitation learning learns effectively in sparse-rewarded tasks by leveraging the existing expert demonstrations. In practice, collecting a sufficient amount of expert demonstrations can be prohibitively expensive. We propose Self-Adaptive Learning (SAIL) that can achieve (near) optimal performance given only a limited number of sub-optimal demonstrations.
arXiv Detail & Related papers (2020-04-01T15:57:15Z)
Balance Between Efficient and Effective Learning: Dense2Sparse Reward Shaping for Robot Manipulation with Environment Uncertainty [14.178202899299267]
We propose a simple but powerful reward shaping method, namely Dense2Sparse. It combines the advantage of fast convergence of dense reward and the noise isolation of the sparse reward, to achieve a balance between learning efficiency and effectiveness. The experiment results show that the Dense2Sparse method obtained higher expected reward compared with the ones using standalone dense reward or sparse reward, and it also has a superior tolerance of system uncertainty.
arXiv Detail & Related papers (2020-03-05T16:10:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.