Related papers: Agile Flight Emerges from Multi-Agent Competitive Racing

Agile Flight Emerges from Multi-Agent Competitive Racing

URL: http://arxiv.org/abs/2512.11781v1
Date: Fri, 12 Dec 2025 18:48:50 GMT
Title: Agile Flight Emerges from Multi-Agent Competitive Racing
Authors: Vineet Pasumarti, Lorenzo Bianchi, Antonio Loquercio,
Abstract summary: We find that both agile flight and strategy emerge from agents trained with reinforcement learning.<n>We find that multi-agent competition yields policies that transfer more reliably to the real world than policies trained with a single-agent progress-based reward.
Score: 7.9331622838838305
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Through multi-agent competition and the sparse high-level objective of winning a race, we find that both agile flight (e.g., high-speed motion pushing the platform to its physical limits) and strategy (e.g., overtaking or blocking) emerge from agents trained with reinforcement learning. We provide evidence in both simulation and the real world that this approach outperforms the common paradigm of training agents in isolation with rewards that prescribe behavior, e.g., progress on the raceline, in particular when the complexity of the environment increases, e.g., in the presence of obstacles. Moreover, we find that multi-agent competition yields policies that transfer more reliably to the real world than policies trained with a single-agent progress-based reward, despite the two methods using the same simulation environment, randomization strategy, and hardware. In addition to improved sim-to-real transfer, the multi-agent policies also exhibit some degree of generalization to opponents unseen at training time. Overall, our work, following in the tradition of multi-agent competitive game-play in digital domains, shows that sparse task-level rewards are sufficient for training agents capable of advanced low-level control in the physical world. Code: https://github.com/Jirl-upenn/AgileFlight_MultiAgent

Related papers

MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation [64.2621682259008]
Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2)<n>We propose a Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2) to integrate policy learning with multi-agent tree search.<n>We show that MARTI-MARS2 achieves 77.7%, outperforming strong baselines like GPT-5.1 on challenging code generation benchmarks.
arXiv Detail & Related papers (2026-02-08T07:28:44Z)
Robust Agents in Open-Ended Worlds [4.199586801784625]
In this thesis, we harness methodologies from open-endedness and multi-agent learning to train and evaluate robust AI agents.<n>We begin by introducing MiniHack, a sandbox framework for creating diverse environments through procedural content generation.<n>We then present Maestro, a novel approach for generating adversarial curricula that progressively enhance the robustness and generality of RL agents in two-player zero-sum games.
arXiv Detail & Related papers (2025-12-09T00:30:33Z)
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning [129.44038804430542]
We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL.<n>We propose ScalingInter-RL, a training approach designed for exploration-exploitation balance and stable RL optimization.<n>Our agents match or surpass commercial models on 27 tasks across diverse environments.
arXiv Detail & Related papers (2025-09-10T16:46:11Z)
Generalizable Agent Modeling for Agent Collaboration-Competition Adaptation with Multi-Retrieval and Dynamic Generation [19.74776726500979]
Adapting a single agent to a new multi-agent system brings challenges, necessitating adjustments across various tasks, environments, and interactions with unknown teammates and opponents.<n>We propose a more comprehensive setting, Agent Collaborative-Competitive Adaptation, which evaluates an agent to generalize across diverse scenarios.<n>In ACCA, agents adjust to task and environmental changes, collaborate with unseen teammates, and compete against unknown opponents.
arXiv Detail & Related papers (2025-06-20T03:28:18Z)
Multi-Agent Training for Pommerman: Curriculum Learning and Population-based Self-Play Approach [11.740631954398292]
Pommerman is an ideal benchmark for multi-agent training, providing a battleground for two teams with communication capabilities among allied agents.<n>This study introduces a system designed to train multi-agent systems to play Pommerman using a combination of curriculum learning and population-based self-play.
arXiv Detail & Related papers (2024-06-30T11:14:29Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games [13.060023718506917]
We develop a new multi-agent imitation learning model for predicting next moves of the opponents. We also present a new multi-agent reinforcement learning algorithm that combines our imitation learning model and policy training into one single training process. Experimental results show that our approach achieves superior performance compared to existing state-of-the-art multi-agent RL algorithms.
arXiv Detail & Related papers (2023-08-20T07:30:13Z)
Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets. One agent learned by offline MARL often inherits this random policy, jeopardizing the performance of the entire team. We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
Coach-assisted Multi-Agent Reinforcement Learning Framework for Unexpected Crashed Agents [120.91291581594773]
We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes. We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training. To the best of our knowledge, this work is the first to study the unexpected crashes in the multi-agent system.
arXiv Detail & Related papers (2022-03-16T08:22:45Z)
Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards. We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences. We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World [49.80905654161763]
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP) The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur. The ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible.
arXiv Detail & Related papers (2021-03-30T17:13:29Z)
Natural Emergence of Heterogeneous Strategies in Artificially Intelligent Competitive Teams [0.0]
We develop a competitive multi agent environment called FortAttack in which two teams compete against each other. We observe a natural emergence of heterogeneous behavior amongst homogeneous agents when such behavior can lead to the team's success. We propose ensemble training, in which we utilize the evolved opponent strategies to train a single policy for friendly agents.
arXiv Detail & Related papers (2020-07-06T22:35:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.