Light Aircraft Game: Basic Implementation and training results analysis
- URL: http://arxiv.org/abs/2506.14164v1
- Date: Tue, 17 Jun 2025 03:57:28 GMT
- Title: Light Aircraft Game: Basic Implementation and training results analysis
- Authors: Hanzhong Cao
- Abstract summary: This paper investigates multi-agent reinforcement learning (MARL) in a partially observable, cooperative-competitive combat environment known as LAG. We describe the environment's setup, including agent actions, hierarchical controls, and reward design across different combat modes such as No Weapon and ShootMissile. Two representative algorithms are evaluated: HAPPO, an on-policy hierarchical variant of PPO, and HASAC, an off-policy method based on soft actor-critic.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates multi-agent reinforcement learning (MARL) in a partially observable, cooperative-competitive combat environment known as LAG. We describe the environment's setup, including agent actions, hierarchical controls, and reward design across different combat modes such as No Weapon and ShootMissile. Two representative algorithms are evaluated: HAPPO, an on-policy hierarchical variant of PPO, and HASAC, an off-policy method based on soft actor-critic. We analyze their training stability, reward progression, and inter-agent coordination capabilities. Experimental results show that HASAC performs well in simpler coordination tasks without weapons, while HAPPO demonstrates stronger adaptability in more dynamic and expressive scenarios involving missile combat. These findings provide insights into the trade-offs between on-policy and off-policy methods in multi-agent settings.
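To make the on-policy/off-policy distinction at the heart of this comparison concrete, the minimal Python sketch below contrasts the two data flows: a HAPPO-style learner consumes trajectories collected by the current policy and then discards them, while a HASAC-style learner replays transitions from a persistent buffer. The environment, class, and function names are hypothetical stand-ins, not the LAG codebase or the authors' implementations.

```python
import random
from collections import deque

class DummyCombatEnv:
    """Hypothetical stand-in for the LAG environment (not its real API)."""
    def reset(self):
        return 0.0                                  # toy observation
    def step(self, action):
        obs = random.random()
        reward = 1.0 if action > 0.5 else 0.0       # toy reward signal
        done = random.random() < 0.1
        return obs, reward, done

def on_policy_update(rollout):
    """HAPPO-style step: learn only from trajectories gathered by the
    current policy, then throw them away (fresh data every update)."""
    episode_return = sum(r for _, _, r in rollout)
    # ...a real implementation would compute advantages and take a
    # clipped PPO step per agent here...
    return episode_return

def off_policy_update(buffer, batch_size=32):
    """HASAC-style step: sample a minibatch of stored transitions, so
    old experience is reused many times (better sample efficiency)."""
    batch = random.sample(list(buffer), min(batch_size, len(buffer)))
    # ...a real implementation would compute soft actor-critic losses
    # on `batch` here...
    return len(batch)

env = DummyCombatEnv()
replay_buffer = deque(maxlen=10_000)

for episode in range(5):
    obs, done, rollout = env.reset(), False, []
    while not done:
        action = random.random()                    # placeholder policy
        next_obs, reward, done = env.step(action)
        rollout.append((obs, action, reward))       # on-policy batch
        replay_buffer.append((obs, action, reward, next_obs))
        obs = next_obs
    on_policy_update(rollout)        # uses this episode's data, then discards it
    off_policy_update(replay_buffer) # reuses the whole accumulated buffer
```

This structural difference is consistent with the paper's findings: buffer reuse lets an off-policy method extract more learning from each interaction in simpler tasks, while always-fresh data helps an on-policy method track fast-changing missile-combat scenarios.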
Related papers
- Reinforcement Learning for Decision-Level Interception Prioritization in Drone Swarm Defense [56.47577824219207]
We present a case study demonstrating the practical advantages of reinforcement learning in addressing this challenge. We introduce a high-fidelity simulation environment that captures realistic operational constraints. The agent learns to coordinate multiple effectors for optimal interception prioritization. We evaluate the learned policy against a handcrafted rule-based baseline across hundreds of simulated attack scenarios.
arXiv Detail & Related papers (2025-08-01T13:55:39Z) - Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning [15.539607264374242]
Multi-agent reinforcement learning (MARL) has achieved strong performance in cooperative adversarial tasks. We propose a dynamic curriculum learning framework that employs a self-adaptive difficulty adjustment mechanism. Our method improves both training stability and final performance, achieving competitive results against state-of-the-art methods.
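The self-adaptive difficulty adjustment described above can be pictured as a simple threshold rule: raise task difficulty when the agents win often, lower it when they struggle. This is a hedged sketch of the general idea only; the thresholds, step size, and function name are hypothetical and not taken from the paper.

```python
def adjust_difficulty(difficulty, win_rate,
                      promote_at=0.7, demote_at=0.3, step=0.1):
    """Hypothetical self-adaptive rule: raise scenario difficulty when the
    team wins often, lower it when the team struggles."""
    if win_rate >= promote_at:
        return min(1.0, difficulty + step)
    if win_rate <= demote_at:
        return max(0.0, difficulty - step)
    return difficulty

difficulty = 0.2
for win_rate in [0.80, 0.75, 0.40, 0.20, 0.90]:    # per-stage evaluations
    difficulty = adjust_difficulty(difficulty, win_rate)
    print(f"win_rate={win_rate:.2f} -> difficulty={difficulty:.2f}")
```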
arXiv Detail & Related papers (2025-06-09T08:38:18Z) - Enhancing Aerial Combat Tactics through Hierarchical Multi-Agent Reinforcement Learning [38.15185397658309]
This work presents a Hierarchical Multi-Agent Reinforcement Learning framework for analyzing simulated air combat scenarios. The objective is to identify effective Courses of Action that lead to mission success within preset simulations.
arXiv Detail & Related papers (2025-05-13T22:13:48Z) - A Hierarchical Reinforcement Learning Framework for Multi-UAV Combat Using Leader-Follower Strategy [3.095786524987445]
Multi-UAV air combat is a complex task involving multiple autonomous UAVs. Previous approaches predominantly discretize the action space into predefined actions. We propose a hierarchical framework utilizing the Leader-Follower Multi-Agent Proximal Policy Optimization strategy.
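A leader-follower hierarchy of the kind this paper proposes splits decision-making into a leader that issues a team-level command and followers that condition their low-level actions on it. The sketch below illustrates that decomposition with hypothetical hand-written policies; the actual method trains both levels with Multi-Agent PPO rather than using fixed rules.

```python
import random

def leader_policy(global_obs):
    """Hypothetical leader: issues one team-level command per step."""
    return "engage" if global_obs > 0.5 else "evade"

def follower_policy(local_obs, command):
    """Hypothetical follower: picks a low-level maneuver conditioned on
    the leader's command (the leader-follower decomposition in spirit)."""
    if command == "engage":
        return "close_distance" if local_obs > 0.3 else "hold_position"
    return "break_away"

global_obs = random.random()
command = leader_policy(global_obs)
follower_actions = [follower_policy(random.random(), command) for _ in range(3)]
print(command, follower_actions)
```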
arXiv Detail & Related papers (2025-01-22T02:41:36Z) - Data-Driven Distributed Common Operational Picture from Heterogeneous Platforms using Multi-Agent Reinforcement Learning [1.3469274919926262]
The integration of unmanned platforms promises to enhance situational awareness and mitigate the "fog of war" in military operations.
Managing the vast influx of data from these platforms poses a significant challenge for Command and Control (C2) systems.
This study presents a novel multi-agent learning framework to address this challenge.
arXiv Detail & Related papers (2024-11-08T16:31:22Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that uses those inferred goals to plan the agent's best response.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
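The two-module structure can be illustrated as a Bayesian belief update over opponent goals followed by a best-response choice. The goal set, action tables, and planner below are hypothetical toys; in HOP both the goal-conditioned policies and the planner are learned or search-based components, not fixed tables.

```python
GOALS = ["attack", "defend"]

# Hypothetical goal-conditioned action preferences. In HOP these would be
# learned goal-conditioned policies, not fixed probability tables.
POLICY = {
    "attack": {"advance": 0.8, "retreat": 0.2},
    "defend": {"advance": 0.3, "retreat": 0.7},
}

def infer_goal(belief, observed_action):
    """Opponent-modeling step: Bayesian update of the belief over the
    opponent's goal after observing one of its actions."""
    posterior = {g: belief[g] * POLICY[g][observed_action] for g in GOALS}
    total = sum(posterior.values())
    return {g: p / total for g, p in posterior.items()}

def plan_response(belief):
    """Planning step (toy stand-in for HOP's planner): counter the most
    likely opponent goal."""
    likely_goal = max(belief, key=belief.get)
    return "retreat" if likely_goal == "attack" else "advance"

belief = {g: 1.0 / len(GOALS) for g in GOALS}      # uniform prior
for observed in ["advance", "advance", "retreat"]:
    belief = infer_goal(belief, observed)
    print(observed, belief, "->", plan_response(belief))
```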
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - Decomposed Soft Actor-Critic Method for Cooperative Multi-Agent Reinforcement Learning [10.64928897082273]
Experimental results demonstrate that mSAC significantly outperforms the policy-based approach COMA. In addition, mSAC achieves strong results on large-action-space tasks such as 2c_vs_64zg and MMM2.
arXiv Detail & Related papers (2021-04-14T07:02:40Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of the replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations. The solution based on our algorithm outperforms all other solutions in the well-known MineRL competition and enables the agent to mine a diamond in the Minecraft environment.
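One way to picture goal-oriented structuring of a replay buffer is to split a demonstration into segments that each end at an achieved sub-goal and store them in per-sub-goal buffers. The sketch below is an illustrative simplification under that reading; the data format and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical demonstration: (observation, achieved_subgoal_or_None) pairs.
demo = [("s0", None), ("s1", "get_wood"), ("s2", None),
        ("s3", "make_planks"), ("s4", "get_stone")]

def structure_by_subgoal(transitions):
    """Split a demonstration into segments that each end at an achieved
    sub-goal and store them in per-sub-goal buffers (one simplified
    reading of goal-oriented replay-buffer structuring)."""
    buffers, segment = defaultdict(list), []
    for obs, subgoal in transitions:
        segment.append(obs)
        if subgoal is not None:            # sub-goal reached: close segment
            buffers[subgoal].extend(segment)
            segment = []
    return dict(buffers)

print(structure_by_subgoal(demo))
# {'get_wood': ['s0', 's1'], 'make_planks': ['s2', 's3'], 'get_stone': ['s4']}
```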
arXiv Detail & Related papers (2020-06-17T15:38:40Z) - FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
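The core idea of a factored centralised critic is to estimate joint value from per-agent utilities combined by a mixing function. The sketch below uses a fixed linear mixer purely for illustration; FACMAC itself learns the factored critic (and does not restrict the mixing to be monotonic), so none of these names or weights come from the paper.

```python
import random

N_AGENTS = 3

def agent_utility(obs, action):
    """Hypothetical per-agent utility u_i(o_i, a_i)."""
    return obs * action

def factored_critic(utilities, weights, bias):
    """Combine per-agent utilities into a joint value estimate; the
    fixed linear mixer here is purely illustrative, whereas FACMAC
    learns this factored critic from data."""
    return sum(w * u for w, u in zip(weights, utilities)) + bias

observations = [random.random() for _ in range(N_AGENTS)]
actions = [random.random() for _ in range(N_AGENTS)]
utilities = [agent_utility(o, a) for o, a in zip(observations, actions)]
q_joint = factored_critic(utilities, weights=[0.5, 0.3, 0.2], bias=0.1)
print(f"joint value estimate: {q_joint:.3f}")
```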
arXiv Detail & Related papers (2020-03-14T21:29:09Z) - Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the adversarial training (AT) procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
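Hypersphere embedding, roughly speaking, normalises features and class weights onto the unit hypersphere so that logits depend on angles rather than magnitudes. The sketch below shows that normalisation step only; margin terms and the integration with adversarial training are omitted, and the function names are hypothetical.

```python
import math

def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def angular_logit(feature, class_weight, scale=10.0):
    """Logit computed from the angle between the normalised feature and
    the normalised class weight, so vector magnitudes no longer matter."""
    f, w = l2_normalize(feature), l2_normalize(class_weight)
    return scale * sum(a * b for a, b in zip(f, w))

print(angular_logit([3.0, 4.0], [1.0, 0.0]))       # 10 * cos(theta) = 6.0
```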
arXiv Detail & Related papers (2020-02-20T08:42:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.