Reinforcement Learning Based Self-play and State Stacking Techniques for Noisy Air Combat Environment
- URL: http://arxiv.org/abs/2303.03068v1
- Date: Mon, 6 Mar 2023 12:23:23 GMT
- Title: Reinforcement Learning Based Self-play and State Stacking Techniques for Noisy Air Combat Environment
- Authors: Ahmet Semih Tasbas, Safa Onur Sahin, Nazim Kemal Ure
- Abstract summary: The complexity of air combat arises from aggressive close-range maneuvers and agile enemy behaviors.
In this study, we developed an air combat simulation, which provides noisy observations to the agents.
We present a state stacking method for noisy RL environments as a noise reduction technique.
- Score: 1.7403133838762446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) has recently proven itself a powerful
instrument for solving complex problems and has even surpassed human performance
in several challenging applications. This suggests that RL algorithms can be
applied to the autonomous air combat problem, which has been studied for many
years. The complexity of air combat arises from aggressive close-range maneuvers
and agile enemy behaviors. In addition to these complexities, real-life scenarios
may involve uncertainties due to sensor errors, which prevent accurate estimation
of the enemy's actual position. Autonomous aircraft should therefore be successful
even in noisy environments. In this study, we developed an air combat simulation
that provides noisy observations to the agents, making the air combat problem
even more challenging. To address this, we present a state stacking method for
noisy RL environments as a noise reduction technique. In our extensive set of
experiments, the proposed method significantly outperforms the baseline
algorithms in terms of winning ratio, and the performance improvement is even
more pronounced at high noise levels. In addition, we incorporate a self-play
scheme into our training process by periodically updating the enemy with a
frozen copy of the training agent. In this way, the training agent fights
against an enemy with progressively smarter strategies, which improves the
performance and robustness of the agents. In our simulations, we demonstrate
that the self-play scheme provides significant performance gains compared to
classical RL training.
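To make the noisy-observation setup concrete, here is a minimal sketch of a sensor-error model of the kind the abstract describes. The paper does not specify the noise distribution or publish code, so the function name `noisy_enemy_observation`, the Gaussian noise, and the `noise_std` parameter are all assumptions:

```python
import numpy as np

def noisy_enemy_observation(true_state, noise_std, rng=None):
    """Hypothetical sensor-error model (Gaussian noise is an assumption):
    the agent never sees the enemy's true state, only a perturbed
    measurement whose error scales with `noise_std`."""
    rng = rng or np.random.default_rng()
    return true_state + rng.normal(0.0, noise_std, size=true_state.shape)
```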
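The state stacking method itself can likewise be sketched as an observation wrapper. Again, the authors publish no code, so the class `StackedObservationWrapper`, the gym-style `reset`/`step` interface, and the `stack_size` default are hypothetical; the idea is simply that concatenating the last few noisy observations lets the policy network act as a learned filter, consistent with the reported gains at high noise levels:

```python
import numpy as np
from collections import deque

class StackedObservationWrapper:
    """Hypothetical sketch of state stacking as a noise-reduction technique:
    the agent receives the last `stack_size` noisy observations concatenated,
    so the policy network can implicitly average out sensor noise."""

    def __init__(self, env, stack_size=4):
        self.env = env
        self.stack_size = stack_size
        self.frames = deque(maxlen=stack_size)

    def reset(self):
        obs = self.env.reset()
        # Fill the stack with copies of the first observation so the
        # stacked shape is fixed from the very first step.
        for _ in range(self.stack_size):
            self.frames.append(obs)
        return np.concatenate(list(self.frames))

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.frames.append(obs)  # the oldest noisy observation drops out
        return np.concatenate(list(self.frames)), reward, done, info
```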
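Finally, the self-play scheme described above reduces to a short training loop in which the enemy is periodically replaced by a frozen snapshot of the training agent. Everything here (the `act`/`observe` hooks, the two-player `env.step`, the `update_interval` value) is a hypothetical interface, not the authors' implementation:

```python
import copy

def train_with_self_play(agent, env, total_steps, update_interval=50_000):
    """Hypothetical sketch of the self-play scheme: every `update_interval`
    steps the enemy policy becomes a frozen (never-updated) copy of the
    current training agent, so the opponent grows progressively smarter."""
    enemy = copy.deepcopy(agent)  # initial opponent: a snapshot of the agent
    obs, enemy_obs = env.reset()
    for step in range(1, total_steps + 1):
        action = agent.act(obs)
        enemy_action = enemy.act(enemy_obs)  # frozen policy, never trained
        (obs, enemy_obs), reward, done = env.step(action, enemy_action)
        agent.observe(reward, done)          # hypothetical RL update hook
        if done:
            obs, enemy_obs = env.reset()
        if step % update_interval == 0:
            enemy = copy.deepcopy(agent)     # periodic opponent refresh
    return agent
```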
Related papers
- Intercepting Unauthorized Aerial Robots in Controlled Airspace Using Reinforcement Learning [2.519319150166215]
The proliferation of unmanned aerial vehicles (UAVs) in controlled airspace presents significant risks.
This work addresses the need for robust, adaptive systems capable of managing such threats through the use of Reinforcement Learning (RL).
We present a novel approach utilizing RL to train fixed-wing UAV pursuer agents for intercepting dynamic evader targets.
arXiv Detail & Related papers (2024-07-09T14:45:47Z)
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
- Adversarial Attacks on Reinforcement Learning Agents for Command and Control [6.05332129899857]
Recent work has shown that learning-based approaches are highly susceptible to adversarial perturbations.
In this paper, we investigate the robustness of an agent trained for a Command and Control task in an environment controlled by an adversary.
We empirically show that an agent trained using these algorithms is highly susceptible to noise injected by the adversary.
arXiv Detail & Related papers (2024-05-02T19:28:55Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
- Autonomous Agent for Beyond Visual Range Air Combat: A Deep Reinforcement Learning Approach [0.2578242050187029]
This work contributes to developing an agent based on deep reinforcement learning capable of acting in a beyond visual range (BVR) air combat simulation environment.
The paper presents an overview of building an agent representing a high-performance fighter aircraft that can learn and improve its role in BVR combat over time.
It also examines, via virtual simulation, a real pilot's ability to interact with the trained agent in the same environment and compares their performances.
arXiv Detail & Related papers (2023-04-19T13:54:37Z)
- Anchored Learning for On-the-Fly Adaptation -- Extended Technical Report [45.123633153460034]
This study presents "anchor critics", a novel strategy for enhancing the robustness of reinforcement learning (RL) agents in crossing the sim-to-real gap.
We identify that naive fine-tuning approaches lead to catastrophic forgetting, where policies maintain high rewards on frequently encountered states but lose performance on rarer, yet critical scenarios.
Evaluations demonstrate that our approach enables behavior retention in sim-to-sim gymnasium tasks and in sim-to-real scenarios with racing quadrotors, achieving a near-50% reduction in power consumption while maintaining controllable, stable flight.
arXiv Detail & Related papers (2023-01-17T16:16:53Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Homotopy Based Reinforcement Learning with Maximum Entropy for Autonomous Air Combat [3.839929995011407]
The reinforcement learning (RL) method can significantly shorten decision time by using neural networks.
However, the sparse reward problem limits its convergence speed, and an artificial prior-experience reward can easily deviate the policy from the optimal convergence direction of the original task.
We propose a homotopy-based soft actor-critic method (HSAC) that addresses these problems by following the homotopy path between the original task with sparse reward and an auxiliary task with an artificial prior-experience reward.
arXiv Detail & Related papers (2021-12-01T09:37:55Z)
- Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training [71.7750435554693]
We show that several state-of-the-art RL agents proposed for power system control are vulnerable to adversarial attacks.
Specifically, we use an adversary Markov Decision Process to learn an attack policy, and demonstrate the potency of our attack.
We propose to use adversarial training to increase the robustness of RL agents against attacks and to avoid infeasible operational decisions.
arXiv Detail & Related papers (2021-10-18T00:50:34Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of the replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm outperforms all solutions from the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)