Explainable and Safe Reinforcement Learning for Autonomous Air Mobility
- URL: http://arxiv.org/abs/2211.13474v1
- Date: Thu, 24 Nov 2022 08:47:06 GMT
- Title: Explainable and Safe Reinforcement Learning for Autonomous Air Mobility
- Authors: Lei Wang, Hongyu Yang, Yi Lin, Suwan Yin, Yuankai Wu
- Abstract summary: This article presents a novel deep reinforcement learning (DRL) controller to aid conflict resolution for autonomous free flight.
We design a fully explainable DRL framework wherein we: 1) decompose the coupled Q-value learning model into a safety-awareness model and an efficiency (reach-the-target) model; and 2) use information from surrounding intruders as inputs, eliminating the need for a central controller.
We also propose an adversarial attack strategy that can impose both safety-oriented and efficiency-oriented attacks.
- Score: 13.038383326602764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Increasing traffic demands, higher levels of automation, and communication
enhancements provide novel design opportunities for future air traffic
controllers (ATCs). This article presents a novel deep reinforcement learning
(DRL) controller to aid conflict resolution for autonomous free flight.
Although DRL has achieved important advancements in this field, the existing
works pay little attention to the explainability and safety issues related to
DRL controllers, particularly the safety under adversarial attacks. To address
those two issues, we design a fully explainable DRL framework wherein we: 1)
decompose the coupled Q-value learning model into a safety-awareness model and
an efficiency (reach-the-target) model; and 2) use information from surrounding
intruders as inputs, eliminating the need for a central controller. In our
simulated experiments, we show that by decoupling safety-awareness and
efficiency, we can exceed performance on free-flight control tasks while
dramatically improving explainability in practice. In addition, the safety
Q-learning module provides rich information about the safety situation of the
environment. To study safety under adversarial attacks, we additionally
propose an adversarial attack strategy that can impose both safety-oriented and
efficiency-oriented attacks. The adversary aims to minimize safety/efficiency
by attacking the agent at only a few time steps. In our experiments, the attack
strategy causes as many collisions as the uniform attack (i.e., attacking at
every time step) while attacking the agent four times less often, which
provides insights into the capabilities and limitations of DRL in future
ATC designs. The source code is publicly available at
https://github.com/WLeiiiii/Gym-ATC-Attack-Project.
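To make the two ideas in the abstract concrete, the sketch below shows (a) a Q-value model decoupled into a safety-awareness head and an efficiency (reach-the-target) head that are combined only at action-selection time, and (b) a simple trigger for attacking at only a few "important" time steps. This is an illustrative sketch under assumptions, not the authors' released implementation: the names (QHead, DecoupledQAgent, should_attack, safety_weight, threshold) are hypothetical, and the attack-timing criterion is a common value-gap heuristic rather than the paper's exact strategy.

```python
# Illustrative sketch (not the authors' code): a DRL agent whose Q-value model
# is decoupled into a safety-awareness head and an efficiency head, plus a
# few-step adversarial attack trigger. All names below are hypothetical.
import torch
import torch.nn as nn


class QHead(nn.Module):
    """Small MLP Q-network over intruder-relative observations."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class DecoupledQAgent:
    """Keeps safety and efficiency Q-values separate; each head stays
    individually inspectable, which is what supports the explainability claim."""

    def __init__(self, obs_dim: int, n_actions: int, safety_weight: float = 1.0):
        self.q_safety = QHead(obs_dim, n_actions)      # conflict/collision awareness
        self.q_efficiency = QHead(obs_dim, n_actions)  # progress toward the target
        self.safety_weight = safety_weight

    def act(self, obs: torch.Tensor) -> int:
        with torch.no_grad():
            q_total = self.safety_weight * self.q_safety(obs) + self.q_efficiency(obs)
        return int(q_total.argmax(dim=-1).item())


def should_attack(q_values: torch.Tensor, threshold: float = 0.5) -> bool:
    """Attack only at 'important' steps: when the gap between the best and worst
    action value is large, a small perturbation can change the chosen action the
    most. This value-gap heuristic is an assumption, not the paper's criterion."""
    gap = (q_values.max() - q_values.min()).item()
    return gap > threshold
```

In this setup, a safety-oriented attack would perturb the observation (within a small budget) only when should_attack fires on the safety head's values, while an efficiency-oriented attack would apply the same trigger to the efficiency head.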
Related papers
- Defining and Evaluating Physical Safety for Large Language Models [62.4971588282174]
Large Language Models (LLMs) are increasingly used to control robotic systems such as drones.
The physical threats and harm they could cause in real-world applications remain unexplored.
We classify the physical safety risks of drones into four categories: (1) human-targeted threats, (2) object-targeted threats, (3) infrastructure attacks, and (4) regulatory violations.
arXiv Detail & Related papers (2024-11-04T17:41:25Z) - Intercepting Unauthorized Aerial Robots in Controlled Airspace Using Reinforcement Learning [2.519319150166215]
The proliferation of unmanned aerial vehicles (UAVs) in controlled airspace presents significant risks.
This work addresses the need for robust, adaptive systems capable of managing such threats through the use of Reinforcement Learning (RL).
We present a novel approach utilizing RL to train fixed-wing UAV pursuer agents for intercepting dynamic evader targets.
arXiv Detail & Related papers (2024-07-09T14:45:47Z) - A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Safe Reinforcement Learning using Data-Driven Predictive Control [0.5459797813771499]
We propose a data-driven safety layer that acts as a filter for unsafe actions.
The safety layer penalizes the RL agent if the proposed action is unsafe and replaces it with the closest safe one.
In a simulation, we show that our method outperforms state-of-the-art safe RL methods on the robotics navigation problem.
arXiv Detail & Related papers (2022-11-20T17:10:40Z) - SAFER: Safe Collision Avoidance using Focused and Efficient Trajectory Search with Reinforcement Learning [34.934606949086096]
We present SAFER, an efficient and effective collision avoidance system.
It combines real-world reinforcement learning (RL), search-based online trajectory planning, and automatic emergency intervention.
Our real-world experiments show that, when compared with several baselines, our approach enjoys a higher average speed, lower crash rate, less emergency intervention, smaller overhead, and smoother overall control.
arXiv Detail & Related papers (2022-09-23T18:08:08Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z) - Dirty Road Can Attack: Security of Deep Learning based Automated Lane Centering under Physical-World Attack [38.3805893581568]
We study the security of state-of-the-art deep learning based ALC systems under physical-world adversarial attacks.
We formulate the problem with a safety-critical attack goal, and a novel and domain-specific attack vector: dirty road patches.
We evaluate our attack on a production ALC using 80 scenarios from real-world driving traces.
arXiv Detail & Related papers (2020-09-14T19:22:39Z) - Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.