Diffusion-RL Based Air Traffic Conflict Detection and Resolution Method
- URL: http://arxiv.org/abs/2509.03550v1
- Date: Tue, 02 Sep 2025 23:17:46 GMT
- Title: Diffusion-RL Based Air Traffic Conflict Detection and Resolution Method
- Authors: Tonghe Li, Jixin Liu, Weili Zeng, Hao Jiang,
- Abstract summary: This paper proposes a novel autonomous conflict resolution framework named Diffusion-AC.<n>Our framework models its policy as a reverse denoising process guided by a value function, enabling it to generate a rich, high-quality, and multimodal action distribution.<n>Extensive simulation experiments demonstrate that the proposed method significantly outperforms a suite of state-of-the-art DRL benchmarks.
- Score: 5.477141500588868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the context of continuously rising global air traffic, efficient and safe Conflict Detection and Resolution (CD&R) is paramount for air traffic management. Although Deep Reinforcement Learning (DRL) offers a promising pathway for CD&R automation, existing approaches commonly suffer from a "unimodal bias" in their policies. This leads to a critical lack of decision-making flexibility when confronted with complex and dynamic constraints, often resulting in "decision deadlocks." To overcome this limitation, this paper pioneers the integration of diffusion probabilistic models into the safety-critical task of CD&R, proposing a novel autonomous conflict resolution framework named Diffusion-AC. Diverging from conventional methods that converge to a single optimal solution, our framework models its policy as a reverse denoising process guided by a value function, enabling it to generate a rich, high-quality, and multimodal action distribution. This core architecture is complemented by a Density-Progressive Safety Curriculum (DPSC), a training mechanism that ensures stable and efficient learning as the agent progresses from sparse to high-density traffic environments. Extensive simulation experiments demonstrate that the proposed method significantly outperforms a suite of state-of-the-art DRL benchmarks. Most critically, in the most challenging high-density scenarios, Diffusion-AC not only maintains a high success rate of 94.1% but also reduces the incidence of Near Mid-Air Collisions (NMACs) by approximately 59% compared to the next-best-performing baseline, significantly enhancing the system's safety margin. This performance leap stems from its unique multimodal decision-making capability, which allows the agent to flexibly switch to effective alternative maneuvers.
Related papers
- Intent-Context Synergy Reinforcement Learning for Autonomous UAV Decision-Making in Air Combat [2.9612776591672443]
This paper proposes an Intent-Context Synergy Reinforcement Learning (ICS-RL) framework for autonomous UAV infiltration in contested environments.<n>An LSTM-based Intent Prediction Module forecasts the future trajectories of hostile units, transforming the decision paradigm from reactive avoidance to proactive planning.<n>A Context-Analysis Synergy Mechanism decomposes the mission into hierarchical sub-tasks (safe cruise, stealth planning, and hostile breakthrough)<n>A dynamic switching controller based on Max-Advantage values seamlessly integrates these agents, allowing the UAV to adaptively select the optimal policy without hard-coded rules.
arXiv Detail & Related papers (2026-03-01T08:05:32Z) - Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty [22.020160934935493]
Fuz-RL is a fuzzy measure-guided robust framework for safe RL.<n>We show that Fuz-RL effectively integrates with existing safe RL baselines in a model-free manner.
arXiv Detail & Related papers (2026-02-24T09:50:17Z) - BarrierSteer: LLM Safety via Learning Barrier Steering [83.12893815611052]
BarrierSteer is a novel framework that formalizes safety by embedding learned non-linear safety constraints directly into the model's latent representation space.<n>We show that BarrierSteer substantially reduces adversarial success rates, decreases unsafe generations, and outperforms existing methods.
arXiv Detail & Related papers (2026-02-23T18:19:46Z) - Safe Reinforcement Learning via Recovery-based Shielding with Gaussian Process Dynamics Models [57.006252510102506]
Reinforcement learning (RL) is a powerful framework for optimal decision-making and control but often lacks provable guarantees for safety-critical applications.<n>We introduce a novel recovery-based shielding framework that enables safe RL with a provable safety lower bound for unknown and non-linear continuous dynamical systems.
arXiv Detail & Related papers (2026-02-12T22:03:35Z) - SecDiff: Diffusion-Aided Secure Deep Joint Source-Channel Coding Against Adversarial Attacks [73.41290017870097]
SecDiff is a plug-and-play, diffusion-aided decoding framework.<n>It significantly enhances the security and robustness of deep J SCC under adversarial wireless environments.
arXiv Detail & Related papers (2025-11-03T11:24:06Z) - Adversarial Diffusion for Robust Reinforcement Learning [46.44328012099217]
We introduce Adversarial Diffusion for Robust Reinforcement Learning (AD-RRL)<n>AD-RRL guides the diffusion process to generate worst-case trajectories during training, effectively optimizing the Conditional Value at Risk (CVaR) of the cumulative return.<n> Empirical results across standard benchmarks show that AD-RRL achieves superior robustness and performance compared to existing robust RL methods.
arXiv Detail & Related papers (2025-09-28T12:34:35Z) - The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward [57.56453588632619]
A central paradox in fine-tuning Large Language Models (LLMs) with Reinforcement Learning with Verifiable Reward (RLVR) is the frequent degradation of multi-attempt performance.<n>This is often accompanied by catastrophic forgetting, where models lose previously acquired skills.<n>We argue that standard RLVR objectives lack a crucial mechanism for knowledge retention.
arXiv Detail & Related papers (2025-09-09T06:34:32Z) - Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments [6.956559003734227]
An unmanned aerial vehicles (UAVs) has been exposed to adversarial attacks that exploit vulnerabilities in reinforcement learning (RL)<n>This paper introduces an antifragile RL framework that enhances adaptability to broader distributional shifts.<n>It achieves superior performance, demonstrating shorter navigation path lengths and a higher rate of conflict-free navigation trajectories.
arXiv Detail & Related papers (2025-06-26T10:06:29Z) - Efficient Beam Selection for ISAC in Cell-Free Massive MIMO via Digital Twin-Assisted Deep Reinforcement Learning [37.540612510652174]
We derive the distribution of joint target detection probabilities across multiple receiving APs under false alarm rate constraints.<n>We then formulate the beam selection procedure as a Markov decision process (MDP)<n>To eliminate the high costs and associated risks of real-time agent-environment interactions, we propose a novel digital twin (DT)-assisted offline DRL approach.
arXiv Detail & Related papers (2025-06-23T12:17:57Z) - Distributionally Robust Constrained Reinforcement Learning under Strong Duality [37.76993170360821]
We study the problem of Distributionally Robust Constrained RL (DRC-RL)
The goal is to maximize the expected reward subject to environmental distribution shifts and constraints.
We develop an algorithmic framework based on strong duality that enables the first efficient and provable solution.
arXiv Detail & Related papers (2024-06-22T08:51:57Z) - DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints [68.82294911302579]
We introduce DiveR-CT, which relaxes conventional constraints on the objective and semantic reward, granting greater freedom for the policy to enhance diversity.<n>Our experiments demonstrate DiveR-CT's marked superiority over baselines by 1) generating data that perform better in various diversity metrics across different attack success rate levels, 2) better-enhancing resiliency in blue team models through safety tuning based on collected data, 3) allowing dynamic control of objective weights for reliable and controllable attack success rates, and 4) reducing susceptibility to reward overoptimization.
arXiv Detail & Related papers (2024-05-29T12:12:09Z) - Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications [73.58451824894568]
The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states.
In safety-critical domains, such behaviors could lead to disastrous outcomes.
We propose Objective Suppression, a novel method that adaptively suppresses the task reward maximizing objectives according to a safety critic.
arXiv Detail & Related papers (2024-02-23T23:22:06Z) - Multi-agent deep reinforcement learning with centralized training and
decentralized execution for transportation infrastructure management [0.0]
We present a multi-agent Deep Reinforcement Learning (DRL) framework for managing large transportation infrastructure systems over their life-cycle.
Life-cycle management of such engineering systems is a computationally intensive task, requiring appropriate sequential inspection and maintenance decisions.
arXiv Detail & Related papers (2024-01-23T02:52:36Z) - Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning [48.667697255912614]
Mean-field reinforcement learning addresses the policy of a representative agent interacting with the infinite population of identical agents.
We propose Safe-M$3$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions.
Our algorithm effectively meets the demand in critical areas while ensuring service accessibility in regions with low demand.
arXiv Detail & Related papers (2023-06-29T15:57:07Z) - Relative Distributed Formation and Obstacle Avoidance with Multi-agent
Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL)
Our method achieves better performance regarding formation error, formation convergence rate and on-par success rate of obstacle avoidance compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z) - False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves the SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL)
arXiv Detail & Related papers (2021-10-24T15:34:03Z) - Combining Pessimism with Optimism for Robust and Efficient Model-Based
Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.