Transformer-based Multi-agent Reinforcement Learning for Separation Assurance in Structured and Unstructured Airspaces
- URL: http://arxiv.org/abs/2601.04401v1
- Date: Wed, 07 Jan 2026 21:18:28 GMT
- Title: Transformer-based Multi-agent Reinforcement Learning for Separation Assurance in Structured and Unstructured Airspaces
- Authors: Arsyi Aziz, Peng Wei
- Abstract summary: We show that a single encoder configuration can yield near-zero near mid-air collision rates and shorter loss-of-separation infringements than the deeper configurations. Our results suggest that the newly formulated state representation, novel design of neural network architecture, and proposed training strategy provide an adaptable and scalable decentralized solution for aircraft separation assurance.
- Score: 3.719121868494767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional optimization-based metering depends on strict adherence to precomputed schedules, which limits the flexibility required for the stochastic operations of Advanced Air Mobility (AAM). In contrast, multi-agent reinforcement learning (MARL) offers a decentralized, adaptive framework that can better handle uncertainty, required for safe aircraft separation assurance. Despite this advantage, current MARL approaches often overfit to specific airspace structures, limiting their adaptability to new configurations. To improve generalization, we recast the MARL problem in a relative polar state space and train a transformer encoder model across diverse traffic patterns and intersection angles. The learned model provides speed advisories to resolve conflicts while maintaining aircraft near their desired cruising speeds. In our experiments, we evaluated encoder depths of 1, 2, and 3 layers in both structured and unstructured airspaces, and found that a single encoder configuration outperformed deeper variants, yielding near-zero near mid-air collision rates and shorter loss-of-separation infringements than the deeper configurations. Additionally, we showed that the same configuration outperforms a baseline model designed purely with attention. Together, our results suggest that the newly formulated state representation, novel design of neural network architecture, and proposed training strategy provide an adaptable and scalable decentralized solution for aircraft separation assurance in both structured and unstructured airspaces.
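The relative polar state space mentioned in the abstract can be illustrated with a minimal sketch. This listing does not give the paper's actual feature set, so everything here is an assumption made for illustration: the `(x, y, heading)` aircraft tuple, the helper names, and the choice of range, bearing, and relative heading as features. The point is only that such features are unchanged when the whole traffic scene is rotated or translated, which is one plausible reason a relative polar encoding could transfer across intersection angles.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def relative_polar_state(own, intruder):
    """Return (range, bearing, relative heading) of intruder as seen from own.

    Hypothetical layout: each aircraft is an (x, y, heading) tuple, with
    position in metres and heading in radians measured counter-clockwise
    from the +x axis. These features are illustrative assumptions, not the
    paper's exact state definition.
    """
    ox, oy, oh = own
    ix, iy, ih = intruder
    dx, dy = ix - ox, iy - oy
    rng = math.hypot(dx, dy)                       # distance to the intruder
    bearing = wrap_angle(math.atan2(dy, dx) - oh)  # angle off ownship heading
    rel_heading = wrap_angle(ih - oh)              # intruder heading relative to own
    return rng, bearing, rel_heading


def rigid_transform(aircraft, angle, tx, ty):
    """Rotate the scene by `angle` about the origin, then translate by (tx, ty)."""
    x, y, h = aircraft
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + tx, s * x + c * y + ty, wrap_angle(h + angle))


own = (0.0, 0.0, 0.0)
intruder = (3.0, 4.0, math.pi / 2)
state = relative_polar_state(own, intruder)

# The same encounter, viewed in a rotated and shifted airspace frame.
moved_state = relative_polar_state(
    rigid_transform(own, 0.7, 100.0, -50.0),
    rigid_transform(intruder, 0.7, 100.0, -50.0),
)
```

Under these assumptions, `state` and `moved_state` agree up to floating-point error, whereas absolute Cartesian coordinates would differ for every airspace layout.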
Related papers
- Rethinking Transferable Adversarial Attacks on Point Clouds from a Compact Subspace Perspective [55.919842734983156]
CoSA is a transferable attack framework that operates within a shared low-dimensional semantic space.
CoSA consistently outperforms state-of-the-art transferable attacks.
arXiv Detail & Related papers (2026-01-30T15:48:11Z) - MFC-RFNet: A Multi-scale Guided Rectified Flow Network for Radar Sequence Prediction [7.015114232190396]
Accurate high-resolution precipitation nowcasting from radar echo sequences is crucial for disaster mitigation and economic planning.
Key difficulties include modeling complex multi-scale evolution, inter-frame feature misalignment caused by displacement, and efficiently capturing long-range context.
We present the Multi-scale Feature Communication Rectified Flow Network (MFC-RFNet), a generative framework that integrates multi-scale communication with guided feature fusion.
arXiv Detail & Related papers (2026-01-07T06:24:26Z) - CoCo-Fed: A Unified Framework for Memory- and Communication-Efficient Federated Learning at the Wireless Edge [50.42067935605982]
We propose a novel Compression and Combination-based Federated learning framework that unifies local memory efficiency and global communication reduction.
CoCo-Fed significantly outperforms state-of-the-art baselines in both memory and communication efficiency while maintaining robust convergence under non-IID settings.
arXiv Detail & Related papers (2026-01-02T03:39:50Z) - QoS-Aware Hierarchical Reinforcement Learning for Joint Link Selection and Trajectory Optimization in SAGIN-Supported UAV Mobility Management [52.15690855486153]
A space-air-ground integrated network (SAGIN) has emerged as an essential architecture for enabling ubiquitous UAV connectivity.
This paper formulates UAV mobility management in SAGIN as a constrained multiobjective joint optimization problem.
arXiv Detail & Related papers (2025-12-17T06:22:46Z) - Multi-Phase Spacecraft Trajectory Optimization via Transformer-Based Reinforcement Learning [2.034091340570242]
This work introduces a transformer-based RL framework that unifies multi-phase trajectory optimization through a single policy architecture.
Results demonstrate that the transformer-based framework not only matches analytical solutions in simple cases but also effectively learns coherent control policies across dynamically distinct regimes.
arXiv Detail & Related papers (2025-11-14T15:29:46Z) - SA-EMO: Structure-Aligned Encoder Mixture of Operators for Generalizable Full-waveform Inversion [0.0]
Full-waveform inversion can produce high-resolution models, but it remains inherently ill-posed, highly nonlinear, and computationally intensive.
We propose a Structure-Aligned-Mixture-of-Operators (SA-EMO) architecture for velocity-field inversion under unknown subsurface structures.
SA-EMO significantly outperforms traditional CNN or single-operator methods, achieving an average MAE reduction of approximately 58.443% and an improvement in boundary resolution of about 10.308%.
arXiv Detail & Related papers (2025-11-07T14:03:43Z) - Byzantine-Resilient Over-the-Air Federated Learning under Zero-Trust Architecture [68.83934802584899]
We propose a novel Byzantine-robust FL paradigm for over-the-air transmissions, referred to as federated learning with secure adaptive clustering (FedSAC).
FedSAC aims to protect a portion of the devices from attacks through zero trust architecture (ZTA) based Byzantine identification and adaptive device clustering.
Numerical results substantiate the superiority of the proposed FedSAC over existing methods in terms of both test accuracy and convergence rate.
arXiv Detail & Related papers (2025-03-24T01:56:30Z) - Forward Once for All: Structural Parameterized Adaptation for Efficient Cloud-coordinated On-device Recommendation [26.353286155116116]
Forward-OFA is a novel approach for the dynamic construction of device-specific networks.
It establishes a structure-guided mapping of real-time behaviors to the parameters of assembled networks.
Experiments on real-world datasets demonstrate the effectiveness and efficiency of Forward-OFA.
arXiv Detail & Related papers (2025-01-06T08:32:16Z) - A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies [51.7643024367548]
The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation.
This study focuses on reducing redundant computation in SDM and optimizing the model through both tuning and tuning-free methods.
arXiv Detail & Related papers (2024-05-31T21:47:05Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - An Adaptive Fuzzy Reinforcement Learning Cooperative Approach for the Autonomous Control of Flock Systems [4.961066282705832]
This work introduces an adaptive distributed robustness technique for the autonomous control of flock systems.
Its relatively flexible structure is based on online fuzzy reinforcement learning schemes which simultaneously target a number of objectives.
In addition to its resilience in the face of dynamic disturbances, the algorithm does not require more than the agent position as a feedback signal.
arXiv Detail & Related papers (2023-03-17T13:07:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.