QoS-Aware Hierarchical Reinforcement Learning for Joint Link Selection and Trajectory Optimization in SAGIN-Supported UAV Mobility Management
- URL: http://arxiv.org/abs/2512.15119v1
- Date: Wed, 17 Dec 2025 06:22:46 GMT
- Title: QoS-Aware Hierarchical Reinforcement Learning for Joint Link Selection and Trajectory Optimization in SAGIN-Supported UAV Mobility Management
- Authors: Jiayang Wan, Ke He, Yafei Wang, Fan Liu, Wenjin Wang, Shi Jin
- Abstract summary: A space-air-ground integrated network (SAGIN) has emerged as an essential architecture for enabling ubiquitous UAV connectivity. This paper formulates UAV mobility management in SAGIN as a constrained multi-objective joint optimization problem.
- Score: 52.15690855486153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the significant variations in unmanned aerial vehicle (UAV) altitude and horizontal mobility, it becomes difficult for any single network to ensure continuous and reliable three-dimensional coverage. To that end, the space-air-ground integrated network (SAGIN) has emerged as an essential architecture for enabling ubiquitous UAV connectivity. To address the pronounced disparities in coverage and signal characteristics across heterogeneous networks, this paper formulates UAV mobility management in SAGIN as a constrained multi-objective joint optimization problem. The formulation couples discrete link selection with continuous trajectory optimization. Building on this, we propose a two-level multi-agent hierarchical deep reinforcement learning (HDRL) framework that decomposes the problem into two alternately solvable subproblems. To map complex link selection decisions into a compact discrete action space, we conceive a double deep Q-network (DDQN) algorithm at the top level, which achieves stable and high-quality policy learning through double Q-value estimation. To handle the continuous trajectory action space while satisfying quality of service (QoS) constraints, we integrate the maximum-entropy mechanism of the soft actor-critic (SAC) and employ a Lagrangian-based constrained SAC (CSAC) algorithm at the lower level that dynamically adjusts the Lagrange multipliers to balance constraint satisfaction and policy optimization. Moreover, the proposed algorithm can be extended to multi-UAV scenarios under the centralized training and decentralized execution (CTDE) paradigm, which enables more generalizable policies. Simulation results demonstrate that the proposed scheme substantially outperforms existing benchmarks in throughput, link switching frequency and QoS satisfaction.
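The two update rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the scalar "constraint cost must stay at or below a limit" form of the QoS constraint, and the fixed step size are all assumptions made for illustration. The first function shows double Q-value estimation in its usual form (the online network selects the next action, the target network evaluates it); the second shows a projected dual-ascent step of the kind a Lagrangian-based constrained method typically uses to adjust a multiplier.

```python
from typing import Sequence


def double_q_target(reward: float, gamma: float, done: bool,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float]) -> float:
    """Double-DQN style target for one transition (hypothetical helper).

    The online network's Q-values choose the next action; the target
    network's Q-values score that action.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def update_multiplier(lam: float, constraint_cost: float,
                      limit: float, step_size: float) -> float:
    """One projected dual-ascent step on a Lagrange multiplier.

    Assumed constraint form: constraint_cost <= limit. The multiplier
    grows while the constraint is violated, shrinks otherwise, and is
    clipped at zero.
    """
    return max(0.0, lam + step_size * (constraint_cost - limit))


def lagrangian_objective(reward_term: float, constraint_cost: float,
                         limit: float, lam: float) -> float:
    """Penalized objective the lower-level policy would maximize."""
    return reward_term - lam * (constraint_cost - limit)
```

In this sketch a larger multiplier makes constraint violations more expensive for the policy, which is one way to read the abstract's "balance constraint satisfaction and policy optimization".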
Related papers
- Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks [77.17664010626726]
We focus on routing with multiple UAV clusters in low-altitude intelligent networks (LAINs). To minimize the damage caused by potential threats, we present the zero-trust architecture with the software-defined perimeter and blockchain techniques. We show that the proposed framework reduces the average E2E delay by 59% and improves the TSR by 29% on average compared to benchmarks.
arXiv Detail & Related papers (2026-02-27T04:30:35Z)
- OmniVL-Guard: Towards Unified Vision-Language Forgery Detection and Grounding via Balanced RL [63.388513841293616]
Existing forgery detection methods fail to handle the interleaved text, images, and videos prevalent in real-world misinformation. To bridge this gap, this paper aims to develop a unified framework for omnibus vision-language forgery detection and grounding. We propose OmniVL-Guard, a balanced reinforcement learning framework for omnibus vision-language forgery detection and grounding.
arXiv Detail & Related papers (2026-02-11T09:41:36Z)
- Hierarchical Task Offloading and Trajectory Optimization in Low-Altitude Intelligent Networks Via Auction and Diffusion-based MARL [37.79695337425523]
Low-altitude intelligent networks (LAINs) can support mission-critical applications such as disaster response, environmental monitoring, and real-time sensing. These systems face key challenges, including energy-constrained UAVs, task arrivals, and heterogeneous computing resources. We propose an integrated air-ground collaborative network and formulate a time-dependent integer nonlinear programming problem that jointly optimizes UAV trajectory planning and task offloading decisions.
arXiv Detail & Related papers (2025-12-05T08:14:45Z)
- Backscatter Device-aided Integrated Sensing and Communication: A Pareto Optimization Framework [59.30060797118097]
Integrated sensing and communication (ISAC) systems potentially encounter significant performance degradation in densely obstructed urban non-line-of-sight scenarios. This paper proposes a backscatter device (BD)-assisted ISAC system, which leverages passive BDs naturally distributed in the environment for performance enhancement.
arXiv Detail & Related papers (2025-07-12T17:11:06Z)
- Hierarchical Task Offloading for UAV-Assisted Vehicular Edge Computing via Deep Reinforcement Learning [11.695622067301128]
We propose a dual-layer UAV-assisted edge computing architecture based on partial offloading. The proposed architecture enables efficient integration and coordination of heterogeneous resources. We show that the proposed approach outperforms several baselines in task completion rate, system efficiency, and convergence speed.
arXiv Detail & Related papers (2025-07-08T07:10:52Z)
- Generative AI-Enhanced Cooperative MEC of UAVs and Ground Stations for Unmanned Surface Vehicles [36.3157805511305]
Unmanned aerial vehicles (UAVs) offer low-cost, flexible aerial services. Ground stations (GSs) can provide powerful support, and the two can cooperate to help unmanned surface vehicles (USVs) in complex scenarios. We propose a cooperative UAV and GS based robust multi-access edge computing framework to assist USVs in completing computational tasks.
arXiv Detail & Related papers (2025-02-12T04:42:59Z)
- CVaR-Based Variational Quantum Optimization for User Association in Handoff-Aware Vehicular Networks [23.140655547353994]
We present a novel Conditional Value at Risk (CVaR)-based Variational Quantum Eigensolver (VQE) framework to address generalized assignment problems (GAP) in vehicular networks (VNets). Our approach leverages a hybrid quantum-classical structure, integrating a tailored cost function that balances both objective and constraint-specific penalties to improve solution quality and stability. We apply this framework to a user-association problem in VNets, where our method achieves a 23.5% improvement compared to the deep neural network (DNN) approach.
arXiv Detail & Related papers (2025-01-14T20:21:06Z)
- Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing. We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z)
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)
Deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
- Deep-Reinforcement-Learning-Based AoI-Aware Resource Allocation for RIS-Aided IoV Networks [43.443526528832145]
We propose a RIS-assisted internet of vehicles (IoV) network, considering the vehicle-to-everything (V2X) communication method. In order to improve the timeliness of vehicle-to-infrastructure (V2I) links and the stability of vehicle-to-vehicle (V2V) links, we introduce the age of information (AoI) model and the payload transmission probability model.
arXiv Detail & Related papers (2024-06-17T06:16:07Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- Goal Kernel Planning: Linearly-Solvable Non-Markovian Policies for Logical Tasks with Goal-Conditioned Options [54.40780660868349]
We introduce a compositional framework called Linearly-Solvable Goal Kernel Dynamic Programming (LS-GKDP). LS-GKDP combines the Linearly-Solvable Markov Decision Process (LMDP) formalism with the Options Framework of Reinforcement Learning. We show how an LMDP with a goal kernel enables the efficient optimization of meta-policies in a lower-dimensional subspace defined by the task grounding.
arXiv Detail & Related papers (2020-07-06T05:13:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.