Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees
- URL: http://arxiv.org/abs/2507.08653v1
- Date: Fri, 11 Jul 2025 14:57:37 GMT
- Title: Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees
- Authors: Berire Gunes Reyhan, Sinem Coleri
- Abstract summary: This paper presents a novel optimization theory-based safe deep reinforcement learning (DRL) framework for ultra-reliable Wireless Networked Control Systems (WNCSs). The framework minimizes power consumption under key constraints, including Peak Age of Information (PAoI) violation probability, transmit power, and schedulability in the finite blocklength regime. The proposed framework outperforms rule-based and other optimization theory based DRL benchmarks, achieving faster convergence, higher rewards, and greater stability.
- Score: 10.177917426690701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Wireless Networked Control Systems (WNCSs), control and communication systems must be co-designed due to their strong interdependence. This paper presents a novel optimization theory-based safe deep reinforcement learning (DRL) framework for ultra-reliable WNCSs, ensuring constraint satisfaction while optimizing performance, for the first time in the literature. The approach minimizes power consumption under key constraints, including Peak Age of Information (PAoI) violation probability, transmit power, and schedulability in the finite blocklength regime. PAoI violation probability is uniquely derived by combining stochastic maximum allowable transfer interval (MATI) and maximum allowable packet delay (MAD) constraints in a multi-sensor network. The framework consists of two stages: optimization theory and safe DRL. The first stage derives optimality conditions to establish mathematical relationships among variables, simplifying and decomposing the problem. The second stage employs a safe DRL model where a teacher-student framework guides the DRL agent (student). The control mechanism (teacher) evaluates compliance with system constraints and suggests the nearest feasible action when needed. Extensive simulations show that the proposed framework outperforms rule-based and other optimization theory based DRL benchmarks, achieving faster convergence, higher rewards, and greater stability.
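The teacher-student safety mechanism can be pictured with a short sketch: the student (DRL agent) proposes a resource-allocation action, the teacher checks it against the system constraints, and a violating action is replaced by the nearest feasible one. The sketch below is a minimal illustration under assumed toy constraints (per-sensor power caps and a total power budget); the helper names and the clip-and-rescale correction are placeholders, not the paper's actual PAoI-violation, transmit-power, and schedulability constraints.
```python
import numpy as np

# Hedged sketch of the teacher-student safe-action step. The constraint set
# here (per-sensor power caps plus a total power budget) and the correction
# rule are illustrative assumptions, not the paper's exact formulation.

P_MAX = 1.0      # assumed per-sensor transmit-power cap
P_BUDGET = 3.0   # assumed total power budget across all sensors


def is_feasible(action: np.ndarray) -> bool:
    """Teacher check: does the proposed power allocation satisfy the toy constraints?"""
    return bool(
        np.all(action >= 0.0)
        and np.all(action <= P_MAX)
        and action.sum() <= P_BUDGET
    )


def nearest_feasible_action(action: np.ndarray) -> np.ndarray:
    """Teacher correction: clip to the per-sensor caps, then rescale to meet the budget.
    This clip-and-rescale rule is a simple stand-in for projecting onto the feasible set."""
    safe = np.clip(action, 0.0, P_MAX)
    total = safe.sum()
    if total > P_BUDGET:
        safe = safe * (P_BUDGET / total)
    return safe


def safe_step(student_action: np.ndarray) -> np.ndarray:
    """Execute the student's action if the teacher deems it compliant;
    otherwise substitute the corrected (feasible) action."""
    if is_feasible(student_action):
        return student_action
    return nearest_feasible_action(student_action)


# Example: a proposed allocation that exceeds the total power budget
proposed = np.array([0.9, 0.8, 0.9, 0.7])   # sums to 3.3 > P_BUDGET
executed = safe_step(proposed)              # rescaled to sum to 3.0
```
In the paper's setting, the teacher would instead search for the feasible action closest to the student's proposal under the full constraint set derived in the optimization-theory stage; only the overall execute-or-correct control flow is meant to carry over from this sketch.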
Related papers
- A Hybrid Reinforcement Learning Framework for Hard Latency Constrained Resource Scheduling [7.586600116278698]
We propose a novel hybrid reinforcement learning framework for resource scheduling with hard latency constraints (HRL-RSHLC).
We show that HRL-RSHLC achieves superior performance and faster convergence than baseline algorithms.
arXiv Detail & Related papers (2025-03-30T09:39:13Z)
- Digital Twin-Assisted Federated Learning with Blockchain in Multi-tier Computing Systems [67.14406100332671]
In Industry 4.0 systems, resource-constrained edge devices engage in frequent data interactions.
This paper proposes a digital twin (DT)-assisted federated learning (FL) scheme.
The efficacy of our proposed cooperative interference-based FL process has been verified through numerical analysis.
arXiv Detail & Related papers (2024-11-04T17:48:02Z)
- Event-Triggered Reinforcement Learning Based Joint Resource Allocation for Ultra-Reliable Low-Latency V2X Communications [10.914558012458425]
6G-enabled vehicular networks face the challenge of ensuring ultra-reliable low-latency communication (URLLC) to deliver safety-critical information in a timely manner.
Existing resource allocation schemes for vehicle-to-everything (V2X) communication systems rely on traditional decoding-based algorithms.
arXiv Detail & Related papers (2024-07-18T23:55:07Z)
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, while a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can serve users on both sides of the surface.
However, deploying STAR-RISs indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a dualization perspective that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z)
- Self-Supervised Learning for Large-Scale Preventive Security Constrained DC Optimal Power Flow [20.078717680640214]
Security-Constrained Optimal Power Flow (SCOPF) plays a crucial role in power grid stability but becomes increasingly complex as systems grow.
This paper introduces PDL-SCOPF, a self-supervised end-to-end primal-dual learning framework for producing near-optimal solutions to large-scale SCOPF problems.
arXiv Detail & Related papers (2023-11-29T20:36:35Z)
- Optimization Theory Based Deep Reinforcement Learning for Resource Allocation in Ultra-Reliable Wireless Networked Control Systems [10.177917426690701]
This paper introduces a novel optimization theory based deep reinforcement learning (DRL) framework for the joint design of controller and communication systems.
The objective of minimum power consumption is targeted while satisfying the schedulability and rate constraints of the communication system.
arXiv Detail & Related papers (2023-11-28T15:49:29Z)
- Learning Predictive Safety Filter via Decomposition of Robust Invariant Set [6.94348936509225]
This paper combines the advantages of both robust model predictive control (RMPC) and reinforcement learning (RL) to synthesize safety filters for nonlinear systems.
We propose a policy approach for robust reach problems and establish its complexity.
arXiv Detail & Related papers (2023-11-12T08:11:28Z)
- False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)