Safe and Sustainable Electric Bus Charging Scheduling with Constrained Hierarchical DRL
- URL: http://arxiv.org/abs/2512.03059v1
- Date: Tue, 25 Nov 2025 20:00:02 GMT
- Title: Safe and Sustainable Electric Bus Charging Scheduling with Constrained Hierarchical DRL
- Authors: Jiaju Qi, Lei Lei, Thorsteinn Jonsson, Dusit Niyato
- Abstract summary: The integration of Electric Buses (EBs) with renewable energy sources such as photovoltaic (PV) panels is a promising approach to promote sustainable and low-carbon public transportation. We propose a safe Hierarchical Deep Reinforcement Learning (HDRL) framework for solving the EB Charging Scheduling Problem (EBCSP) under multi-source uncertainties. We develop a novel HDRL algorithm, namely Double Actor-Critic Multi-Agent Proximal Policy Optimization Lagrangian (DAC-MAPPO-Lagrangian).
- Score: 43.715336081857394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of Electric Buses (EBs) with renewable energy sources such as photovoltaic (PV) panels is a promising approach to promote sustainable and low-carbon public transportation. However, optimizing EB charging schedules to minimize operational costs while ensuring safe operation without battery depletion remains challenging - especially under real-world conditions, where uncertainties in PV generation, dynamic electricity prices, variable travel times, and limited charging infrastructure must be accounted for. In this paper, we propose a safe Hierarchical Deep Reinforcement Learning (HDRL) framework for solving the EB Charging Scheduling Problem (EBCSP) under multi-source uncertainties. We formulate the problem as a Constrained Markov Decision Process (CMDP) with options to enable temporally abstract decision-making. We develop a novel HDRL algorithm, namely Double Actor-Critic Multi-Agent Proximal Policy Optimization Lagrangian (DAC-MAPPO-Lagrangian), which integrates Lagrangian relaxation into the Double Actor-Critic (DAC) framework. At the high level, we adopt a centralized PPO-Lagrangian algorithm to learn safe charger allocation policies. At the low level, we incorporate MAPPO-Lagrangian to learn decentralized charging power decisions under the Centralized Training and Decentralized Execution (CTDE) paradigm. Extensive experiments with real-world data demonstrate that the proposed approach outperforms existing baselines in both cost minimization and safety compliance, while maintaining fast convergence speed.
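As a rough illustration of the Lagrangian-relaxation idea behind DAC-MAPPO-Lagrangian, the sketch below combines a clipped PPO surrogate with a multiplier-weighted cost surrogate, plus a dual ascent step on the multiplier. The function names, scalar (single-sample) setup, and default coefficients are illustrative assumptions, not the authors' implementation.

```python
def ppo_lagrangian_objective(ratio, reward_adv, cost_adv, lam, clip_eps=0.2):
    """Per-sample clipped PPO surrogate minus a lambda-weighted cost surrogate.

    ratio: new-policy / old-policy probability ratio for the sampled action.
    reward_adv / cost_adv: advantage estimates for reward and safety cost.
    lam: current Lagrange multiplier (>= 0). Returns a quantity to maximize.
    """
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
    reward_term = min(ratio * reward_adv, clipped * reward_adv)
    # Pessimistic (max) clipping for the cost keeps the constraint conservative.
    cost_term = max(ratio * cost_adv, clipped * cost_adv)
    return reward_term - lam * cost_term

def dual_update(lam, avg_episode_cost, cost_limit, lr=0.05):
    """Projected gradient ascent on lambda: grows while the safety budget is exceeded."""
    return max(0.0, lam + lr * (avg_episode_cost - cost_limit))
```

In a full training loop the multiplier is updated between policy epochs, so charging policies that deplete batteries (high cost) are penalized more heavily over time.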
Related papers
- Resource-constrained Project Scheduling with Time-of-Use Energy Tariffs and Machine States: A Logic-based Benders Decomposition Approach [0.0]
We study the Resource-Constrained Project Scheduling Problem (RCPSP) with time-of-use (TOU) energy tariffs and machine states. We propose two novel approaches to solve it: a monolithic Constraint Programming (CP) approach and a Logic-Based Benders Decomposition (LBBD) approach.
arXiv Detail & Related papers (2026-01-10T11:47:56Z)
- Cost Minimization for Space-Air-Ground Integrated Multi-Access Edge Computing Systems [60.586531406445744]
Space-air-ground integrated multi-access edge computing (SAGIN-MEC) provides a promising solution for the rapidly developing low-altitude economy. We present a SAGIN-MEC architecture that enables coordination between user devices (UDs), uncrewed aerial vehicles (UAVs), and satellites.
arXiv Detail & Related papers (2025-10-24T15:03:07Z)
- Optimizing Electric Bus Charging Scheduling with Uncertainties Using Hierarchical Deep Reinforcement Learning [46.15490780173541]
Electric Buses (EBs) represent a significant step toward sustainable development. By utilizing Internet of Things (IoT) systems, charging stations can autonomously determine charging schedules based on real-time data. However, optimizing EB charging schedules remains a critical challenge due to uncertainties in travel time, energy consumption, and fluctuating electricity prices.
arXiv Detail & Related papers (2025-05-15T13:44:27Z)
- Electric Bus Charging Schedules Relying on Real Data-Driven Targets Based on Hierarchical Deep Reinforcement Learning [46.15490780173541]
The charging scheduling problem of Electric Buses (EBs) is investigated based on Deep Reinforcement Learning (DRL). A high-level agent learns an effective policy for prescribing the charging targets for every charging period, while the low-level agent learns an optimal policy for setting the charging power of every time step within a single charging period. It is proved that the flat policy constructed by superimposing the optimal high-level policy and the optimal low-level policy performs as well as the optimal policy of the original MDP.
arXiv Detail & Related papers (2025-05-15T13:13:41Z)
- Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach [50.52139512096988]
6G Internet of Things (IoT) networks face challenges in remote areas and disaster scenarios where ground infrastructure is unavailable. This paper proposes a novel unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system enhanced by directional antennas to provide both computational and energy support for ground edge terminals.
arXiv Detail & Related papers (2025-05-06T06:46:19Z)
- Reinforcement Learning-based Approach for Vehicle-to-Building Charging with Heterogeneous Agents and Long Term Rewards [3.867907469895697]
We introduce a novel RL framework that combines the Deep Deterministic Policy Gradient approach with action masking and efficient MILP-driven policy guidance. Our approach balances the exploration of continuous action spaces to meet user charging demands. Our results show that the proposed approach is one of the first scalable and general approaches to solving the V2B energy management challenge.
arXiv Detail & Related papers (2025-02-24T19:24:41Z)
- Centralized vs. Decentralized Multi-Agent Reinforcement Learning for Enhanced Control of Electric Vehicle Charging Networks [1.9188272016043582]
We introduce a novel approach for distributed and cooperative charging strategy using a Multi-Agent Reinforcement Learning (MARL) framework.
Our method is built upon the Deep Deterministic Policy Gradient (DDPG) algorithm for a group of EVs in a residential community.
Our results indicate that, despite higher policy variances and training complexity, the CTDE-DDPG framework significantly improves charging efficiency by reducing total variation by approximately 36% and charging cost by around 9.1% on average.
arXiv Detail & Related papers (2024-04-18T21:50:03Z)
- A Deep Reinforcement Learning-Based Charging Scheduling Approach with Augmented Lagrangian for Electric Vehicle [2.686271754751717]
This paper formulates the EV charging scheduling problem as a constrained Markov decision process (CMDP).
A novel safe off-policy reinforcement learning (RL) approach is proposed in this paper to solve the CMDP.
Comprehensive numerical experiments with real-world electricity price demonstrate that our proposed algorithm can achieve high solution optimality and constraints compliance.
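For context on the augmented-Lagrangian device this entry mentions, a minimal sketch of the standard Powell-Hestenes-Rockafellar (PHR) penalty for an inequality constraint (cost &lt;= limit) follows. The function names and the scalar formulation are illustrative assumptions, not the paper's actual formulation.

```python
def al_penalty(cost, limit, lam, rho=1.0):
    """PHR augmented-Lagrangian penalty for the constraint cost <= limit."""
    g = cost - limit  # violation; positive means the safety budget is exceeded
    if lam + rho * g >= 0.0:
        # Active region: linear multiplier term plus quadratic augmentation.
        return lam * g + 0.5 * rho * g * g
    # Inactive region: the penalty flattens out, keeping the objective smooth.
    return -lam * lam / (2.0 * rho)

def al_multiplier_update(lam, cost, limit, rho=1.0):
    """Multiplier step, projected to stay non-negative."""
    return max(0.0, lam + rho * (cost - limit))
```

Compared with a plain Lagrangian term, the quadratic augmentation penalizes constraint violations more aggressively, which typically stabilizes dual updates during training.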
arXiv Detail & Related papers (2022-09-20T14:56:51Z)
- Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach [82.6692222294594]
We study a risk-aware energy scheduling problem for a microgrid-powered MEC network.
We derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks.
arXiv Detail & Related papers (2020-02-21T02:14:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.