Optimizing Electric Bus Charging Scheduling with Uncertainties Using Hierarchical Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2505.10296v1
- Date: Thu, 15 May 2025 13:44:27 GMT
- Title: Optimizing Electric Bus Charging Scheduling with Uncertainties Using Hierarchical Deep Reinforcement Learning
- Authors: Jiaju Qi, Lei Lei, Thorsteinn Jonsson, Dusit Niyato
- Abstract summary: Electric Buses (EBs) represent a significant step toward sustainable development. By utilizing Internet of Things (IoT) systems, charging stations can autonomously determine charging schedules based on real-time data. However, optimizing EB charging schedules remains a critical challenge due to uncertainties in travel time, energy consumption, and fluctuating electricity prices.
- Score: 46.15490780173541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing adoption of Electric Buses (EBs) represents a significant step toward sustainable development. By utilizing Internet of Things (IoT) systems, charging stations can autonomously determine charging schedules based on real-time data. However, optimizing EB charging schedules remains a critical challenge due to uncertainties in travel time, energy consumption, and fluctuating electricity prices. Moreover, to address real-world complexities, charging policies must make decisions efficiently across multiple time scales and remain scalable for large EB fleets. In this paper, we propose a Hierarchical Deep Reinforcement Learning (HDRL) approach that reformulates the original Markov Decision Process (MDP) into two augmented MDPs. To solve these MDPs and enable multi-timescale decision-making, we introduce a novel HDRL algorithm, namely Double Actor-Critic Multi-Agent Proximal Policy Optimization Enhancement (DAC-MAPPO-E). Scalability challenges of the Double Actor-Critic (DAC) algorithm for large-scale EB fleets are addressed through enhancements at both decision levels. At the high level, we redesign the decentralized actor network and integrate an attention mechanism to extract relevant global state information for each EB, decreasing the size of neural networks. At the low level, the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm is incorporated into the DAC framework, enabling decentralized and coordinated charging power decisions, reducing computational complexity and enhancing convergence speed. Extensive experiments with real-world data demonstrate the superior performance and scalability of DAC-MAPPO-E in optimizing EB fleet charging schedules.
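The two-timescale structure described in the abstract (one high-level decision per charging period, then one charging-power decision per time step) can be illustrated with a minimal sketch. This is not DAC-MAPPO-E: both policies below are hand-written placeholder rules rather than learned networks, and every name and constant (`run_charging_period`, `rule_target`, `rule_power`, `MAX_KW`, `STEP_HOURS`, the 200 kWh target) is a hypothetical chosen only for illustration.

```python
from math import ceil
from typing import Callable, List, Tuple

STEP_HOURS = 0.25   # assumed length of one low-level time step (hours)
MAX_KW = 100.0      # assumed charger power limit (kW)


def run_charging_period(
    soc_kwh: float,
    prices: List[float],
    choose_target: Callable[[float, List[float]], float],
    choose_power: Callable[[float, float, int, List[float]], float],
) -> Tuple[float, float]:
    """Run one charging period: one high-level decision, then one
    low-level decision per time step. Returns (final SoC, total cost)."""
    target = choose_target(soc_kwh, prices)      # high level: once per period
    cost = 0.0
    for t, price in enumerate(prices):           # low level: once per step
        power_kw = choose_power(soc_kwh, target, t, prices)
        energy_kwh = power_kw * STEP_HOURS
        soc_kwh += energy_kwh
        cost += energy_kwh * price
    return soc_kwh, cost


def rule_target(soc_kwh: float, prices: List[float]) -> float:
    """Placeholder high-level rule: aim for a fixed 200 kWh target."""
    return max(soc_kwh, 200.0)


def rule_power(soc_kwh: float, target: float, t: int, prices: List[float]) -> float:
    """Placeholder low-level rule: charge only in the cheapest remaining
    steps that are still needed to reach the target."""
    remaining = target - soc_kwh
    if remaining <= 0:
        return 0.0
    steps_needed = ceil(remaining / (MAX_KW * STEP_HOURS))
    cheapest = sorted(prices[t:])[:steps_needed]
    if prices[t] <= max(cheapest):
        return min(MAX_KW, remaining / STEP_HOURS)
    return 0.0


final_soc, total_cost = run_charging_period(
    150.0, [0.3, 0.1, 0.2, 0.1], rule_target, rule_power
)
```

In a learned version, the two rule functions would be replaced by trained high-level and low-level policies; the loop structure is the only part this sketch is meant to convey.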
Related papers
- Electric Bus Charging Schedules Relying on Real Data-Driven Targets Based on Hierarchical Deep Reinforcement Learning [46.15490780173541]
The charging scheduling problem of Electric Buses (EBs) is investigated based on Deep Reinforcement Learning (DRL). A high-level agent learns an effective policy for prescribing the charging targets for every charging period, while the low-level agent learns an optimal policy for setting the charging power of every time step within a single charging period. It is proved that the flat policy constructed by superimposing the optimal high-level policy and the optimal low-level policy performs as well as the optimal policy of the original MDP.
arXiv Detail & Related papers (2025-05-15T13:13:41Z) - Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach [49.75068823009836]
6G Internet of Things (IoT) networks face challenges in remote areas and disaster scenarios where ground infrastructure is unavailable. This paper proposes a novel unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system enhanced by directional antennas to provide both computational and energy support for ground edge terminals.
arXiv Detail & Related papers (2025-05-06T06:46:19Z) - Multi-agent reinforcement learning strategy to maximize the lifetime of Wireless Rechargeable [0.32634122554913997]
The thesis proposes a generalized charging framework for multiple mobile chargers (MCs) to maximize the network lifetime.
A multi-point charging model is leveraged to enhance charging efficiency, where an MC can charge multiple sensors simultaneously at each charging location.
The proposal allows reinforcement algorithms to be applied to different networks without requiring extensive retraining.
arXiv Detail & Related papers (2024-11-21T02:18:34Z) - DClEVerNet: Deep Combinatorial Learning for Efficient EV Charging Scheduling in Large-scale Networked Facilities [5.78463306498655]
Electric vehicles (EVs) might stress distribution networks significantly, degrading their performance and jeopardizing stability.
Modern power grids require coordinated or "smart" charging strategies capable of optimizing EV charging scheduling in a scalable and efficient fashion.
We formulate a time-coupled binary optimization problem that maximizes EV users' total welfare gain while accounting for the network's available power capacity and stations' occupancy limits.
arXiv Detail & Related papers (2023-05-18T14:03:47Z) - Optimal Scheduling in IoT-Driven Smart Isolated Microgrids Based on Deep Reinforcement Learning [10.924928763380624]
We investigate the scheduling issue of diesel generators (DGs) in an Internet of Things-driven microgrid (MG) by deep reinforcement learning (DRL).
The DRL agent learns an optimal policy from historical renewable and load data of previous days.
The goal is to reduce operating cost on the premise of ensuring supply-demand balance.
arXiv Detail & Related papers (2023-04-28T23:52:50Z) - Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z) - Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
The combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z) - Learning Centric Power Allocation for Edge Intelligence [84.16832516799289]
Edge intelligence has been proposed, which collects distributed data and performs machine learning at the edge.
This paper proposes a learning centric power allocation (LCPA) method, which allocates radio resources based on an empirical classification error model.
Experimental results show that the proposed LCPA algorithm significantly outperforms other power allocation algorithms.
arXiv Detail & Related papers (2020-07-21T07:02:07Z) - Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all Internet of Things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL to train the policy network and find the optimal offloading policy.
arXiv Detail & Related papers (2020-01-24T23:01:15Z)