Related papers: Characterizing MARL for Energy Control: A Multi-KPI Benchmark on the CityLearn Environment

URL: http://arxiv.org/abs/2602.19223v1
Date: Sun, 22 Feb 2026 15:14:45 GMT
Title: Characterizing MARL for Energy Control: A Multi-KPI Benchmark on the CityLearn Environment
Authors: Aymen Khouja, Imen Jendoubi, Oumayma Mahjoub, Oussama Mahfoudhi, Claude Formanek, Siddarth Singh, Ruan De Kock,
Abstract summary: Multi-Agent Reinforcement Learning (MARL) is a promising solution to address scalability and coordination concerns. This paper addresses the imperative need for comprehensive and reliable benchmarking of MARL algorithms on energy management tasks. CityLearn is used as a case study environment because it simulates urban energy systems, incorporates multiple storage systems, and utilizes renewable energy sources.
Score: 0.7371081631199642
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The optimization of urban energy systems is crucial for the advancement of sustainable and resilient smart cities, which are becoming increasingly complex with multiple decision-making units. To address scalability and coordination concerns, Multi-Agent Reinforcement Learning (MARL) is a promising solution. This paper addresses the imperative need for comprehensive and reliable benchmarking of MARL algorithms on energy management tasks. CityLearn is used as a case study environment because it realistically simulates urban energy systems, incorporates multiple storage systems, and utilizes renewable energy sources. By doing so, our work sets a new standard for evaluation, conducting a comparative study across multiple key performance indicators (KPIs). This approach illuminates the key strengths and weaknesses of various algorithms, moving beyond traditional KPI averaging, which often masks critical insights. Our experiments utilize widely accepted baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), and encompass diverse training schemes including Decentralized Training with Decentralized Execution (DTDE) and Centralized Training with Decentralized Execution (CTDE) approaches and different neural network architectures. Our work also proposes novel KPIs that tackle real-world implementation challenges such as individual building contribution and battery storage lifetime. Our findings show that DTDE consistently outperforms CTDE in both average and worst-case performance. Additionally, temporal dependency learning improved control on memory-dependent KPIs such as ramping and battery usage, contributing to more sustainable battery operation. Results also reveal robustness to agent or resource removal, highlighting both the resilience and decentralizability of the learned policies.

Related papers

Independent policy gradient-based reinforcement learning for economic and reliable energy management of multi-microgrid systems [2.8374986119002803]
This study investigates an economic and reliable energy management problem in multi-microgrid systems (MMSs) under a distributed scheme. We introduce the mean and variance of the exchange power between the MMS and the main grid as indicators for the economic performance and reliability of the system. We propose a fully distributed independent policy algorithm, with rigorous convergence analysis, for scenarios with known parameters.
arXiv Detail & Related papers (2025-11-26T02:11:22Z)
Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach [50.52139512096988]
6G Internet of Things (IoT) networks face challenges in remote areas and disaster scenarios where ground infrastructure is unavailable. This paper proposes a novel unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system enhanced by directional antennas to provide both computational and energy support for ground edge terminals.
arXiv Detail & Related papers (2025-05-06T06:46:19Z)
Generalising Battery Control in Net-Zero Buildings via Personalised Federated RL [5.195669033269619]
This work studies the challenge of optimal energy management in building-based microgrids through a collaborative and privacy-preserving framework. We evaluate two common RL algorithms (PPO and TRPO) in different collaborative setups to manage distributed energy resources. Our approach emphasizes reducing energy costs and carbon emissions while ensuring privacy.
arXiv Detail & Related papers (2024-12-30T13:38:31Z)
Deep Reinforcement Learning for Community Battery Scheduling under Uncertainties of Load, PV Generation, and Energy Prices [5.694872363688119]
This paper presents a deep reinforcement learning (RL) strategy to schedule a community battery system in the presence of uncertainties. We position the community battery to play a versatile role in integrating local PV energy, reducing peak load, and exploiting energy price fluctuations for arbitrage.
arXiv Detail & Related papers (2023-12-04T13:45:17Z)
Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [55.08287089554127]
Open Radio Access Network systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability. We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments. We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
Optimal Planning of Hybrid Energy Storage Systems using Curtailed Renewable Energy through Deep Reinforcement Learning [0.0]
We propose a sophisticated deep reinforcement learning (DRL) methodology with a policy-based algorithm to plan energy storage systems (ESS). A quantitative performance comparison proved that the DRL agent outperforms the scenario-based optimization (SO) algorithm. The corresponding results confirmed that the DRL agent learns to act as a human expert would, suggesting reliable application of the proposed methodology.
arXiv Detail & Related papers (2022-12-12T02:24:50Z)
An Energy and Carbon Footprint Analysis of Distributed and Federated Learning [42.37180749113699]
Classical and centralized Artificial Intelligence (AI) methods require moving data from producers (sensors, machines) to energy hungry data centers. Emerging alternatives to mitigate such high energy costs propose to efficiently distribute, or federate, the learning tasks across devices. This paper proposes a novel framework for the analysis of energy and carbon footprints in distributed and federated learning.
arXiv Detail & Related papers (2022-06-21T13:28:49Z)
Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
The combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency. In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z)
Energy-Efficient Multi-Orchestrator Mobile Edge Learning [54.28419430315478]
Mobile Edge Learning (MEL) is a collaborative learning paradigm that features distributed training of Machine Learning (ML) models over edge devices. In MEL, possible coexistence of multiple learning tasks with different datasets may arise. We propose lightweight algorithms that can achieve near-optimal performance and facilitate the trade-offs between energy consumption, accuracy, and solution complexity.
arXiv Detail & Related papers (2021-09-02T07:37:10Z)
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning [83.66080019570461]
We propose two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty. We show that these metrics have higher correlations with normalized task solvability scores than a variety of alternatives. These metrics can also be used for fast and compute-efficient optimizations of key design parameters.
arXiv Detail & Related papers (2021-03-23T17:49:50Z)
Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems [87.4519172058185]
An effective energy dispatch mechanism for self-powered wireless networks with edge computing capabilities is studied. A novel multi-agent meta-reinforcement learning (MAMRL) framework is proposed to solve the formulated problem. Experimental results show that the proposed MAMRL model can reduce non-renewable energy usage by up to 11% and the energy cost by 22.4%.
arXiv Detail & Related papers (2020-02-20T04:58:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.