PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in
Power Systems
- URL: http://arxiv.org/abs/2111.05969v1
- Date: Wed, 10 Nov 2021 22:22:07 GMT
- Title: PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in
Power Systems
- Authors: David Biagioni, Xiangyu Zhang, Dylan Wald, Deepthi Vaidhynathan, Rohit
Chintala, Jennifer King, Ahmed S. Zamzam
- Abstract summary: We present the PowerGridworld software package to provide users with a lightweight, modular, and customizable framework for creating power-systems-focused, multi-agent Gym environments.
- Score: 6.782988908306483
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the PowerGridworld software package to provide users with a
lightweight, modular, and customizable framework for creating
power-systems-focused, multi-agent Gym environments that readily integrate with
existing training frameworks for reinforcement learning (RL). Although many
frameworks exist for training multi-agent RL (MARL) policies, none enables rapid
prototyping and development of the environments themselves, especially in the context of
heterogeneous (composite, multi-device) power systems where power flow
solutions are required to define grid-level variables and costs. PowerGridworld
is an open-source software package that helps to fill this gap. To highlight
PowerGridworld's key features, we present two case studies and demonstrate
learning MARL policies using both OpenAI's multi-agent deep deterministic
policy gradient (MADDPG) and RLlib's proximal policy optimization (PPO)
algorithms. In both cases, at least some subset of agents incorporates elements
of the power flow solution at each time step as part of their reward (negative
cost) structures.
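The abstract's core idea, multi-agent environments that expose per-agent observations and fold a shared grid-level cost into each agent's reward, can be illustrated with a minimal sketch. The class, device names, and quadratic cost model below are hypothetical stand-ins for illustration, not PowerGridworld's actual API; the dict-keyed interface follows the common RLlib MultiAgentEnv convention.

```python
class MultiAgentGridEnv:
    """Illustrative multi-agent environment in the RLlib MultiAgentEnv style:
    reset() and step() exchange dicts keyed by agent id. The devices and the
    cost model are hypothetical, not PowerGridworld's actual API."""

    def __init__(self, agent_ids=("battery", "hvac", "pv"), horizon=24):
        self.agent_ids = list(agent_ids)
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        # Each agent observes only its own local state (here, the time step).
        return {aid: [0.0] for aid in self.agent_ids}

    def step(self, actions):
        # actions: dict mapping agent id -> power setpoint in kW.
        self.t += 1
        net_power = sum(actions.values())
        # A shared "grid-level" cost standing in for a power flow solution;
        # as in the paper's case studies, each agent's reward is the negative
        # of a local cost plus a system-level cost term.
        grid_cost = 0.1 * net_power ** 2
        obs = {aid: [float(self.t)] for aid in self.agent_ids}
        rewards = {aid: -(abs(actions[aid]) + grid_cost)
                   for aid in self.agent_ids}
        done = self.t >= self.horizon
        dones = {aid: done for aid in self.agent_ids}
        dones["__all__"] = done
        return obs, rewards, dones, {}

env = MultiAgentGridEnv()
obs = env.reset()
obs, rewards, dones, info = env.step({"battery": -2.0, "hvac": 1.5, "pv": -3.0})
```

Because the grid cost depends on the aggregate setpoint, each agent's reward is coupled to the others' actions, which is what makes the learning problem genuinely multi-agent rather than a set of independent single-agent tasks.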
Related papers
- Heterogeneous Multi-Agent Proximal Policy Optimization for Power Distribution System Restoration [4.46185759083096]
This paper applies a Heterogeneous-Agent Reinforcement Learning framework to enable coordinated restoration across interconnected microgrids.
Results demonstrate that incorporating microgrid-level heterogeneity within the HARL framework yields a scalable, stable, and constraint-aware solution for complex PDS restoration.
arXiv Detail & Related papers (2025-11-18T18:23:35Z)
- Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs [20.084201133669534]
Multi-agent systems (MAS) and reinforcement learning (RL) are widely used to enhance the agentic capabilities of large language models (LLMs).
Applying on-policy RL to MAS remains underexplored and presents unique challenges.
We propose AT-GRPO, which includes (i) an agent- and turn-wise grouped RL algorithm tailored to MAS and (ii) a training system that supports both single- and multi-policy regimes.
arXiv Detail & Related papers (2025-10-13T06:55:09Z)
- Multi-Agent Tool-Integrated Policy Optimization [67.12841355267678]
Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks.
Existing implementations typically rely on a single agent, but they suffer from limited context length and noisy tool responses.
No existing methods support effective reinforcement learning post-training of tool-integrated multi-agent frameworks.
arXiv Detail & Related papers (2025-10-06T10:44:04Z)
- GEM: A Gym for Agentic LLMs [88.36970707762424]
General Experience Maker (GEM) is an open-source environment simulator designed for the age of large language models (LLMs).
GEM provides a standardized framework for the environment-agent interface, including asynchronous vectorized execution for high throughput.
We conduct apples-to-apples benchmarking of PPO, GRPO, and REINFORCE in both single- and multi-turn settings using GEM to shed light on the algorithmic designs.
arXiv Detail & Related papers (2025-10-01T15:55:57Z)
- Collab-Solver: Collaborative Solving Policy Learning for Mixed-Integer Linear Programming [57.44900640134789]
We propose a novel multi-agent-based policy learning framework for MILP solving.
Specifically, we formulate the collaboration of cut selection and branching in MILP solving as a Stackelberg game.
The jointly learned policy significantly improves the solving performance on both synthetic and large-scale real-world MILP datasets.
arXiv Detail & Related papers (2025-08-05T03:16:04Z)
- SafePowerGraph-LLM: Novel Power Grid Graph Embedding and Optimization with Large Language Models [12.312620964361844]
This letter introduces SafePowerGraph-LLM, the first framework explicitly designed for solving Optimal Power Flow (OPF) problems using Large Language Models (LLMs).
A new implementation of in-context learning and fine-tuning protocols for LLMs is introduced, tailored specifically for the OPF problem.
Our study reveals the impact of LLM architecture, size, and fine-tuning and demonstrates our framework's ability to handle realistic grid components and constraints.
arXiv Detail & Related papers (2025-01-13T19:01:58Z)
- Augmented Lagrangian-Based Safe Reinforcement Learning Approach for Distribution System Volt/VAR Control [1.1059341532498634]
This paper formulates the Volt/VAR control problem as a constrained Markov decision process (CMDP).
A novel safe off-policy reinforcement learning (RL) approach is proposed in this paper to solve the CMDP.
A two-stage strategy is adopted for offline training and online execution, so the accurate distribution system model is no longer needed.
arXiv Detail & Related papers (2024-10-19T19:45:09Z)
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) extend coverage.
However, deploying STAR-RISs indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
- CommonPower: A Framework for Safe Data-Driven Smart Grid Control [7.133681867718039]
The Python tool CommonPower is the first framework for the modeling and simulation of power system management tailored towards machine learning.
CommonPower includes a training pipeline for machine-learning-based forecasters as well as a flexible mechanism for incorporating feedback of safeguards into the learning updates of RL controllers.
arXiv Detail & Related papers (2024-06-05T13:06:52Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation, with performance similar to or stronger than PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z)
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to the series of methods nested with reinforcement learning (RL) algorithms.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
- Multi-Objective Reinforcement Learning based Multi-Microgrid System Optimisation Problem [4.338938227238059]
Microgrids with energy storage systems and distributed renewable energy sources play a crucial role in reducing consumption from traditional power sources and the emission of CO2.
Connecting multiple microgrids to a distribution power grid can facilitate a more robust and reliable operation to increase the security and privacy of the system.
The proposed model consists of three layers, smart grid layer, independent system operator (ISO) layer and power grid layer.
arXiv Detail & Related papers (2021-03-10T23:01:22Z)
- UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z)
- Deep Actor-Critic Learning for Distributed Power Control in Wireless Mobile Networks [5.930707872313038]
Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization.
We present a distributively executed continuous power control algorithm with the help of deep actor-critic learning.
We integrate the proposed power control algorithm to a time-slotted system where devices are mobile and channel conditions change rapidly.
arXiv Detail & Related papers (2020-09-14T18:29:12Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.