Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach
- URL: http://arxiv.org/abs/2602.21715v1
- Date: Wed, 25 Feb 2026 09:22:27 GMT
- Title: Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach
- Authors: Xu Yang, Chenhui Lin, Xiang Ma, Dong Liu, Ran Zheng, Haotian Liu, Wenchuan Wu
- Abstract summary: The integration of distributed photovoltaics into active distribution networks (ADNs) has exacerbated operational challenges. Existing data-driven approaches have demonstrated effectiveness in the voltage control problem. We propose a hybrid knowledge-data-driven approach that leverages dynamic collaboration between a large language model (LLM) agent and a reinforcement learning (RL) agent.
- Score: 30.16233658525027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing integration of distributed photovoltaics (PVs) into active distribution networks (ADNs) has exacerbated operational challenges, making it imperative to coordinate diverse equipment to mitigate voltage violations and enhance power quality. Although existing data-driven approaches have demonstrated effectiveness in the voltage control problem, they often require extensive trial-and-error exploration and struggle to incorporate heterogeneous information, such as day-ahead forecasts and semantic-based grid codes. Considering the operational scenarios and requirements in real-world ADNs, in this paper, we propose a hybrid knowledge-data-driven approach that leverages dynamic collaboration between a large language model (LLM) agent and a reinforcement learning (RL) agent to achieve two-stage voltage control. In the day-ahead stage, the LLM agent receives coarse region-level forecasts and generates scheduling strategies for on-load tap changer (OLTC) and shunt capacitors (SCs) to regulate the overall voltage profile. Then in the intra-day stage, based on accurate node-level measurements, the RL agent refines terminal voltages by deriving reactive power generation strategies for PV inverters. On top of the LLM-RL collaboration framework, we further propose a self-evolution mechanism for the LLM agent and a pretrain-finetune pipeline for the RL agent, effectively enhancing and coordinating the policies for both agents. The proposed approach not only aligns more closely with practical operational characteristics but also effectively utilizes the inherent knowledge and reasoning capabilities of the LLM agent, significantly improving training efficiency and voltage control performance. Comprehensive comparisons and ablation studies demonstrate the effectiveness of the proposed method.
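The two-stage split described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `DayAheadPlan`, `day_ahead_stage`, `intra_day_stage`, and `q_limit`, the 0.5 forecast threshold, and the droop gain are hypothetical, and the fixed rules merely stand in for the LLM and RL agents, which the abstract does not specify. The only physical relation used is the standard inverter capability limit, |Q| <= sqrt(S^2 - P^2).

```python
import math
from dataclasses import dataclass


@dataclass
class DayAheadPlan:
    """Hourly set-points for the slow devices (hypothetical container)."""
    oltc_taps: list[int]
    sc_states: list[bool]


def day_ahead_stage(region_pv_forecast: list[float]) -> DayAheadPlan:
    """Placeholder for the LLM agent: a fixed rule, NOT the paper's policy.

    High forecast PV output -> lower the OLTC tap and switch the SC off.
    """
    taps = [-1 if pv > 0.5 else 0 for pv in region_pv_forecast]
    caps = [pv <= 0.5 for pv in region_pv_forecast]
    return DayAheadPlan(oltc_taps=taps, sc_states=caps)


def q_limit(s_rated: float, p_now: float) -> float:
    """Reactive power headroom of an inverter: sqrt(S^2 - P^2)."""
    return math.sqrt(max(s_rated**2 - p_now**2, 0.0))


def intra_day_stage(v_nodes, p_pv, s_rated=1.0, gain=2.0):
    """Placeholder for the RL agent: a droop rule clipped to capacity."""
    actions = []
    for v, p in zip(v_nodes, p_pv):
        q_max = q_limit(s_rated, p)
        q = -gain * (v - 1.0)  # absorb reactive power when voltage is high
        actions.append(max(-q_max, min(q_max, q)))
    return actions


# Stage 1: coarse region-level forecast (per unit) -> slow-device schedule.
plan = day_ahead_stage([0.1, 0.8, 0.3])

# Stage 2: node-level voltage measurements (per unit) -> inverter commands.
q_cmd = intra_day_stage(v_nodes=[1.06, 0.97], p_pv=[0.6, 0.0])
```

The point of the sketch is the division of labor: the day-ahead function sees only coarse forecasts and sets discrete devices, while the intra-day function sees node voltages and sets continuous reactive power within each inverter's remaining capacity.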
Related papers
- Heterogeneous Agent Collaborative Reinforcement Learning [52.99813668995983]
Heterogeneous Agent Collaborative Reinforcement Learning (HACRL). Building on this paradigm, we propose HACPO, a collaborative RL algorithm that enables principled rollout sharing to maximize sample utilization and cross-agent knowledge transfer. Experiments across diverse heterogeneous model combinations and reasoning benchmarks show that HACPO consistently improves all participating agents, outperforming GSPO by an average of 3.3% while using only half the rollout cost.
arXiv Detail & Related papers (2026-03-03T05:09:49Z) - Large Language Model-Empowered Decision Transformer for UAV-Enabled Data Collection [71.84636717632206]
Using unmanned aerial vehicles (UAVs) for reliable and energy-efficient data collection from spatially distributed devices holds great promise in supporting Internet of Things (IoT) applications. We propose a joint language model (LLM) to learn effective UAV control policies. LLM-CRDT outperforms benchmark online and offline methods, achieving up to 36.7% higher energy efficiency than current state-of-the-art DT approaches.
arXiv Detail & Related papers (2025-09-17T13:05:08Z) - Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks. Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions. We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z) - LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints. We introduce a Large Language Model (LLM)-guided multi-agent learning approach. Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z) - RL2: Reinforce Large Language Model to Assist Safe Reinforcement Learning for Energy Management of Active Distribution Networks [12.205847538487433]
Large language models (LLMs) provide a promising way to assist safe RL for energy management in ADNs. We propose an RL2 mechanism to refine the generated functions iteratively and adaptively through multi-round dialogues.
arXiv Detail & Related papers (2024-12-02T09:15:36Z) - Safety Constrained Multi-Agent Reinforcement Learning for Active Voltage Control [34.95810473913879]
We formalize the active voltage control problem as a constrained Markov game and propose a safety-constrained MARL algorithm.
We evaluate our approach in the power distribution network simulation environment with real-world scale scenarios.
arXiv Detail & Related papers (2024-05-14T09:03:00Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer [128.19212716007794]
We propose a Transformer-based Multi-Agent Actor-Critic framework (T-MAAC) to stabilize voltage in power distribution networks.
In addition, we adopt a novel auxiliary-task training process tailored to the voltage control task, which improves the sample efficiency.
arXiv Detail & Related papers (2022-06-08T07:48:42Z) - Scalable Voltage Control using Structure-Driven Hierarchical Deep Reinforcement Learning [0.0]
This paper presents a novel hierarchical deep reinforcement learning (DRL) based design for the voltage control of power grids.
We exploit the area-wise division structure of the power system to propose a hierarchical DRL design that can be scaled to the larger grid models.
We train area-wise decentralized RL agents to compute lower-level policies for the individual areas, and concurrently train a higher-level DRL agent that uses the updates of the lower-level policies to efficiently coordinate the control actions taken by the lower-level agents.
arXiv Detail & Related papers (2021-01-29T21:30:59Z) - Distributed Voltage Regulation of Active Distribution System Based on Enhanced Multi-agent Deep Reinforcement Learning [9.7314654861242]
This paper proposes a data-driven distributed voltage control approach based on the spectrum clustering and the enhanced multi-agent deep reinforcement learning (MADRL) algorithm.
The proposed method can significantly reduce the requirements of communications and knowledge of system parameters.
It also effectively deals with uncertainties and can provide online coordinated control based on the latest local information.
arXiv Detail & Related papers (2020-05-31T15:48:27Z) - Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR Control in Active Distribution Networks [3.260913246106564]
We propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources.
In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch.
In the sequential online stage, we transfer the offline agent safely as the online agent to perform continuous learning and controlling online with significantly improved safety and efficiency.
arXiv Detail & Related papers (2020-05-20T08:02:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.