Related papers: Evolving LLM-Derived Control Policies for Residential EV Charging and Vehicle-to-Grid Energy Optimization

URL: http://arxiv.org/abs/2602.07275v1
Date: Fri, 06 Feb 2026 23:59:33 GMT
Title: Evolving LLM-Derived Control Policies for Residential EV Charging and Vehicle-to-Grid Energy Optimization
Authors: Vishesh Purnananda, Benjamin John Wruck, Mingyu Guo
Abstract summary: This research presents a novel application of Evolutionary Computation to the domain of residential electric vehicle (EV) energy management. While reinforcement learning (RL) achieves high performance in vehicle-to-grid (V2G) optimization, it typically produces opaque "black-box" neural networks that are difficult for consumers and regulators to audit. We propose a program search framework that leverages Large Language Models (LLMs) as intelligent mutation operators within an iterative prompt-evaluation-repair loop.
Score: 7.073682493135313
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This research presents a novel application of Evolutionary Computation to the domain of residential electric vehicle (EV) energy management. While reinforcement learning (RL) achieves high performance in vehicle-to-grid (V2G) optimization, it typically produces opaque "black-box" neural networks that are difficult for consumers and regulators to audit. Addressing this interpretability gap, we propose a program search framework that leverages Large Language Models (LLMs) as intelligent mutation operators within an iterative prompt-evaluation-repair loop. Utilizing the high-fidelity EV2Gym simulation environment as a fitness function, the system undergoes successive refinement cycles to synthesize executable Python policies that balance profit maximization, user comfort, and physical safety constraints. We benchmark four prompting strategies: Imitation, Reasoning, Hybrid and Runtime, evaluating their ability to discover adaptive control logic. Results demonstrate that the Hybrid strategy produces concise, human-readable heuristics that achieve 118% of the baseline profit, effectively discovering complex behaviors like anticipatory arbitrage and hysteresis without explicit programming. This work establishes LLM-driven Evolutionary Computation as a practical approach for generating EV charging control policies that are transparent, inspectable, and suitable for real residential deployment.
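The prompt-evaluation-repair loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the `Policy` class stands in for a synthesized Python program, `mutate` replaces the LLM mutation operator with a random perturbation, and `fitness` is a toy price-arbitrage simulation rather than EV2Gym. The capacity, step size, target charge, and penalty weight are hypothetical values.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical stand-in for an evolved program: two price thresholds."""
    charge_below: float     # charge when the price is under this value
    discharge_above: float  # discharge to the grid when the price is over this value


def is_valid(p: Policy) -> bool:
    # A candidate is usable only if the thresholds are ordered and non-negative.
    return 0.0 <= p.charge_below < p.discharge_above


def repair(p: Policy) -> Policy:
    # Repair step: clamp to non-negative values and restore threshold ordering.
    lo, hi = sorted((max(p.charge_below, 0.0), max(p.discharge_above, 0.0)))
    if lo == hi:
        hi = lo + 0.01
    return Policy(lo, hi)


def mutate(p: Policy, rng: random.Random) -> Policy:
    # Assumption: a Gaussian perturbation replaces the LLM mutation operator.
    return Policy(p.charge_below + rng.gauss(0, 0.02),
                  p.discharge_above + rng.gauss(0, 0.02))


def fitness(p: Policy, prices, capacity=10.0, soc=5.0, step=1.0, target=8.0):
    # Toy simulator (not EV2Gym): profit from arbitrage minus a comfort
    # penalty when the battery is under the target level at departure.
    profit = 0.0
    for price in prices:
        if price < p.charge_below and soc + step <= capacity:
            soc += step
            profit -= price * step
        elif price > p.discharge_above and soc - step >= 0.0:
            soc -= step
            profit += price * step
    shortfall = max(0.0, target - soc)
    return profit - 1.0 * shortfall


def evolve(seed_policy, prices, generations=50, children=8, seed=0):
    # Elitist loop: propose a candidate, repair it if invalid, evaluate it,
    # and keep it only when it improves on the best fitness so far.
    rng = random.Random(seed)
    best, best_fit = seed_policy, fitness(seed_policy, prices)
    for _ in range(generations):
        for _ in range(children):
            cand = mutate(best, rng)
            if not is_valid(cand):
                cand = repair(cand)
            score = fitness(cand, prices)
            if score > best_fit:
                best, best_fit = cand, score
    return best, best_fit
```

Calling `evolve(Policy(0.05, 0.50), prices)` with a list of hourly prices returns the best thresholds found and their score. Because the loop is elitist, the returned score can never be lower than that of the seed policy.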

Related papers

AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization [61.535567824938205]
We introduce AdaEvolve, a framework that reformulates LLM-driven evolution as a hierarchical adaptive optimization problem. AdaEvolve consistently outperforms the open-ended baselines across 185 different open-ended optimization problems.
arXiv Detail & Related papers (2026-02-23T18:45:31Z)
Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning [88.42566960813438]
CalibRL is a hybrid-policy RLVR framework that supports controllable exploration with expert guidance. CalibRL increases policy entropy in a guided manner and clarifies the target distribution. Experiments across eight benchmarks, including both in-domain and out-of-domain settings, demonstrate consistent improvements.
arXiv Detail & Related papers (2026-02-22T07:23:36Z)
EmboCoach-Bench: Benchmarking AI Agents on Developing Embodied Robots [68.29056647487519]
Embodied AI is fueled by high-fidelity simulation and large-scale data collection. However, this scaling capability remains bottlenecked by a reliance on labor-intensive manual oversight. We introduce EmboCoach-Bench, a benchmark evaluating the capacity of LLM agents to autonomously engineer embodied policies.
arXiv Detail & Related papers (2026-01-29T11:33:49Z)
Optimizing Electric Vehicle Charging Station Placement Using Reinforcement Learning and Agent-Based Simulations [5.6037668742884135]
Reinforcement learning offers an innovative approach to identifying optimal charging station locations. We propose a novel framework that integrates deep RL with agent-based simulations to model EV movement and estimate charging demand in real time. Our approach employs a hybrid RL agent with dual Q-networks to select optimal locations and configure charging ports, guided by a hybrid reward function that combines deterministic factors with simulation-derived feedback.
arXiv Detail & Related papers (2025-11-03T04:22:39Z)
Control of Renewable Energy Communities using AI and Real-World Data [0.0]
This paper introduces a framework designed explicitly to handle these complexities and bridge the simulation-to-reality gap. It incorporates EnergAIze, a MADD-based multi-agent control strategy, and specifically addresses challenges related to real-world data collection, system integration, and user behavior modeling.
arXiv Detail & Related papers (2025-05-22T22:20:09Z)
Multi-Objective Reinforcement Learning for Energy-Efficient Industrial Control [0.6990493129893112]
Industrial automation increasingly demands energy-efficient control strategies to balance performance with environmental and cost constraints. We present a multi-objective reinforcement learning (MORL) framework for energy-efficient control of the Quanser Aero 2 testbed in its one-degree-of-freedom. Preliminary experiments explore the influence of varying the Energy penalty weight, alpha, on the trade-off between pitch tracking and energy savings.
arXiv Detail & Related papers (2025-05-12T14:28:42Z)
Evolutionary Policy Optimization [47.30139909878251]
On-policy reinforcement learning (RL) algorithms are widely used for their strong performance and training stability, but they struggle to scale with larger batch sizes. We propose Evolutionary Policy Optimization (EPO), a hybrid that combines the scalability and diversity of EAs with the performance and stability of policy gradients.
arXiv Detail & Related papers (2025-03-24T18:08:54Z)
Safety-Aware Reinforcement Learning for Electric Vehicle Charging Station Management in Distribution Network [4.842172685255376]
Electric vehicles (EVs) pose a significant risk to the distribution system operation in the absence of coordination. This paper presents a safety-aware reinforcement learning (RL) algorithm designed to manage EV charging stations. Our proposed algorithm does not rely on explicit penalties for constraint violations, eliminating the need to tune a penalty coefficient.
arXiv Detail & Related papers (2024-03-20T01:57:38Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs). Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs. Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world. This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z)
Efficient Representation for Electric Vehicle Charging Station Operations using Reinforcement Learning [5.815007821143811]
We develop aggregation schemes that are based on the emergency of EV charging, namely the laxity value. A least-laxity first (LLF) rule is adopted to consider only the total charging power of the EVCS. In addition, we propose an equivalent state aggregation that can guarantee to attain the same optimal policy.
arXiv Detail & Related papers (2021-08-07T00:34:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.