An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid
- URL: http://arxiv.org/abs/2507.18738v1
- Date: Thu, 24 Jul 2025 18:38:51 GMT
- Title: An Explainable Equity-Aware P2P Energy Trading Framework for Socio-Economically Diverse Microgrid
- Authors: Abhijan Theja, Mayukha Pal,
- Abstract summary: This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL)<n>The framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fair and dynamic energy allocation in community microgrids remains a critical challenge, particularly when serving socio-economically diverse participants. Static optimization and cost-sharing methods often fail to adapt to evolving inequities, leading to participant dissatisfaction and unsustainable cooperation. This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). At its core, the framework utilizes a bi-level optimization model grounded in Equity-regarding Welfare Maximization (EqWM) principles, which incorporate Rawlsian fairness to prioritize the welfare of the least advantaged participants. We introduce a Proximal Policy Optimization (PPO) agent that dynamically adjusts socio-economic weights in the optimization objective based on observed inequities in cost and renewable energy access. This RL-powered feedback loop enables the system to learn and adapt, continuously striving for a more equitable state. To ensure transparency, Explainable AI (XAI) is used to interpret the benefit allocations derived from a weighted Shapley value. Validated across six realistic scenarios, the framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains. The adaptive RL mechanism further reduces the Gini coefficient over time, showcasing a pathway to truly sustainable and fair energy communities.
Related papers
- VAE-GAN Based Price Manipulation in Coordinated Local Energy Markets [3.498661956610689]
This paper introduces a model for coordinating prosumers with heterogeneous distributed energy resources (DERs) in a local energy market (LEM)<n>The proposed LEM scheme utilizes a data-driven, model-free reinforcement learning approach based on the multi-agent deep deterministic policy gradient (MADDPG)<n>We investigate a price manipulation strategy using a variational auto encoder-generative adversarial network (VAE-GAN) model, which allows utilities to adjust price signals in a way that induces financial losses for the prosumers.
arXiv Detail & Related papers (2025-07-26T07:38:27Z) - DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization [55.06360285372418]
Group Relative Policy Optimization is a reinforcement learning method for large reasoning models (LRMs)<n>In this work, we analyze the GRPO objective under a binary reward setting and reveal an inherent limitation of question-level difficulty bias.<n>We introduce a new Discriminative Constrained Optimization framework for reinforcing LRMs, grounded in the principle of discriminative learning.
arXiv Detail & Related papers (2025-05-18T11:08:32Z) - Integrated Optimization and Game Theory Framework for Fair Cost Allocation in Community Microgrids [0.0]
This paper presents a novel framework integrating multi-objective optimization with cooperative game theory for fair and efficient microgrid operation and cost allocation.<n>Results show peak demand reductions ranging from 7.8% to 62.6%, solar utilization rates reaching 114.8% through effective storage integration, and cooperative gains of up to $1,801.01 per day.
arXiv Detail & Related papers (2025-02-13T04:28:17Z) - Demand Response Optimization MILP Framework for Microgrids with DERs [0.0]
This paper presents a framework for optimizing demand response in a microgrid with solar generation and battery storage systems.<n>The framework incorporates load classification, dynamic price thresholding, and multi-period coordination for optimal DR event scheduling.
arXiv Detail & Related papers (2025-02-12T20:10:51Z) - AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization [45.46582930202524]
$alpha$-DPO is an adaptive preference optimization algorithm for large language models.<n>It balances the policy model and the reference model to achieve personalized reward margins.<n>It consistently outperforms DPO and SimPO across various model settings.
arXiv Detail & Related papers (2024-10-14T04:29:57Z) - Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association ins.
We employ inverse RL (IRL) to automatically learn reward functions without manual tuning.
We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a $14.6%$ improvement in convergence and reward value.
arXiv Detail & Related papers (2024-09-27T13:05:02Z) - Evaluation of Prosumer Networks for Peak Load Management in Iran: A Distributed Contextual Stochastic Optimization Approach [0.0]
This paper introduces a novel prosumers network framework aimed at mitigating peak loads in Iran.
A cost-oriented integrated prediction and optimization approach is proposed, empowering prosumers to make informed decisions.
Numerical results highlights that integrating prediction with optimization and implementing a contextual information-sharing network among prosumers significantly reduces peak loads as well as total costs.
arXiv Detail & Related papers (2024-08-31T16:09:38Z) - Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.<n>To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.<n>Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Self-Supervised Learning for Large-Scale Preventive Security Constrained DC Optimal Power Flow [20.078717680640214]
Security-Constrained Optimal Power Flow (SCOPF) plays a crucial role in power grid stability but becomes increasingly complex as systems grow.
This paper introduces PDL-SCOPF, a self-supervised end-to-end primal-dual learning framework for producing near-optimal solutions to large-scale SCOPF problems.
arXiv Detail & Related papers (2023-11-29T20:36:35Z) - Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW)
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z) - Faster Last-iterate Convergence of Policy Optimization in Zero-Sum
Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z) - Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.