FairMarket-RL: LLM-Guided Fairness Shaping for Multi-Agent Reinforcement Learning in Peer-to-Peer Markets
- URL: http://arxiv.org/abs/2506.22708v1
- Date: Sat, 28 Jun 2025 01:17:55 GMT
- Title: FairMarket-RL: LLM-Guided Fairness Shaping for Multi-Agent Reinforcement Learning in Peer-to-Peer Markets
- Authors: Shrenik Jadhav, Birva Sevak, Srijita Das, Akhtar Hussain, Wencong Su, Van-Hai Bui
- Abstract summary: This paper presents FairMarket-RL, a novel framework that combines Large Language Models (LLMs) with Reinforcement Learning (RL) to enable fairness-aware trading agents. In a simulated P2P microgrid with multiple sellers and buyers, the LLM acts as a real-time fairness critic, evaluating each trading episode using two metrics: Fairness-To-Buyer (FTB) and Fairness-Between-Sellers (FBS).
- Score: 1.7284653203366598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Peer-to-peer (P2P) trading is increasingly recognized as a key mechanism for decentralized market regulation, yet existing approaches often lack robust frameworks to ensure fairness. This paper presents FairMarket-RL, a novel hybrid framework that combines Large Language Models (LLMs) with Reinforcement Learning (RL) to enable fairness-aware trading agents. In a simulated P2P microgrid with multiple sellers and buyers, the LLM acts as a real-time fairness critic, evaluating each trading episode using two metrics: Fairness-To-Buyer (FTB) and Fairness-Between-Sellers (FBS). These fairness scores are integrated into agent rewards through scheduled λ-coefficients, forming an adaptive LLM-guided reward shaping loop that replaces brittle, rule-based fairness constraints. Agents are trained using Independent Proximal Policy Optimization (IPPO) and achieve equitable outcomes, fulfilling over 90% of buyer demand, maintaining fair seller margins, and consistently reaching FTB and FBS scores above 0.80. The training process demonstrates that fairness feedback improves convergence, reduces buyer shortfalls, and narrows profit disparities between sellers. With its language-based critic, the framework scales naturally, and its extension to a large power distribution system with household prosumers illustrates its practical applicability. FairMarket-RL thus offers a scalable, equity-driven solution for autonomous trading in decentralized energy systems.
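The reward shaping loop described in the abstract can be pictured with a minimal Python sketch. The abstract does not give the exact reward formula or the schedule for the λ-coefficients, so the additive form, the linear ramp, and the names `linear_schedule` and `shaped_reward` are all assumptions made for illustration. The FTB and FBS values would come from the LLM critic; here they are passed in as plain numbers assumed to lie in [0, 1].

```python
def linear_schedule(step: int, total_steps: int, start: float, end: float) -> float:
    """Linearly ramp a coefficient from start to end (assumed schedule shape)."""
    if total_steps <= 0:
        return end
    # Clamp progress to [0, 1] so the coefficient stays at `end` after the ramp.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)


def shaped_reward(base_reward: float, ftb: float, fbs: float,
                  lambda_ftb: float, lambda_fbs: float) -> float:
    """Add weighted fairness scores to an agent's base (e.g. profit) reward.

    The additive form is an assumption; ftb and fbs stand in for the
    Fairness-To-Buyer and Fairness-Between-Sellers scores from the critic.
    """
    for name, score in (("ftb", ftb), ("fbs", fbs)):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {score}")
    return base_reward + lambda_ftb * ftb + lambda_fbs * fbs


if __name__ == "__main__":
    total_steps = 100
    for step in (0, 50, 100):
        # Hypothetical choice: both coefficients share one schedule.
        lam = linear_schedule(step, total_steps, start=0.0, end=0.5)
        reward = shaped_reward(1.0, ftb=0.8, fbs=0.9, lambda_ftb=lam, lambda_fbs=lam)
        print(f"step={step} lambda={lam:.2f} shaped_reward={reward:.3f}")
```

Ramping the coefficients up from zero lets agents first learn from the base reward alone before the fairness terms gain weight; whether the paper uses this ordering is not stated in the abstract.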
Related papers
- BiFair: A Fairness-aware Training Framework for LLM-enhanced Recommender Systems via Bi-level Optimization [13.187285894531275]
BiFair is a fairness-aware training framework designed to mitigate both prior and training unfairness simultaneously. Extensive experiments on three real-world datasets demonstrate that BiFair significantly mitigates unfairness and outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-07-06T08:39:26Z)
- GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning [53.894789613838654]
We introduce SEED-Bench-R1, a benchmark with complex real-world videos requiring balanced perception and reasoning. Using SEED-Bench-R1, we find that standard GRPO, while improving answer accuracy, often reduces logical coherence between reasoning steps and answers, with only a 57.9% consistency rate. We propose GRPO-CARE, a consistency-aware RL framework optimizing both answer correctness and reasoning coherence without explicit supervision.
arXiv Detail & Related papers (2025-06-19T08:49:13Z)
- From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium [52.28048367430481]
Multi-agent frameworks can boost the reasoning power of large language models (LLMs), but they typically incur heavy computational costs and lack convergence guarantees. We recast multi-LLM coordination as an incomplete-information game and seek a Bayesian Nash equilibrium (BNE). We introduce Efficient Coordination via Nash Equilibrium (ECON), a hierarchical reinforcement-learning paradigm that marries distributed reasoning with centralized final output.
arXiv Detail & Related papers (2025-06-09T23:49:14Z)
- FedFACT: A Provable Framework for Controllable Group-Fairness Calibration in Federated Learning [13.575259448363557]
We propose a controllable group-fairness calibration framework, named FedFACT. FedFACT identifies the Bayes-optimal classifiers under both global and local fairness constraints. Experiments on multiple datasets demonstrate that FedFACT consistently outperforms baselines in balancing accuracy and global-local fairness.
arXiv Detail & Related papers (2025-06-04T09:39:57Z)
- The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation [73.16564415490113]
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by retrieving relevant documents from external knowledge sources. We propose two approaches, FairFT and FairFilter, to mitigate the fairness issues introduced by RAG for small-scale LLMs.
arXiv Detail & Related papers (2025-04-11T10:17:10Z)
- Fairness Aware Reinforcement Learning via Proximal Policy Optimization [7.061167083587786]
This paper introduces fairness in Proximal Policy Optimization (PPO) with a penalty term derived from demographic parity, counterfactual fairness, and conditional statistical parity. We evaluate our approach in the Allelopathic Harvest game, a cooperative and competitive MAS focused on resource collection.
arXiv Detail & Related papers (2025-02-06T10:45:55Z)
- An Auction-based Marketplace for Model Trading in Federated Learning [54.79736037670377]
Federated learning (FL) is increasingly recognized for its efficacy in training models using locally distributed data.
We frame FL as a marketplace of models, where clients act as both buyers and sellers.
We propose an auction-based solution to ensure proper pricing based on performance gain.
arXiv Detail & Related papers (2024-02-02T07:25:53Z)
- Domain-adapted Learning and Imitation: DRL for Power Arbitrage [1.6874375111244329]
We propose a collaborative dual-agent reinforcement learning approach for this bi-level simulation and optimization of European power arbitrage trading.
We introduce two new implementations designed to incorporate domain-specific knowledge by imitating the trading behaviours of power traders.
Our study demonstrates that by leveraging domain expertise in a general learning problem, the performance can be improved substantially.
arXiv Detail & Related papers (2023-01-19T23:36:23Z)
- How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts [107.72786199113183]
We propose a novel fairness learning method termed CUrvature MAtching (CUMA).
CUMA achieves robust fairness generalizable to unseen domains with unknown distributional shifts.
We evaluate our method on three popular fairness datasets.
arXiv Detail & Related papers (2022-07-04T02:37:50Z)
- Proportional Fairness in Federated Learning [27.086313029073683]
PropFair is a novel and easy-to-implement algorithm for finding proportionally fair solutions in federated learning.
We demonstrate that PropFair can approximately find proportionally fair (PF) solutions, and that it achieves a good balance between the average performance of all clients and that of the worst 10% of clients.
arXiv Detail & Related papers (2022-02-03T16:28:04Z)
- Deep Q-Learning Market Makers in a Multi-Agent Simulated Stock Market [58.720142291102135]
This paper focuses on the study of market makers' strategies from an agent-based perspective.
We propose the application of Reinforcement Learning (RL) for the creation of intelligent market makers in simulated stock markets.
arXiv Detail & Related papers (2021-12-08T14:55:21Z)
- Fairness for Cooperative Multi-Agent Learning with Equivariant Policies [24.92668968807012]
We study fairness through the lens of cooperative multi-agent learning.
We introduce team fairness, a group-based fairness measure for multi-agent learning.
We then incorporate team fairness into policy optimization.
arXiv Detail & Related papers (2021-06-10T13:17:46Z)