Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR
- URL: http://arxiv.org/abs/2504.20927v1
- Date: Tue, 29 Apr 2025 16:42:13 GMT
- Title: Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR
- Authors: Shahbaz P Qadri Syed, He Bai
- Abstract summary: We exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least square policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent.
- Score: 3.4760283855855336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention over the past years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least square policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent. We establish that the worst-case sample complexity of the decomposition is equal to the centralized case and derive necessary and sufficient graphical conditions on the inter-agent couplings to achieve better sample efficiency. We demonstrate the improved sample efficiency and computational efficiency on numerical examples.
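For orientation, the following is a minimal sketch of least-squares policy iteration for a single, centralized LQR problem with a quadratic Q-function. It is not the decomposition or either architecture proposed in the paper; it only illustrates the centralized baseline that the abstract compares against. The system is assumed deterministic and undiscounted, the initial gain `K0` is assumed stabilizing, and all names (`quad_features`, `theta_to_H`, `lspi_lqr`) are hypothetical.

```python
import numpy as np

def quad_features(z):
    # Monomials z_i * z_j for i <= j; off-diagonal terms are doubled so that
    # quad_features(z) @ theta == z^T H z for the symmetric H built below.
    iu = np.triu_indices(z.shape[0])
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)
    return np.outer(z, z)[iu] * scale

def theta_to_H(theta, n):
    # Rebuild the symmetric matrix H from its upper-triangular entries.
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + H.T - np.diag(np.diag(H))

def lspi_lqr(A, B, Qc, R, K0, iters=10, samples=200, seed=0):
    # Assumed setting: x' = A x + B u, stage cost x^T Qc x + u^T R u,
    # policy u = -K x, and K0 stabilizing.
    rng = np.random.default_rng(seed)
    nx, nu = B.shape
    K = K0.copy()
    for _ in range(iters):
        Phi, y = [], []
        for _ in range(samples):
            x = rng.normal(size=nx)
            u = rng.normal(size=nu)          # exploratory action
            x_next = A @ x + B @ u
            u_next = -K @ x_next             # action of the current policy
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Bellman equation: Q(x, u) - Q(x', -K x') = cost(x, u)
            Phi.append(quad_features(z) - quad_features(z_next))
            y.append(x @ Qc @ x + u @ R @ u)
        # Policy evaluation: least-squares fit of the Q-function parameters.
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, nx + nu)
        # Policy improvement: minimize the quadratic Q over u.
        K = np.linalg.solve(H[nx:, nx:], H[nx:, :nx])
    return K
```

In this centralized form the number of unknowns grows quadratically with the joint state-action dimension, which is the cost that a per-agent decomposition of the Q-function aims to reduce.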
Related papers
- A Multiagent Path Search Algorithm for Large-Scale Coalition Structure Generation [61.08720171136229]
Coalition structure generation is a fundamental computational problem in multiagent systems. We develop SALDAE, a multiagent path finding algorithm for CSG that operates on a graph of coalition structures.
arXiv Detail & Related papers (2025-02-14T15:21:27Z) - Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling. We introduce Tree Search-based Orchestrated Agents (TOA), where the workflow evolves iteratively during the sequential sampling process. Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z) - Scalable Decentralized Algorithms for Online Personalized Mean Estimation [12.002609934938224]
This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean. We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach.
arXiv Detail & Related papers (2024-02-20T08:30:46Z) - Partially Observable Multi-Agent Reinforcement Learning with Information Sharing [33.145861021414184]
We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable stochastic games (POSGs).
We advocate leveraging the potential information-sharing among agents, a common practice in empirical multi-agent RL, and a standard model for multi-agent control systems with communications.
arXiv Detail & Related papers (2023-08-16T23:42:03Z) - On Collaboration in Distributed Parameter Estimation with Resource Constraints [11.998903619502443]
Sensors or agents must optimize their resource allocation to maximize the accuracy of parameter estimation.
We formulate a sensor or agent's data collection and collaboration policy design problem.
We propose novel approaches that apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy.
arXiv Detail & Related papers (2023-07-12T20:11:50Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement [80.94378602238432]
We propose an efficient structure named Correspondence Efficient Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner.
To achieve this, multiple transformer blocks are stage-wisely connected to gradually refine the predicted coordinates.
Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness against existing state-of-the-arts.
arXiv Detail & Related papers (2022-09-25T13:05:33Z) - RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based encoder relation to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z) - Multi-Agent Determinantal Q-Learning [39.79718674655209]
We propose multi-agent determinantal Q-learning. Q-DPP promotes agents to acquire diverse behavioral models.
We demonstrate that Q-DPP generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks.
arXiv Detail & Related papers (2020-06-02T09:32:48Z) - Task-Based Information Compression for Multi-Agent Communication Problems with Channel Rate Constraints [28.727611928919725]
We introduce the state-aggregation for information compression algorithm (SAIC) to solve the formulated TBIC problem.
It is shown that SAIC is able to achieve near-optimal performance in terms of the achieved sum of discounted rewards.
arXiv Detail & Related papers (2020-05-28T18:29:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.