Related papers: Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR

URL: http://arxiv.org/abs/2504.20927v1
Date: Tue, 29 Apr 2025 16:42:13 GMT
Title: Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR
Authors: Shahbaz P Qadri Syed, He Bai
Abstract summary: We exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least square policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent.
Score: 3.4760283855855336
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention over the past years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least square policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent. We establish that the worst-case sample complexity of the decomposition is equal to the centralized case and derive necessary and sufficient graphical conditions on the inter-agent couplings to achieve better sample efficiency. We demonstrate the improved sample efficiency and computational efficiency on numerical examples.

Related papers

A Multiagent Path Search Algorithm for Large-Scale Coalition Structure Generation [61.08720171136229]
Coalition structure generation is a fundamental computational problem in multiagent systems. We develop SALDAE, a multiagent path finding algorithm for CSG that operates on a graph of coalition structures.
arXiv Detail & Related papers (2025-02-14T15:21:27Z)
Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling. We introduce Tree Search-based Orchestrated Agents (TOA), where the workflow evolves iteratively during the sequential sampling process. Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z)
Scalable Decentralized Algorithms for Online Personalized Mean Estimation [12.002609934938224]
This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean. We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach.
arXiv Detail & Related papers (2024-02-20T08:30:46Z)
Partially Observable Multi-Agent Reinforcement Learning with Information Sharing [33.145861021414184]
We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable games (POSGs). We advocate leveraging the potential information-sharing among agents, a common practice in empirical multi-agent RL, and a standard model for multi-agent control systems with communications.
arXiv Detail & Related papers (2023-08-16T23:42:03Z)
On Collaboration in Distributed Parameter Estimation with Resource Constraints [11.998903619502443]
Sensors or agents must optimize their resource allocation to maximize the accuracy of parameter estimation. We formulate a sensor or agent's data collection and collaboration policy design problem. We propose novel approaches that apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy.
arXiv Detail & Related papers (2023-07-12T20:11:50Z)
Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation. Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions. We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees. We study this question in a general framework for interactive decision making with multiple agents. We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement [80.94378602238432]
We propose an efficient structure named Correspondence Efficient Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner. To achieve this, multiple transformer blocks are connected stage by stage to gradually refine the predicted coordinates. Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness against existing state-of-the-art methods.
arXiv Detail & Related papers (2022-09-25T13:05:33Z)
RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios. RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents. Our method outperforms baseline methods on the StarCraft II micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
Multi-Agent Determinantal Q-Learning [39.79718674655209]
We propose multi-agent determinantal Q-learning (Q-DPP). Q-DPP encourages agents to acquire diverse behavioral models. We demonstrate that Q-DPP generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks.
arXiv Detail & Related papers (2020-06-02T09:32:48Z)
Task-Based Information Compression for Multi-Agent Communication Problems with Channel Rate Constraints [28.727611928919725]
We introduce the state-aggregation for information compression algorithm (SAIC) to solve the formulated TBIC problem. It is shown that SAIC is able to achieve near-optimal performance in terms of the achieved sum of discounted rewards.
arXiv Detail & Related papers (2020-05-28T18:29:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.