Embodiment-Induced Coordination Regimes in Tabular Multi-Agent Q-Learning
- URL: http://arxiv.org/abs/2601.17454v1
- Date: Sat, 24 Jan 2026 13:09:01 GMT
- Title: Embodiment-Induced Coordination Regimes in Tabular Multi-Agent Q-Learning
- Authors: Muhammad Ahmed Atif, Nehal Naeem Haji, Mohammad Shahid Shaikh, Muhammad Ebad Atif
- Abstract summary: We compare independent and centralized Q-learning under explicit embodiment constraints on agent speed and stamina. Results show that increased coordination can become a liability under embodiment constraints.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Centralized value learning is often assumed to improve coordination and stability in multi-agent reinforcement learning, yet this assumption is rarely tested under controlled conditions. We directly evaluate it in a fully tabular predator-prey gridworld by comparing independent and centralized Q-learning under explicit embodiment constraints on agent speed and stamina. Across multiple kinematic regimes and asymmetric agent roles, centralized learning fails to provide a consistent advantage and is frequently outperformed by fully independent learning, even under full observability and exact value estimation. Moreover, asymmetric centralized-independent configurations induce persistent coordination breakdowns rather than transient learning instability. By eliminating confounding effects from function approximation and representation learning, our tabular analysis isolates coordination structure as the primary driver of these effects. The results show that increased coordination can become a liability under embodiment constraints, and that the effectiveness of centralized learning is fundamentally regime and role dependent rather than universal.
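The comparison at the heart of the abstract can be sketched in a few lines of tabular code. The following is a minimal illustration of the two update rules being contrasted, independent per-agent Q-tables versus a single joint-action table, on a toy two-agent task; the state space, transition function, reward, and hyperparameters are illustrative assumptions, not the paper's predator-prey gridworld.

```python
# Hedged sketch: independent vs. centralized tabular Q-learning updates
# on a toy two-agent task. All quantities below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 4, 2
alpha, gamma = 0.1, 0.95

# Independent learning: each agent keeps its own table Q_i(s, a_i).
Q_ind = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(2)]

# Centralized learning: one table over the joint action, Q(s, a_1, a_2).
Q_cen = np.zeros((N_STATES, N_ACTIONS, N_ACTIONS))

def step(s, a1, a2):
    """Toy transition and reward; stands in for the gridworld dynamics."""
    s_next = (s + a1 + a2) % N_STATES
    r = 1.0 if a1 == a2 else 0.0  # reward coordination, purely illustrative
    return s_next, r

s = 0
for _ in range(1000):
    a1, a2 = rng.integers(N_ACTIONS), rng.integers(N_ACTIONS)
    s_next, r = step(s, a1, a2)
    # Independent update: each agent treats the other as part of the environment.
    for i, a in enumerate((a1, a2)):
        td = r + gamma * Q_ind[i][s_next].max() - Q_ind[i][s, a]
        Q_ind[i][s, a] += alpha * td
    # Centralized update: bootstrap over the best *joint* action.
    td = r + gamma * Q_cen[s_next].max() - Q_cen[s, a1, a2]
    Q_cen[s, a1, a2] += alpha * td
    s = s_next
```

The structural difference the paper studies is visible here: the centralized table bootstraps over the joint-action maximum, while each independent learner folds its partner into the (nonstationary) environment.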
Related papers
- Continual uncertainty learning [0.0]
This study formulates a new curriculum-based continual learning framework for robust control problems involving nonlinear dynamical systems. As a practical industrial application, this study applies the proposed method to designing an active vibration controller for automotive powertrains.
arXiv Detail & Related papers (2026-02-19T08:39:42Z) - UACER: An Uncertainty-Aware Critic Ensemble Framework for Robust Adversarial Reinforcement Learning [15.028168889991795]
In this paper, we propose a novel approach, Uncertainty-Aware Critic Ensemble for robust adversarial Reinforcement learning (UACER).
arXiv Detail & Related papers (2025-12-11T10:14:13Z) - A Unified and Stable Risk Minimization Framework for Weakly Supervised Learning with Theoretical Guarantees [33.15955234458642]
Weakly supervised learning has emerged as a practical alternative to fully supervised learning when complete and accurate labels are costly or infeasible to acquire. We propose a principled, unified framework that bypasses such post-hoc adjustments by formulating a stable surrogate risk grounded in the structure of weakly supervised data.
arXiv Detail & Related papers (2025-11-28T00:57:04Z) - Adaptive Variance-Penalized Continual Learning with Fisher Regularization [0.0]
This work presents a novel continual learning framework that integrates Fisher-weighted asymmetric regularization of parameter variances. Our method dynamically modulates regularization intensity according to parameter uncertainty, achieving enhanced stability and performance.
arXiv Detail & Related papers (2025-08-15T21:49:28Z) - Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis [55.13545823385091]
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches. We show that even moderate levels of information sharing significantly mitigate environment-specific errors.
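The mechanism summarized above, local value learning with periodic averaging across agents whose dynamics differ slightly, can be sketched with federated TD(0). The chain size, perturbation scale, and synchronization schedule below are assumptions for illustration, not the paper's setup or analysis.

```python
# Hedged sketch: federated TD(0) value averaging under model mismatch.
# Each agent runs TD(0) on its own slightly perturbed Markov chain, then
# the value estimates are averaged. All quantities are illustrative.
import numpy as np

rng = np.random.default_rng(1)
N, gamma, alpha = 5, 0.9, 0.05
n_agents, rounds, local_steps = 3, 20, 200

def make_P(eps):
    """A near-uniform transition matrix with a small random perturbation."""
    P = np.full((N, N), 1.0 / N)
    P += eps * rng.uniform(-1, 1, size=(N, N))
    P = np.abs(P)
    return P / P.sum(axis=1, keepdims=True)

Ps = [make_P(0.1) for _ in range(n_agents)]   # mismatched local dynamics
r = rng.uniform(0, 1, size=N)                 # shared reward vector
V = np.zeros(N)                               # shared (averaged) estimate

for _ in range(rounds):
    local_estimates = []
    for P in Ps:
        v = V.copy()
        s = 0
        for _ in range(local_steps):
            s_next = rng.choice(N, p=P[s])
            v[s] += alpha * (r[s] + gamma * v[s_next] - v[s])  # TD(0)
            s = s_next
        local_estimates.append(v)
    V = np.mean(local_estimates, axis=0)      # federated averaging step
```

Averaging pulls each agent's estimate toward a consensus value function, which is the information-sharing effect the abstract refers to.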
arXiv Detail & Related papers (2025-03-21T18:06:28Z) - Divergent Ensemble Networks: Enhancing Uncertainty Estimation with Shared Representations and Independent Branching [0.9963916732353794]
Divergent Ensemble Network (DEN) is a novel architecture that combines shared representation learning with independent branching. DEN employs a shared input layer to capture common features across all branches, followed by divergent, independently trainable layers that form an ensemble. This shared-to-branching structure reduces parameter redundancy while maintaining ensemble diversity, enabling efficient and scalable learning.
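The shared-to-branching structure can be illustrated with a tiny forward pass: one shared layer feeds several independently parameterized heads, and the spread of their predictions serves as an uncertainty signal. The layer sizes and random weights here are illustrative assumptions, not DEN's actual architecture or training procedure.

```python
# Hedged sketch of a shared trunk feeding independent ensemble heads.
# Sizes and weights are illustrative; no training is performed.
import numpy as np

rng = np.random.default_rng(2)
d_in, d_hidden, n_branches = 8, 16, 4

# One shared layer, reused by every branch (parameter sharing).
W_shared = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)
# Independently parameterized heads (ensemble diversity).
branches = [rng.normal(size=d_hidden) / np.sqrt(d_hidden)
            for _ in range(n_branches)]

def forward(x):
    h = np.tanh(x @ W_shared)                    # shared representation
    preds = np.array([h @ W_b for W_b in branches])
    return preds.mean(), preds.std()             # prediction + uncertainty

x = rng.normal(size=d_in)
mean, std = forward(x)
```

Because only the heads differ, the ensemble's parameter count grows with `d_hidden * n_branches` rather than with full per-member networks, which is the redundancy reduction the summary describes.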
arXiv Detail & Related papers (2024-12-02T06:52:45Z) - Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity [51.40558987254471]
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations.
This paper addresses the question of reinforcement learning under general latent dynamics from a statistical and algorithmic perspective.
arXiv Detail & Related papers (2024-10-23T14:22:49Z) - Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank [52.831993899183416]
We introduce a structural assumption -- the interaction rank -- and establish that functions with low interaction rank are significantly more robust to distribution shift compared to general ones.
We demonstrate that utilizing function classes with low interaction rank, when combined with regularization and no-regret learning, admits decentralized, computationally and statistically efficient learning in offline MARL.
arXiv Detail & Related papers (2024-10-01T22:16:22Z) - Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z) - Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation [57.351098530477124]
We consider one critical type of robustness against spurious correlation, where different portions of the state do not have correlations induced by unobserved confounders.
A model that learns such useless or even harmful correlation could catastrophically fail when the confounder in the test case deviates from the training one.
Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge.
arXiv Detail & Related papers (2023-07-15T23:53:37Z) - Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases. We show that our networked approach has significant advantages over both alternatives in terms of robustness to update failures and to changes in population size.
arXiv Detail & Related papers (2023-06-05T10:45:39Z) - Adversarially Robust Stability Certificates can be Sample-Efficient [14.658040519472646]
We consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems.
We show that the statistical cost of learning an adversarial stability certificate is equivalent, up to constant factors, to that of learning a nominal stability certificate.
arXiv Detail & Related papers (2021-12-20T17:23:31Z) - Finite-Time Consensus Learning for Decentralized Optimization with Nonlinear Gossiping [77.53019031244908]
We present a novel decentralized learning framework based on nonlinear gossiping (NGO), that enjoys an appealing finite-time consensus property to achieve better synchronization.
Our analysis on how communication delay and randomized chats affect learning further enables the derivation of practical variants.
arXiv Detail & Related papers (2021-11-04T15:36:25Z) - Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.