Related papers: ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination

URL: http://arxiv.org/abs/2507.19151v2
Date: Sat, 02 Aug 2025 01:01:16 GMT
Title: ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Authors: Michael Amir, Guang Yang, Zhan Gao, Keisuke Okumura, Heedo Woo, Amanda Prorok
Abstract summary: We introduce ReCoDe--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of reinforcement learning. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch.
Score: 19.115931862737508
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Constraint-based optimization is a cornerstone of robotics, enabling the design of controllers that reliably encode task and safety requirements such as collision avoidance or formation adherence. However, handcrafted constraints can fail in multi-agent settings that demand complex coordination. We introduce ReCoDe--Reinforcement-based Constraint Design--a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of multi-agent reinforcement learning. Rather than discarding expert controllers, ReCoDe improves them by learning additional, dynamic constraints that capture subtler behaviors, for example, by constraining agent movements to prevent congestion in cluttered scenarios. Through local communication, agents collectively constrain their allowed actions to coordinate more effectively under changing conditions. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus, where we show that it outperforms purely handcrafted controllers, other hybrid approaches, and standard MARL baselines. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch, especially because ReCoDe can dynamically change the degree to which it relies on this controller.

Related papers

Control-Optimized Deep Reinforcement Learning for Artificially Intelligent Autonomous Systems [8.766411351797885]
Deep reinforcement learning (DRL) has become a powerful tool for complex decision-making in machine learning and AI. Traditional methods often assume perfect action execution, overlooking the uncertainties and deviations between an agent's selected actions and the actual system response. This work advances AI by developing a novel control-optimized DRL framework that explicitly models and compensates for action execution mismatches.
arXiv Detail & Related papers (2025-06-30T21:25:52Z)
Intelligent Sensing-to-Action for Robust Autonomy at the Edge: Opportunities and Challenges [19.390215975410406]
Autonomous edge computing in robotics, smart cities, and autonomous vehicles relies on seamless integration of sensing, processing, and actuation. At its core is the sensing-to-action loop, which iteratively aligns sensor inputs with computational models to drive adaptive control strategies. This article explores how proactive, context-aware sensing-to-action and action-to-sensing adaptations can enhance efficiency.
arXiv Detail & Related papers (2025-02-04T20:13:58Z)
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems [80.30532872347668]
Wireless Networked Control Systems (WNCSs) are essential to Industry 4.0, enabling flexible control in applications, such as drone swarms and autonomous robots. We propose a practical WNCS model that captures correlated dynamics among multiple control loops with spatially distributed sensors and actuators sharing limited wireless resources over multi-state Markov block-fading channels. We develop a Deep Reinforcement Learning (DRL) algorithm that efficiently handles the hybrid action space, captures communication-control correlations, and ensures robust training despite sparse cross-domain variables and floating control inputs.
arXiv Detail & Related papers (2024-10-15T06:28:21Z)
PIP-Loco: A Proprioceptive Infinite Horizon Planning Framework for Quadrupedal Robot Locomotion [1.123472110161393]
A core strength of Model Predictive Control (MPC) for quadrupedal locomotion has been its ability to enforce constraints. We propose a framework that integrates proprioceptive planning with Reinforcement Learning (RL). During deployment, the Dreamer module solves an infinite-horizon MPC problem, adapting actions and velocity commands to respect the constraints.
arXiv Detail & Related papers (2024-09-14T13:51:37Z)
Cooperative Cognitive Dynamic System in UAV Swarms: Reconfigurable Mechanism and Framework [80.39138462246034]
We propose the cooperative cognitive dynamic system (CCDS) to optimize the management for UAV swarms. CCDS is a hierarchical and cooperative control structure that enables real-time data processing and decision. In addition, CCDS can be integrated with the biomimetic mechanism to efficiently allocate tasks for UAV swarms.
arXiv Detail & Related papers (2024-05-18T12:45:00Z)
Optimal Controller Realizations against False Data Injections in Cooperative Driving [2.2134894590368748]
We study a controller-oriented approach to mitigate the effect of a class of False-Data Injection (FDI) attacks. We show that a class of new but equivalent controllers can represent the base controller. We obtain the optimal combination of sensors that minimizes the effect of FDI attacks.
arXiv Detail & Related papers (2024-04-08T09:53:42Z)
Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability. We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
arXiv Detail & Related papers (2024-02-04T15:54:03Z)
Deep Learning for Wireless Networked Systems: a joint Estimation-Control-Scheduling Approach [47.29474858956844]
Wireless networked control system (WNCS) connecting sensors, controllers, and actuators via wireless communications is a key enabling technology for highly scalable and low-cost deployment of control systems in the Industry 4.0 era. Despite the tight interaction of control and communications in WNCSs, most existing works adopt separative design approaches. We propose a novel deep reinforcement learning (DRL)-based algorithm for controller and optimization utilizing both model-free and model-based data.
arXiv Detail & Related papers (2022-10-03T01:29:40Z)
Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation. Domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
arXiv Detail & Related papers (2021-03-26T07:14:01Z)
Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion [95.1825179206694]
We present a framework that synthesizes robust controllers for a quadruped robot. A high-level controller learns to choose from a set of primitives in response to changes in the environment. A low-level controller utilizes an established control method to robustly execute the primitives.
arXiv Detail & Related papers (2020-09-21T16:49:26Z)
Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO). We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
