Related papers: Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2508.01522v1
Date: Sat, 02 Aug 2025 23:52:33 GMT
Title: Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning
Authors: Jack Zeng, Andreu Matoses Gimenez, Eugene Vinitsky, Javier Alonso-Mora, Sihao Sun
Abstract summary: This paper presents the first decentralized method to enable real-world 6-DoF manipulation of a cable-suspended load using a team of Micro-Aerial Vehicles (MAVs). Our method leverages multi-agent reinforcement learning (MARL) to train an outer-loop control policy for each MAV. We validate our method in various real-world experiments, including full-pose control under load model uncertainties.
Score: 16.195474619148793
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents the first decentralized method to enable real-world 6-DoF manipulation of a cable-suspended load using a team of Micro-Aerial Vehicles (MAVs). Our method leverages multi-agent reinforcement learning (MARL) to train an outer-loop control policy for each MAV. Unlike state-of-the-art controllers that utilize a centralized scheme, our policy does not require global states, inter-MAV communications, nor neighboring MAV information. Instead, agents communicate implicitly through load pose observations alone, which enables high scalability and flexibility. It also significantly reduces computing costs during inference time, enabling onboard deployment of the policy. In addition, we introduce a new action space design for the MAVs using linear acceleration and body rates. This choice, combined with a robust low-level controller, enables reliable sim-to-real transfer despite significant uncertainties caused by cable tension during dynamic 3D motion. We validate our method in various real-world experiments, including full-pose control under load model uncertainties, showing setpoint tracking performance comparable to the state-of-the-art centralized method. We also demonstrate cooperation amongst agents with heterogeneous control policies, and robustness to the complete in-flight loss of one MAV. Videos of experiments: https://autonomousrobots.nl/paper_websites/aerial-manipulation-marl

Related papers

Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (MOT) aims to track multiple objects while maintaining consistent identities across frames of a given video. Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance. We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z)
LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints. We introduce a Large Language Model (LLM)-guided multi-agent learning approach. Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z)
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
A comparison of RL-based and PID controllers for 6-DOF swimming robots: hybrid underwater object tracking [8.362739554991073]
We present an exploration and assessment of employing a centralized deep Q-network (DQN) controller as a substitute for PID controllers. Our primary focus centers on illustrating this transition with the specific case of underwater object tracking. Our experiments, conducted within a Unity-based simulator, validate the effectiveness of a centralized RL agent over separated PID controllers.
arXiv Detail & Related papers (2024-01-29T23:14:15Z)
Actuator Trajectory Planning for UAVs with Overhead Manipulator using Reinforcement Learning [0.3222802562733786]
We develop a UAV equipped with a controllable arm with two degrees of freedom to carry out actuation tasks on the fly. Our solution is based on employing a Q-learning method to control the trajectory of the tip of the arm, also called end-effector. Our method achieves 92% accuracy in terms of average displacement error using Q-learning with 15,000 episodes.
arXiv Detail & Related papers (2023-08-24T15:06:23Z)
In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations. Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
Policy Search for Model Predictive Control with Application to Agile Drone Flight [56.24908013905407]
We propose a policy-search-for-model-predictive-control framework for MPC. Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies. Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
A Modular and Transferable Reinforcement Learning Framework for the Fleet Rebalancing Problem [2.299872239734834]
We propose a modular framework for fleet rebalancing based on model-free reinforcement learning (RL). We formulate RL state and action spaces as distributions over a grid of the operating area, making the framework scalable. Numerical experiments, using real-world trip and network data, demonstrate that this approach has several distinct advantages over baseline methods.
arXiv Detail & Related papers (2021-05-27T16:32:28Z)
Collision-Free Flocking with a Dynamic Squad of Fixed-Wing UAVs Using Deep Reinforcement Learning [2.555094847583209]
We deal with the decentralized leader-follower flocking control problem through deep reinforcement learning (DRL). We propose a novel reinforcement learning algorithm, CACER-II, for training a shared control policy for all the followers. As a result, the variable-length system state can be encoded into a fixed-length embedding vector, which makes the learned DRL policies independent of the number or the order of followers.
arXiv Detail & Related papers (2021-01-20T11:23:35Z)
Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL [63.52264764099532]
We study the ability of autonomous vehicles to improve the throughput of a bottleneck using a fully decentralized control scheme in a mixed autonomy setting. We apply multi-agent reinforcement learning algorithms to this problem and demonstrate that significant improvements in bottleneck throughput, from 20% at a 5% penetration rate to 33% at a 40% penetration rate, can be achieved.
arXiv Detail & Related papers (2020-10-30T22:06:05Z)
Leveraging the Capabilities of Connected and Autonomous Vehicles and Multi-Agent Reinforcement Learning to Mitigate Highway Bottleneck Congestion [2.0010674945048468]
We present an RL-based multi-agent CAV control model to operate in mixed traffic. The results suggest that even at a CAV share of corridor traffic as low as 10%, CAVs can significantly mitigate bottlenecks in highway traffic.
arXiv Detail & Related papers (2020-10-12T03:52:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.