Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework
- URL: http://arxiv.org/abs/2503.09186v1
- Date: Wed, 12 Mar 2025 09:28:41 GMT
- Title: Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework
- Authors: Jian-Jian Jiang, Xiao-Ming Wu, Yi-Xiang He, Ling-An Zeng, Yi-Lin Wei, Dandan Zhang, Wei-Shi Zheng
- Abstract summary: Bimanual robotic manipulation is an emerging and critical topic in the robotics community. We propose a novel decoupled interaction framework that considers the characteristics of different tasks in bimanual manipulation. Our framework achieves outstanding performance, with a 23.5% boost over the SOTA method.
- Score: 28.193344739081798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bimanual robotic manipulation is an emerging and critical topic in the robotics community. Previous works primarily rely on integrated control models that take the perceptions and states of both arms as inputs to directly predict their actions. However, we argue that bimanual manipulation involves not only coordinated tasks but also various uncoordinated tasks that do not require explicit cooperation during execution, such as grasping objects with the closest hand. Integrated control frameworks fail to account for these tasks because they enforce cooperation at the input stage. In this paper, we propose a novel decoupled interaction framework that considers the characteristics of different tasks in bimanual manipulation. The key insight of our framework is to assign an independent model to each arm to enhance the learning of uncoordinated tasks, while introducing a selective interaction module that adaptively learns weights from its own arm to improve the learning of coordinated tasks. Extensive experiments on seven tasks in the RoboTwin dataset demonstrate that: (1) Our framework achieves outstanding performance, with a 23.5% boost over the SOTA method. (2) Our framework is flexible and can be seamlessly integrated into existing methods. (3) Our framework can be effectively extended to multi-agent manipulation tasks, achieving a 28% boost over the integrated control SOTA. (4) The performance boost stems from the decoupled design itself, surpassing the SOTA by 16.5% in success rate with only 1/6 of the model size.
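The decoupled design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `ArmPolicy`, the helper names, the layer sizes, and the additive gated fusion are all hypothetical choices made for illustration, and the weights are random placeholders rather than learned parameters. It only shows the general shape of the idea, where each arm has its own model and a gate computed from that arm's own features decides how much of the other arm's features to mix in.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, out_dim, in_dim):
    """Untrained placeholder parameters (a real model would learn these)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


class ArmPolicy:
    """Hypothetical per-arm model: own encoder, own gate, own action head."""

    def __init__(self, rng, obs_dim, feat_dim, act_dim):
        self.encoder = random_layer(rng, feat_dim, obs_dim)
        self.gate_layer = random_layer(rng, feat_dim, feat_dim)
        self.head = random_layer(rng, act_dim, feat_dim)

    def encode(self, obs):
        # Each arm encodes only its own observation (the decoupled part).
        return [math.tanh(v) for v in linear(*self.encoder, obs)]

    def gate(self, own_feat):
        # Assumption: interaction weights in (0, 1) derived from own features.
        return [1.0 / (1.0 + math.exp(-v))
                for v in linear(*self.gate_layer, own_feat)]

    def act(self, own_feat, other_feat):
        # Mix in the other arm's features, scaled by the gate, then predict.
        g = self.gate(own_feat)
        fused = [o + gi * p for o, gi, p in zip(own_feat, g, other_feat)]
        return linear(*self.head, fused)


def bimanual_step(left, right, obs_left, obs_right):
    """One control step: encode separately, then interact selectively."""
    feat_l = left.encode(obs_left)
    feat_r = right.encode(obs_right)
    return left.act(feat_l, feat_r), right.act(feat_r, feat_l)


rng = random.Random(0)
left = ArmPolicy(rng, obs_dim=4, feat_dim=8, act_dim=3)
right = ArmPolicy(rng, obs_dim=4, feat_dim=8, act_dim=3)
action_left, action_right = bimanual_step(
    left, right, [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1])
print(len(action_left), len(action_right))
```

In this sketch, a gate near zero leaves an arm acting on its own features alone, which corresponds to the uncoordinated case, while a gate near one lets the other arm's features influence the action, which corresponds to the coordinated case.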
Related papers
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z)
- Learning Bimanual Manipulation via Action Chunking and Inter-Arm Coordination with Transformers [4.119006369973485]
We focus on coordination and efficiency between both arms, particularly synchronized actions.
We propose a novel imitation learning architecture that predicts cooperative actions.
Our model demonstrated a high success rate for comparison and suggested a suitable architecture for the policy learning of bimanual manipulation.
arXiv Detail & Related papers (2025-03-18T05:20:34Z)
- An Interpretable Neural Control Network with Adaptable Online Learning for Sample Efficient Robot Locomotion Learning [7.6119527195998]
Sequential Motion Executor (SME) is a three-layer interpretable neural network.
Adaptable Gradient-weighting Online Learning (AGOL) algorithm prioritizes the update of the parameters with high relevance score.
SME-AGOL requires 40% fewer samples and receives 150% higher final reward/locomotion performance on a simulated hexapod robot.
arXiv Detail & Related papers (2025-01-18T08:37:33Z)
- DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation [78.60543357822957]
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics.
We introduce DexHandDiff, an interaction-aware diffusion planning framework for adaptive dexterous manipulation.
Our framework achieves 70.0% success on 30-degree door opening, 40.0% and 36.7% on pen and block half-side re-orientation respectively, and 46.7% on hammer nail half drive.
arXiv Detail & Related papers (2024-11-27T18:03:26Z)
- PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation [68.17081518640934]
We propose a PrImitive-driVen waypOinT-aware world model for Robotic manipulation (PIVOT-R).
PIVOT-R consists of a Waypoint-aware World Model (WAWM) and a lightweight action prediction module.
Our PIVOT-R outperforms state-of-the-art open-source models on the SeaWave benchmark, achieving an average relative improvement of 19.45% across four levels of instruction tasks.
arXiv Detail & Related papers (2024-10-14T11:30:18Z)
- Dynamic Hand Gesture-Featured Human Motor Adaptation in Tool Delivery using Voice Recognition [5.13619372598999]
This paper introduces an innovative human-robot collaborative framework.
It seamlessly integrates hand gesture and dynamic movement recognition, voice recognition, and a switchable control adaptation strategy.
Experiment results have demonstrated superior performance in hand gesture recognition.
arXiv Detail & Related papers (2023-09-20T14:51:09Z)
- ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding [8.923830513183882]
We present the ATTACH dataset, which contains 51.6 hours of assembly with 95.2k annotated fine-grained actions monitored by three cameras.
In the ATTACH dataset, more than 68% of annotations overlap with other annotations, which is many times more than in related datasets.
We report the performance of state-of-the-art methods for action recognition as well as action detection on video and skeleton-sequence inputs.
arXiv Detail & Related papers (2023-04-17T12:31:24Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z)
- Learning to Centralize Dual-Arm Assembly [0.6091702876917281]
This work focuses on assembly with humanoid robots by providing a framework for dual-arm peg-in-hole manipulation.
We reduce modeling effort to a minimum by using sparse rewards only.
We demonstrate the effectiveness of the framework on dual-arm peg-in-hole and analyze sample efficiency and success rates for different action spaces.
arXiv Detail & Related papers (2021-10-08T09:59:12Z)
- Learning Multi-Arm Manipulation Through Collaborative Teleoperation [63.35924708783826]
Imitation Learning (IL) is a powerful paradigm to teach robots to perform manipulation tasks.
Many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk.
We present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms.
arXiv Detail & Related papers (2020-12-12T05:43:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.