Learning Bimanual Manipulation via Action Chunking and Inter-Arm Coordination with Transformers
- URL: http://arxiv.org/abs/2503.13916v1
- Date: Tue, 18 Mar 2025 05:20:34 GMT
- Title: Learning Bimanual Manipulation via Action Chunking and Inter-Arm Coordination with Transformers
- Authors: Tomohiro Motoda, Ryo Hanai, Ryoichi Nakajo, Masaki Murooka, Floris Erich, Yukiyasu Domae
- Abstract summary: We focus on coordination and efficiency between both arms, particularly synchronized actions. We propose a novel imitation learning architecture that predicts cooperative actions. Our model achieved a high success rate compared to baselines, suggesting a suitable architecture for the policy learning of bimanual manipulation.
- Score: 4.119006369973485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robots that operate autonomously in human living environments must be able to handle a variety of tasks flexibly. One crucial element is coordinated bimanual movement, which enables functions that are difficult to perform with one hand alone. In recent years, learning-based models that explore the possibilities of bimanual movement have been proposed. However, the robot's high degree of freedom makes control difficult to reason about, and the left and right arms must adjust their actions depending on the situation, making more dexterous tasks hard to realize. To address this issue, we focus on coordination and efficiency between the two arms, particularly for synchronized actions. We propose a novel imitation learning architecture that predicts cooperative actions: we differentiate the architecture for the two arms and add an intermediate encoder layer, the Inter-Arm Coordinated transformer Encoder (IACE), which facilitates synchronization and temporal alignment to ensure smooth, coordinated actions. To verify the effectiveness of our architecture, we perform distinctive bimanual tasks. The experimental results show that our model achieved a high success rate compared to baselines, suggesting a suitable architecture for the policy learning of bimanual manipulation.
Related papers
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z) - Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework [28.193344739081798]
Bimanual robotic manipulation is an emerging and critical topic in the robotics community. We propose a novel decoupled interaction framework that considers the characteristics of different tasks in bimanual manipulation. Our framework achieves outstanding performance, with a 23.5% boost over the SOTA method.
arXiv Detail & Related papers (2025-03-12T09:28:41Z) - You Only Teach Once: Learn One-Shot Bimanual Robotic Manipulation from Video Demonstrations [38.835807227433335]
Bimanual robotic manipulation is a long-standing challenge of embodied intelligence. We propose YOTO, which can extract and then inject patterns of bimanual actions from as few as a single binocular observation. YOTO achieves impressive performance in mimicking 5 intricate long-horizon bimanual tasks.
arXiv Detail & Related papers (2025-01-24T03:26:41Z) - DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation [78.60543357822957]
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics.
We introduce DexHandDiff, an interaction-aware diffusion planning framework for adaptive dexterous manipulation.
Our framework achieves an average of 70.7% success rate on goal adaptive dexterous tasks, highlighting its robustness and flexibility in contact-rich manipulation.
arXiv Detail & Related papers (2024-11-27T18:03:26Z) - HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation [7.01404330241523]
HYPERmotion is a framework that learns, selects and plans behaviors based on tasks in different scenarios.
We combine reinforcement learning with whole-body optimization to generate motion for 38 actuated joints.
Experiments in simulation and real-world show that learned motions can efficiently adapt to new tasks.
arXiv Detail & Related papers (2024-06-20T18:21:24Z) - Hierarchical Procedural Framework for Low-latency Robot-Assisted Hand-Object Interaction [45.256762954338704]
We propose a hierarchical procedural framework to enable robot-assisted hand-object interaction.
A low-level coordination hierarchy fine-tunes the robot's action by using the continuously updated 3D hand models.
A case study of ring-wearing tasks indicates the potential application of this work in assistive technologies such as healthcare and manufacturing.
arXiv Detail & Related papers (2024-05-29T21:20:16Z) - BiRP: Learning Robot Generalized Bimanual Coordination using Relative Parameterization Method on Human Demonstration [2.301921384458527]
We divide the main bimanual tasks in human daily activities into two types: leader-follower and synergistic coordination.
We propose a relative parameterization method to learn these types of coordination from human demonstration.
We believe that this easy-to-use bimanual learning-from-demonstration (LfD) method has the potential to be used as a data plugin for training large robot manipulation models.
arXiv Detail & Related papers (2023-07-12T05:58:59Z) - "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy [70.45420918526926]
We present LILAC, a framework for incorporating and adapting to natural language corrections online during execution.
Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot.
We show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users.
arXiv Detail & Related papers (2023-01-06T15:03:27Z) - Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-fingered robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z) - Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z) - Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects [13.847796949856457]
This letter describes an approach to performing the well-known Chinese cooking art of stir-fry on a bimanual robot system.
We define a canonical stir-fry movement, then propose a decoupled framework for learning deformable object manipulation from human demonstration.
By adding visual feedback, our framework can adjust the movements automatically to achieve the desired stir-fry effect.
arXiv Detail & Related papers (2022-05-12T08:58:30Z) - Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks [59.098560311521034]
We focus on how to transfer, to two different robotic platforms, the same kinematics modulation that humans adopt when manipulating delicate objects.
We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics.
We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
arXiv Detail & Related papers (2022-03-29T15:03:05Z) - Learning Agile Robotic Locomotion Skills by Imitating Animals [72.36395376558984]
Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics.
We present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals.
arXiv Detail & Related papers (2020-04-02T02:56:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.