Learning Dual-Arm Coordination for Grasping Large Flat Objects
- URL: http://arxiv.org/abs/2504.03500v1
- Date: Fri, 04 Apr 2025 14:55:46 GMT
- Title: Learning Dual-Arm Coordination for Grasping Large Flat Objects
- Authors: Yongliang Wang, Hamidreza Kasaei
- Abstract summary: We propose a model-free deep reinforcement learning framework to enable dual-arm coordination for grasping large flat objects. A CNN-based Proximal Policy Optimization (PPO) algorithm with shared Actor-Critic layers is employed to learn coordinated dual-arm grasp actions. Experimental results demonstrate that our policy can effectively grasp large flat objects without requiring additional maneuvers.
- Score: 14.847692568611087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grasping large flat objects, such as books or keyboards lying horizontally, presents significant challenges for single-arm robotic systems, often requiring extra actions like pushing objects against walls or moving them to the edge of a surface to facilitate grasping. In contrast, dual-arm manipulation, inspired by human dexterity, offers a more refined solution by directly coordinating both arms to lift and grasp the object without the need for complex repositioning. In this paper, we propose a model-free deep reinforcement learning (DRL) framework to enable dual-arm coordination for grasping large flat objects. We utilize a large-scale grasp pose detection model as a backbone to extract high-dimensional features from input images, which are then used as the state representation in a reinforcement learning (RL) model. A CNN-based Proximal Policy Optimization (PPO) algorithm with shared Actor-Critic layers is employed to learn coordinated dual-arm grasp actions. The system is trained and tested in Isaac Gym and deployed to real robots. Experimental results demonstrate that our policy can effectively grasp large flat objects without requiring additional maneuvers. Furthermore, the policy exhibits strong generalization capabilities, successfully handling unseen objects. Importantly, it can be directly transferred to real robots without fine-tuning, consistently outperforming baseline methods.
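The abstract names PPO as the learning algorithm. As a rough illustration of what that objective looks like, here is a minimal sketch of the standard PPO clipped surrogate loss in plain Python. It is not the authors' implementation: the function name `ppo_clipped_loss` and the default `clip_eps=0.2` are assumptions chosen for illustration, and the paper's CNN backbone, shared Actor-Critic layers, and reward design are not shown.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a quantity to minimize).

    All names and the clip_eps default are illustrative assumptions,
    not values taken from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the data.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)


if __name__ == "__main__":
    # Ratio of 2 with a positive advantage is clipped to 1 + clip_eps.
    print(ppo_clipped_loss([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to move the action probability far from the data-collecting policy in a single update, which is one reason PPO is a common choice for simulation-trained manipulation policies.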
Related papers
- FLEX: A Framework for Learning Robot-Agnostic Force-based Skills Involving Sustained Contact Object Manipulation [9.292150395779332]
We propose a novel framework for learning object-centric manipulation policies in force space. Our method simplifies the action space, reduces unnecessary exploration, and decreases simulation overhead. Our evaluations demonstrate that the method significantly outperforms baselines.
arXiv Detail & Related papers (2025-03-17T17:49:47Z) - Dynamic object goal pushing with mobile manipulators through model-free constrained reinforcement learning [9.305146484955296]
We develop a learning-based controller for a mobile manipulator to move an unknown object to a desired position and yaw orientation through a sequence of pushing actions. The proposed controller for the robotic arm and the mobile base motion is trained using a constrained Reinforcement Learning (RL) formulation. The learned policy achieves a success rate of 91.35% in simulation and at least 80% on hardware in challenging scenarios.
arXiv Detail & Related papers (2025-02-03T17:28:35Z) - CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects [1.3299507495084417]
Nonprehensile manipulation is essential for manipulating objects that are too thin, large, or otherwise ungraspable in the wild.
We propose a novel contact-based object representation and pretraining pipeline to tackle this.
arXiv Detail & Related papers (2024-03-16T01:47:53Z) - Twisting Lids Off with Two Hands [82.21668778600414]
We show how policies trained in simulation can be effectively and efficiently transferred to the real world.
Specifically, we consider the problem of twisting lids of various bottle-like objects with two hands.
This is the first sim-to-real RL system that enables such capabilities on bimanual multi-fingered hands.
arXiv Detail & Related papers (2024-03-04T18:59:30Z) - Modular Neural Network Policies for Learning In-Flight Object Catching with a Robot Hand-Arm System [55.94648383147838]
We present a modular framework designed to enable a robot hand-arm system to learn how to catch flying objects.
Our framework consists of five core modules, including (i) an object state estimator that learns object trajectory prediction, (ii) a catching pose quality network that learns to score and rank object poses for catching, (iii) a reaching control policy trained to move the robot hand to pre-catch poses, and (iv) a grasping control policy trained to perform soft catching motions.
We conduct extensive evaluations of our framework in simulation for each module and the integrated system, to demonstrate high success rates of in-flight
arXiv Detail & Related papers (2023-12-21T16:20:12Z) - Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models. Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning. Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - Decoupling Skill Learning from Robotic Control for Generalizable Object Manipulation [35.34044822433743]
Recent works in robotic manipulation have shown potential for tackling a range of tasks.
We conjecture that this is due to the high-dimensional action space for joint control.
In this paper, we take an alternative approach and separate the task of learning 'what to do' from 'how to do it'.
The whole-body robotic kinematic control is optimized to execute the high-dimensional joint motion to reach the goals in the workspace.
arXiv Detail & Related papers (2023-03-07T16:31:13Z) - Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling [96.31941517446859]
We propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving.
We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos.
In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
arXiv Detail & Related papers (2023-01-03T08:52:49Z) - V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z) - Learning Dexterous Grasping with Object-Centric Visual Affordances [86.49357517864937]
Dexterous robotic hands are appealing for their agility and human-like morphology.
We introduce an approach for learning dexterous grasping.
Our key idea is to embed an object-centric visual affordance model within a deep reinforcement learning loop.
arXiv Detail & Related papers (2020-09-03T04:00:40Z) - ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals.
Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments.
ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)