COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping
- URL: http://arxiv.org/abs/2502.08054v2
- Date: Fri, 14 Feb 2025 00:15:42 GMT
- Title: COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping
- Authors: Jun Yamada, Alexander L. Mitchell, Jack Collins, Ingmar Posner
- Abstract summary: Occluded robot grasping refers to situations in which the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions.
Traditional robot manipulation approaches struggle with the complexity of non-prehensile or bimanual strategies commonly used by humans.
We introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach which leverages two coordinated policies.
- Score: 56.907940167333656
- License:
- Abstract: This paper addresses the challenge of occluded robot grasping, i.e. grasping in situations where the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions. Traditional robot manipulation approaches struggle with the complexity of non-prehensile or bimanual strategies commonly used by humans in these circumstances. State-of-the-art reinforcement learning (RL) methods are unsuitable due to the inherent complexity of the task. In contrast, learning from demonstration requires collecting a significant number of expert demonstrations, which is often infeasible. Instead, inspired by human bimanual manipulation strategies, where two hands coordinate to stabilise and reorient objects, we focus on a bimanual robotic setup to tackle this challenge. In particular, we introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach which leverages two coordinated policies: a constraint policy trained using self-supervised datasets to generate stabilising poses and a grasping policy trained using RL that reorients and grasps the target object. A key contribution lies in value function-guided policy coordination. Specifically, during RL training for the grasping policy, the constraint policy's output is refined through gradients from a jointly trained value function, improving bimanual coordination and task performance. Lastly, COMBO-Grasp employs teacher-student policy distillation to effectively deploy point cloud-based policies in real-world environments. Empirical evaluations demonstrate that COMBO-Grasp significantly improves task success rates compared to competitive baseline approaches, with successful generalisation to unseen objects in both simulated and real-world environments.
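The value function-guided policy coordination described in the abstract can be illustrated with a minimal sketch: assuming a jointly trained critic that scores a stabilising pose in a given state, the constraint policy's proposed pose is nudged along the critic's gradient for a few steps. The function name, the critic interface `critic(state, pose)`, and the step count and learning rate below are illustrative assumptions, not the paper's implementation.

```python
import torch

def refine_constraint_pose(critic, state, pose, n_steps=5, lr=0.05):
    """Hypothetical sketch of value function-guided refinement:
    nudge the constraint policy's stabilising pose along gradients of a
    jointly trained critic. `critic(state, pose)` is assumed to return a
    differentiable scalar value estimate for the proposed pose."""
    pose = pose.clone().detach().requires_grad_(True)
    optimiser = torch.optim.Adam([pose], lr=lr)
    for _ in range(n_steps):
        optimiser.zero_grad()
        value = critic(state, pose)      # estimated task value of this pose
        (-value).mean().backward()       # ascend the value estimate
        optimiser.step()
    return pose.detach()
```

In the paper this refinement happens during RL training of the grasping policy; the sketch only illustrates the gradient flow from the value estimate back into the proposed stabilising pose.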
Related papers
- MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint [36.970138281579686]
Hierarchical reinforcement learning (HRL) uses a hierarchical framework that divides tasks into subgoals and completes them sequentially.
Current methods struggle to find suitable subgoals for ensuring a stable learning process.
We propose a general hierarchical reinforcement learning framework incorporating human feedback and dynamic distance constraints.
arXiv Detail & Related papers (2024-02-22T03:11:09Z) - Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability.
We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
arXiv Detail & Related papers (2024-02-04T15:54:03Z) - Coherent Soft Imitation Learning [17.345411907902932]
Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward.
This work derives an imitation method that captures the strengths of both BC and IRL.
arXiv Detail & Related papers (2023-05-25T21:54:22Z) - CRISP: Curriculum Inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning [25.84621883831624]
We present CRISP, a novel HRL algorithm that generates a curriculum of achievable subgoals for evolving lower-level primitives.
CRISP uses the lower-level primitive to periodically perform data relabeling on a handful of expert demonstrations.
We show that CRISP achieves impressive generalization in real-world scenarios.
arXiv Detail & Related papers (2023-04-07T08:22:50Z) - Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interaction with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z) - Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning [131.1852444489217]
This paper presents Object-aware REgularizatiOn (OREO), a technique that regularizes an imitation policy in an object-aware manner.
Our main idea is to encourage a policy to uniformly attend to all semantic objects, in order to prevent the policy from exploiting nuisance variables strongly correlated with expert actions.
arXiv Detail & Related papers (2021-10-27T01:56:23Z) - Composable Learning with Sparse Kernel Representations [110.19179439773578]
We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space.
We improve the sample complexity of this approach by imposing structure on the state-action value function through a normalized advantage function.
We demonstrate the performance of this algorithm on learning obstacle-avoidance policies in multiple simulations of a robot equipped with a laser scanner while navigating in a 2D environment.
arXiv Detail & Related papers (2021-03-26T13:58:23Z) - PLAS: Latent Action Space for Offline Reinforcement Learning [18.63424441772675]
The goal of offline reinforcement learning is to learn a policy from a fixed dataset, without further interactions with the environment.
Existing off-policy algorithms have limited performance on static datasets due to extrapolation errors from out-of-distribution actions.
We demonstrate that our method provides competitive performance consistently across various continuous control tasks and different types of datasets.
arXiv Detail & Related papers (2020-11-14T03:38:38Z) - Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine the critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
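As a rough illustration of the variance-control idea in the final related entry above (Discrete Action On-Policy Learning with Action-Value Critic), the sketch below uses a critic's per-action value estimates to form a baseline for a discrete policy-gradient update. The function name and the specific policy-weighted baseline are assumptions for illustration, not the paper's algorithm.

```python
import torch
import torch.nn.functional as F

def policy_gradient_with_critic_baseline(logits, q_values, actions, returns):
    """Hypothetical sketch: subtract a critic-based baseline (the
    policy-weighted mean of estimated action values) from the sampled
    return before a REINFORCE-style update, reducing gradient variance.
    logits:   (batch, n_actions) unnormalised policy scores
    q_values: (batch, n_actions) critic estimates of each action's value
    actions:  (batch,) sampled discrete actions (long tensor)
    returns:  (batch,) observed returns for the sampled actions
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    baseline = (probs * q_values).sum(dim=-1)         # V(s) ~ E_a[Q(s, a)]
    advantage = returns - baseline.detach()           # variance-reduced signal
    chosen_log_prob = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = -(chosen_log_prob * advantage).mean()
    return loss
```

Subtracting a policy-weighted mean of the critic's action values leaves the gradient unbiased in expectation while shrinking its variance, which is the gist of the variance-control idea summarised in that entry.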