Canonical mapping as a general-purpose object descriptor for robotic
manipulation
- URL: http://arxiv.org/abs/2303.01331v1
- Date: Thu, 2 Mar 2023 15:09:25 GMT
- Title: Canonical mapping as a general-purpose object descriptor for robotic
manipulation
- Authors: Benjamin Joffe and Konrad Ahlin
- Abstract summary: We propose using canonical mapping as a near-universal and flexible object descriptor.
We demonstrate that common object representations can be derived from a single pre-trained canonical mapping model.
We perform a multi-stage experiment using two robot arms that demonstrates the robustness of the perception approach.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perception is an essential part of robotic manipulation in a semi-structured
environment. Traditional approaches produce a narrow task-specific prediction
(e.g., an object's 6D pose) that cannot be adapted to other tasks and is
ill-suited for deformable objects. In this paper, we propose using canonical
mapping as a near-universal and flexible object descriptor. We demonstrate that
common object representations can be derived from a single pre-trained
canonical mapping model, which in turn can be generated with minimal manual
effort using an automated data generation and training pipeline. We perform a
multi-stage experiment using two robot arms that demonstrates the robustness of
the perception approach and the ways it can inform the manipulation strategy,
thus serving as a powerful foundation for general-purpose robotic manipulation.
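A canonical mapping associates each observed point on an object with a coordinate on a canonical model of that object's category. One representation the abstract says can be derived from such a mapping is a rigid 6D pose. As an illustrative sketch (not the authors' implementation; the pairing of observed and canonical points is assumed to come from a trained mapping model, and deformable objects would need more than a rigid fit), the pose can be recovered from these correspondences with a least-squares Kabsch/Procrustes alignment:

```python
import numpy as np

def pose_from_canonical_map(observed_pts, canonical_pts):
    """Estimate rotation R and translation t such that
    observed ≈ R @ canonical + t (both arrays are N x 3)."""
    obs_c = observed_pts - observed_pts.mean(axis=0)
    can_c = canonical_pts - canonical_pts.mean(axis=0)
    H = can_c.T @ obs_c                      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = observed_pts.mean(axis=0) - R @ canonical_pts.mean(axis=0)
    return R, t

# Usage: synthesize points under a known pose and verify recovery.
rng = np.random.default_rng(0)
canonical = rng.standard_normal((100, 3))
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.5])
observed = canonical @ R_true.T + t_true
R_est, t_est = pose_from_canonical_map(observed, canonical)
```

The same dense correspondences could equally feed other representations mentioned in the abstract (e.g., keypoints or part segments selected in canonical space), which is what makes the descriptor flexible.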
Related papers
- Hand-Object Interaction Pretraining from Videos [77.92637809322231]
We learn general robot manipulation priors from 3D hand-object interaction trajectories.
We do so by lifting both the human hand and the manipulated object into a shared 3D space and retargeting human motions to robot actions.
We empirically demonstrate that finetuning this policy, with both reinforcement learning (RL) and behavior cloning (BC), enables sample-efficient adaptation to downstream tasks and simultaneously improves robustness and generalizability compared to prior approaches.
arXiv Detail & Related papers (2024-09-12T17:59:07Z)
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation [65.46610405509338]
We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation.
Our framework, Track2Act, predicts tracks of how points in an image should move in future time-steps based on a goal.
We show that this approach of combining scalably learned track prediction with a residual policy enables diverse generalizable robot manipulation.
arXiv Detail & Related papers (2024-05-02T17:56:55Z)
- Learning Reusable Manipulation Strategies [86.07442931141634]
Humans demonstrate an impressive ability to acquire and generalize manipulation "tricks".
We present a framework that enables machines to acquire such manipulation skills through a single demonstration and self-play.
These learned mechanisms and samplers can be seamlessly integrated into standard task and motion planners.
arXiv Detail & Related papers (2023-11-06T17:35:42Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
- Programmatically Grounded, Compositionally Generalizable Robotic Manipulation [35.12811184353626]
We show that the conventional pretraining-finetuning pipeline for integrating semantic representations entangles the learning of domain-specific action information.
We propose a modular approach to better leverage pretrained models by exploiting the syntactic and semantic structures of language instructions.
Our model successfully disentangles action and perception, translating to improved zero-shot and compositional generalization in a variety of manipulation behaviors.
arXiv Detail & Related papers (2023-04-26T20:56:40Z)
- Zero-Shot Robot Manipulation from Passive Human Videos [59.193076151832145]
We develop a framework for extracting agent-agnostic action representations from human videos.
Our framework is based on predicting plausible human hand trajectories.
We deploy the trained model zero-shot for physical robot manipulation tasks.
arXiv Detail & Related papers (2023-02-03T21:39:52Z)
- PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training [25.50131893785007]
This work introduces a paradigm for pre-training a general purpose representation that can serve as a starting point for multiple tasks on a given robot.
We present the Perception-Action Causal Transformer (PACT), a generative transformer-based architecture that aims to build representations directly from robot data in a self-supervised fashion.
We show that finetuning small task-specific networks on top of the larger pretrained model results in significantly better performance compared to training a single model from scratch for all tasks simultaneously.
arXiv Detail & Related papers (2022-09-22T16:20:17Z)
- Manipulation of Articulated Objects using Dual-arm Robots via Answer Set Programming [10.316694915810947]
The manipulation of articulated objects is of primary importance in robotics, and can be considered one of the most complex manipulation tasks.
Traditionally, this problem has been tackled by developing ad-hoc approaches, which lack flexibility and portability.
We present a framework based on Answer Set Programming (ASP) for the automated manipulation of articulated objects in a robot control architecture.
arXiv Detail & Related papers (2020-10-02T18:50:39Z)
- Counterfactual Explanation and Causal Inference in Service of Robustness in Robot Control [15.104159722499366]
We propose an architecture for training generative models of counterfactual conditionals of the form, 'can we modify event A to cause B instead of C?'
In contrast to conventional control design approaches, where robustness is quantified in terms of the ability to reject noise, we explore the space of counterfactuals that might cause a certain requirement to be violated.
arXiv Detail & Related papers (2020-09-18T14:22:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.