Accelerating Grasp Exploration by Leveraging Learned Priors
- URL: http://arxiv.org/abs/2011.05661v1
- Date: Wed, 11 Nov 2020 09:42:56 GMT
- Title: Accelerating Grasp Exploration by Leveraging Learned Priors
- Authors: Han Yu Li, Michael Danielczuk, Ashwin Balakrishna, Vishal Satish, Ken Goldberg
- Abstract summary: The ability of robots to grasp novel objects has industry applications in e-commerce order fulfillment and home service.
We present a Thompson sampling algorithm that learns to grasp a given object with unknown geometry using online experience.
- Score: 24.94895421569869
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability of robots to grasp novel objects has industry applications in
e-commerce order fulfillment and home service. Data-driven grasping policies
have achieved success in learning general strategies for grasping arbitrary
objects. However, these approaches can fail to grasp objects which have complex
geometry or are significantly outside of the training distribution. We present
a Thompson sampling algorithm that learns to grasp a given object with unknown
geometry using online experience. The algorithm leverages learned priors from
the Dexterity Network robot grasp planner to guide grasp exploration and
provide probabilistic estimates of grasp success for each stable pose of the
novel object. We find that seeding the policy with the Dex-Net prior allows it
to more efficiently find robust grasps on these objects. Experiments suggest
that the best learned policy attains an average total reward 64.5% higher than
a greedy baseline and achieves within 5.7% of an oracle baseline when evaluated
over 300,000 training runs across a set of 3000 object poses.
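To make the mechanics concrete, below is a minimal Beta-Bernoulli Thompson sampling sketch in which each candidate grasp's Beta prior is seeded so that its mean matches a planner's predicted success probability, standing in for the Dex-Net prior over a single stable pose. The class name, prior-strength parameter, and toy numbers are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class ThompsonGraspExplorer:
    """Beta-Bernoulli Thompson sampling over a discrete set of candidate
    grasps for one stable pose. Priors are seeded from a grasp planner's
    predicted success probabilities (a stand-in for the Dex-Net prior)."""

    def __init__(self, prior_success_probs, prior_strength=10.0):
        p = np.asarray(prior_success_probs, dtype=float)
        # Seed Beta(alpha, beta) so its mean matches the planner's estimate.
        self.alpha = prior_strength * p
        self.beta = prior_strength * (1.0 - p)

    def select_grasp(self):
        # Sample a success probability per grasp, act greedily on the samples.
        samples = rng.beta(self.alpha, self.beta)
        return int(np.argmax(samples))

    def update(self, grasp_idx, success):
        # Bernoulli reward: conjugate update of the chosen grasp's posterior.
        self.alpha[grasp_idx] += success
        self.beta[grasp_idx] += 1 - success

# Toy usage: the planner believes grasp 2 is best; the true best is grasp 0.
true_probs = np.array([0.9, 0.3, 0.5])
planner_prior = np.array([0.5, 0.3, 0.7])
explorer = ThompsonGraspExplorer(planner_prior)
for _ in range(500):
    g = explorer.select_grasp()
    explorer.update(g, int(rng.random() < true_probs[g]))
print("posterior means:", explorer.alpha / (explorer.alpha + explorer.beta))
```

With an uninformative prior the explorer would need many more trials to separate the grasps; the seeded prior shifts early exploration toward grasps the planner already believes are robust, which is the acceleration effect the abstract describes.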
Related papers
- Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning [2.5352713493505785]
Reinforcement learning, in which algorithms teach artificial agents to interact with environments by maximising reward signals, has achieved significant success in recent years.
One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning.
We present a novel probabilistic curriculum learning algorithm to suggest goals for reinforcement learning agents in continuous control and navigation tasks.
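As a generic illustration of goal-suggesting curricula (not this paper's specific algorithm), the sketch below samples goals of intermediate difficulty, weighting candidates whose recent success rate sits near a target value; the function name, target, and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def curriculum_goal_probs(success_rates, target=0.5, temperature=0.1):
    """Weight candidate goals toward intermediate difficulty: goals whose
    estimated success rate is near `target` get the highest probability.
    (Generic sketch; the paper's actual criterion may differ.)"""
    scores = -np.abs(np.asarray(success_rates) - target) / temperature
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

# Candidate goals with per-goal success rates estimated from recent rollouts.
success_rates = [0.95, 0.55, 0.40, 0.05]
probs = curriculum_goal_probs(success_rates)
goal = rng.choice(len(success_rates), p=probs)
print(probs, "-> sampled goal", goal)
```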
arXiv Detail & Related papers (2025-04-02T08:15:16Z)
- Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration [54.8229698058649]
We study how unlabeled prior trajectory data can be leveraged to learn efficient exploration strategies.
Our method SUPE (Skills from Unlabeled Prior data for Exploration) demonstrates that a careful combination of these ideas compounds their benefits.
We empirically show that SUPE reliably outperforms prior strategies, successfully solving a suite of long-horizon, sparse-reward tasks.
arXiv Detail & Related papers (2024-10-23T17:58:45Z)
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Learning active tactile perception through belief-space control [21.708391958446274]
We propose a method that autonomously learns tactile exploration policies by developing a generative world model.
We evaluate our method on three simulated tasks where the goal is to estimate a desired object property.
We find that our method is able to discover policies that efficiently gather information about the desired property in an intuitive manner.
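One way to picture belief-space tactile exploration, as a generic sketch rather than the paper's generative world model, is a discrete Bayes filter that picks the exploratory action whose observation is expected to reduce posterior entropy the most; the hypothesis space and likelihood tables below are toy assumptions.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

# Belief over a discrete object property (e.g. 3 possible sizes).
belief = np.ones(3) / 3
# likelihood[a][h, o]: probability of tactile observation o under hypothesis
# h when executing exploratory action a (toy numbers, two actions).
likelihood = {
    0: np.array([[0.8, 0.2], [0.5, 0.5], [0.2, 0.8]]),
    1: np.array([[0.6, 0.4], [0.6, 0.4], [0.1, 0.9]]),
}

def expected_entropy(belief, L):
    # Expected posterior entropy after observing the outcome of an action.
    p_obs = belief @ L                      # marginal over observations
    post = (belief[:, None] * L) / p_obs    # posterior for each outcome
    return sum(p_obs[o] * entropy(post[:, o]) for o in range(L.shape[1]))

# Pick the action whose observation is expected to shrink uncertainty most.
best = min(likelihood, key=lambda a: expected_entropy(belief, likelihood[a]))
obs = 1                                     # simulated tactile reading
belief = belief * likelihood[best][:, obs]
belief /= belief.sum()
print("chose action", best, "updated belief:", belief)
```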
arXiv Detail & Related papers (2023-11-30T21:54:42Z)
- Probable Object Location (POLo) Score Estimation for Efficient Object Goal Navigation [15.623723522165731]
We introduce a novel framework centered around the Probable Object Location (POLo) score.
We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score.
Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods.
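The POLo score itself is defined in the paper; what can be sketched generically is POLoNet's role of amortizing an expensive score with a cheap learned model. Below, a placeholder function stands in for the expensive score and random-feature ridge regression stands in for the network; everything here is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

def expensive_score(x):
    # Stand-in for a computationally intensive score such as POLo;
    # the real definition lives in the paper.
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

# Cheap approximator: random Fourier features + ridge regression, playing
# the role POLoNet plays for the POLo score.
W = rng.normal(size=(2, 256))
b = rng.uniform(0, 2 * np.pi, size=256)

def feats(x):
    return np.cos(x @ W + b)

X = rng.uniform(-1, 1, size=(5000, 2))          # sampled agent states
y = expensive_score(X)                          # expensive labels, computed offline
A = feats(X)
w = np.linalg.solve(A.T @ A + 1e-3 * np.eye(256), A.T @ y)

X_test = rng.uniform(-1, 1, size=(1000, 2))
err = np.mean((feats(X_test) @ w - expensive_score(X_test)) ** 2)
print("approximation MSE:", err)
```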
arXiv Detail & Related papers (2023-11-14T08:45:32Z)
- GraspCaps: A Capsule Network Approach for Familiar 6DoF Object Grasping [6.72184534513047]
The paper presents GraspCaps, a novel architecture for generating per-point 6D grasp configurations for familiar objects.
In addition, the paper also presents a method for generating a large object-grasping dataset using simulated annealing.
The experimental results show that the overall object-grasping performance of the proposed approach is significantly better than the selected baseline.
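A minimal simulated-annealing loop of the kind that could refine grasp candidates for such a dataset is sketched below; the quality function, step size, and cooling schedule are placeholder assumptions, since the paper evaluates grasps in simulation.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def grasp_quality(pose):
    # Placeholder quality metric; in the paper's pipeline this would come
    # from a simulator evaluating a 6DoF grasp on the object.
    return -np.sum((pose - np.array([0.1, -0.2, 0.3])) ** 2)

def simulated_annealing(init_pose, steps=5000, t0=1.0, cooling=0.999):
    """Refine a grasp pose by simulated annealing: accept worse poses with
    a probability that shrinks as the temperature decays."""
    pose, q, t = init_pose.copy(), grasp_quality(init_pose), t0
    for _ in range(steps):
        cand = pose + rng.normal(scale=0.05, size=pose.shape)
        dq = grasp_quality(cand) - q
        if dq > 0 or rng.random() < math.exp(dq / t):
            pose, q = cand, q + dq
        t *= cooling
    return pose, q

pose, q = simulated_annealing(rng.normal(size=3))
print("refined pose:", pose, "quality:", q)
```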
arXiv Detail & Related papers (2022-10-07T15:32:34Z)
- Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning [108.08083976908195]
We show that policies learned by existing reinforcement learning algorithms can in fact be generalist.
We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects.
Interestingly, we find that multi-task learning with object point cloud representations not only generalizes better but even outperforms single-object specialist policies.
arXiv Detail & Related papers (2021-11-04T17:59:56Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve the distant goal-reaching task by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
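A bare-bones version of the E-step, sketched below under the assumption that states and transitions are available as a graph (e.g. from a replay buffer), is a breadth-first search for waypoints; the M-step, training a goal-conditioned policy to reach them, is left as a comment.

```python
from collections import deque

def plan_waypoints(graph, start, goal):
    """E-step sketch: breadth-first search for a shortest sequence of
    intermediate states (waypoints) from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Tiny state graph; edges would come from replay-buffer transitions.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["g"]}
print("waypoints:", plan_waypoints(graph, "s", "g"))
# M-step (not shown): train a goal-conditioned policy pi(a | s, w) to reach
# each successive waypoint w, e.g. with hindsight relabeling.
```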
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Exploratory Grasping: Asymptotically Optimal Algorithms for Grasping Challenging Polyhedral Objects [31.82394962213321]
We propose a novel problem setting, Exploratory Grasping, for efficiently discovering reliable grasps on an unknown polyhedral object.
We present an efficient bandit-style algorithm, Bandits for Online Rapid Grasp Exploration Strategy (BORGES).
BORGES can significantly outperform both general-purpose grasping pipelines and two other online learning algorithms.
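For intuition, here is a generic UCB-style bandit over candidate grasps for a single stable pose; BORGES's actual selection rule is specified in the paper, so treat the rule and constants below as illustrative assumptions.

```python
import math
import numpy as np

rng = np.random.default_rng(3)

def ucb_select(counts, means, t, c=2.0):
    """UCB-style arm selection over candidate grasps for the current stable
    pose (a generic bandit sketch, not BORGES's exact rule)."""
    ucb = means + np.sqrt(c * math.log(t + 1) / np.maximum(counts, 1))
    ucb[counts == 0] = np.inf          # try every grasp at least once
    return int(np.argmax(ucb))

true_probs = np.array([0.2, 0.8, 0.5])
counts, means = np.zeros(3), np.zeros(3)
for t in range(1000):
    g = ucb_select(counts, means, t)
    r = float(rng.random() < true_probs[g])
    counts[g] += 1
    means[g] += (r - means[g]) / counts[g]   # running-mean reward estimate
print("pull counts:", counts, "estimated means:", means)
```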
arXiv Detail & Related papers (2020-11-11T08:42:30Z)
- COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
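The core idea, propagating value backwards through pooled prior and new-task transitions, can be sketched as tabular fitted Q-iteration over a static buffer; the tiny MDP below is a made-up stand-in for the image-based setup in the paper.

```python
import numpy as np

# Offline transitions (s, a, r, s') pooled from prior data and new-task data.
transitions = [
    (0, 0, 0.0, 1),  # prior data: reach the drawer
    (1, 1, 0.0, 2),  # prior data: open the drawer
    (2, 0, 1.0, 3),  # new-task data: grasp the object inside
]

n_states, n_actions, gamma = 4, 2, 0.9
Q = np.zeros((n_states, n_actions))
# Fitted Q-iteration over the static buffer: dynamic programming stitches
# prior behaviors onto the new skill, propagating its reward backwards.
for _ in range(100):
    for s, a, r, s2 in transitions:
        Q[s, a] = r + gamma * Q[s2].max()
print(Q)   # reward at the end of the chain is reflected in earlier states
```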
arXiv Detail & Related papers (2020-10-27T17:57:29Z)
- Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals [8.98526174345299]
This paper introduces a notion of imaginary object goals.
For a given manipulation task, the object of interest is first trained to reach a desired target position on its own.
The object policy is then leveraged to build a predictive model of plausible object trajectories.
The proposed algorithm, Follow the Object, has been evaluated on 7 MuJoCo environments.
arXiv Detail & Related papers (2020-08-05T12:19:14Z)
- PackIt: A Virtual Environment for Geometric Planning [68.79816936618454]
PackIt is a virtual environment to evaluate and potentially learn the ability to do geometric planning.
We construct a set of challenging packing tasks using an evolutionary algorithm.
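A minimal evolutionary loop of the kind that could generate such tasks is sketched below, with a placeholder difficulty score standing in for whatever fitness PackIt actually optimizes.

```python
import numpy as np

rng = np.random.default_rng(4)

def difficulty(task):
    # Placeholder fitness; PackIt scores how hard a packing task is
    # (e.g. via a heuristic packer's failure rate). Toy surrogate here.
    return -np.var(task) + np.abs(task).sum()

def evolve(pop_size=32, dims=5, gens=50, mut=0.1):
    """Minimal (mu + lambda)-style evolutionary loop: keep the fitter half
    of the population, refill with mutated copies."""
    pop = rng.normal(size=(pop_size, dims))
    for _ in range(gens):
        fitness = np.array([difficulty(t) for t in pop])
        keep = pop[np.argsort(fitness)[pop_size // 2:]]   # fitter half
        children = keep + rng.normal(scale=mut, size=keep.shape)
        pop = np.concatenate([keep, children])
    return pop[np.argmax([difficulty(t) for t in pop])]

print("hardest generated task params:", evolve())
```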
arXiv Detail & Related papers (2020-07-21T22:51:17Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
The discovered rule includes both 'what to predict' (e.g. value functions) and 'how to learn from it', and is found by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.