Visionary: Vision architecture discovery for robot learning
- URL: http://arxiv.org/abs/2103.14633v1
- Date: Fri, 26 Mar 2021 17:51:43 GMT
- Title: Visionary: Vision architecture discovery for robot learning
- Authors: Iretiayo Akinola, Anelia Angelova, Yao Lu, Yevgen Chebotar, Dmitry
Kalashnikov, Jacob Varley, Julian Ibarz, Michael S. Ryoo
- Abstract summary: We propose a vision-based architecture search algorithm for robot manipulation learning, which discovers interactions between low dimension action inputs and high dimensional visual inputs.
Our approach automatically designs architectures while training on the task - discovering novel ways of combining and attending image feature representations with actions as well as features from previous layers.
- Score: 58.67846907923373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a vision-based architecture search algorithm for robot
manipulation learning, which discovers interactions between low dimension
action inputs and high dimensional visual inputs. Our approach automatically
designs architectures while training on the task - discovering novel ways of
combining and attending image feature representations with actions as well as
features from previous layers. The obtained new architectures demonstrate
better task success rates, in some cases with a large margin, compared to a
recent high performing baseline. Our real robot experiments also confirm that
it improves grasping performance by 6%. This is the first approach to
demonstrate a successful neural architecture search and attention connectivity
search for a real-robot task.
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z) - Enhancing Robot Learning through Learned Human-Attention Feature Maps [6.724036710994883]
We think that embedding auxiliary information about focus point into robot learning would enhance efficiency and robustness of the learning process.
In this paper, we propose a novel approach to model and emulate the human attention with an approximate prediction model.
We test our approach on two learning tasks - object detection and imitation learning.
arXiv Detail & Related papers (2023-08-29T14:23:44Z) - Exploring Visual Pre-training for Robot Manipulation: Datasets, Models
and Methods [14.780597545674157]
We investigate the effects of visual pre-training strategies on robot manipulation tasks from three fundamental perspectives.
We propose a visual pre-training scheme for robot manipulation termed Vi-PRoM, which combines self-supervised learning and supervised learning.
arXiv Detail & Related papers (2023-08-07T14:24:52Z) - Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for
Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge [80.88063189896718]
High architectural and computational complexity can result in poor suitability for deployment on embedded devices.
Fast GraspNeXt is a fast self-attention neural network architecture tailored for embedded multi-task learning in computer vision tasks for robotic grasping.
arXiv Detail & Related papers (2023-04-21T18:07:14Z) - Physics-Guided Hierarchical Reward Mechanism for Learning-Based Robotic
Grasping [10.424363966870775]
We develop a Physics-Guided Deep Reinforcement Learning with a Hierarchical Reward Mechanism to improve learning efficiency and generalizability for learning-based autonomous grasping.
Our method is validated in robotic grasping tasks with a 3-finger MICO robot arm.
arXiv Detail & Related papers (2022-05-26T18:01:56Z) - Graph Neural Networks for Relational Inductive Bias in Vision-based Deep
Reinforcement Learning of Robot Control [0.0]
This work introduces a neural network architecture that combines relational inductive bias and visual feedback to learn an efficient position control policy.
We derive a graph representation that models the robot's internal state with a low-dimensional description of the visual scene generated by an image encoding network.
We show the ability of the model to improve sample efficiency for a 6-DoF robot arm in a visually realistic 3D environment.
arXiv Detail & Related papers (2022-03-11T15:11:54Z) - Cognitive architecture aided by working-memory for self-supervised
multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and demonstrated to be suitable tools to address such a task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
arXiv Detail & Related papers (2021-03-16T13:50:24Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - SAPIEN: A SimulAted Part-based Interactive ENvironment [77.4739790629284]
SAPIEN is a realistic and physics-rich simulated environment that hosts a large-scale set for articulated objects.
We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition as well as demonstrate robotic interaction tasks.
arXiv Detail & Related papers (2020-03-19T00:11:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.