Learning Sensorimotor Primitives of Sequential Manipulation Tasks from
Visual Demonstrations
- URL: http://arxiv.org/abs/2203.03797v1
- Date: Tue, 8 Mar 2022 01:36:48 GMT
- Title: Learning Sensorimotor Primitives of Sequential Manipulation Tasks from
Visual Demonstrations
- Authors: Junchi Liang, Bowen Wen, Kostas Bekris and Abdeslam Boularias
- Abstract summary: This paper describes a new neural network-based framework for simultaneously learning low-level and high-level policies.
A key feature of the proposed approach is that the policies are learned directly from raw videos of task demonstrations.
Empirical results on object manipulation tasks with a robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks.
- Score: 13.864448233719598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work aims to learn how to perform complex robot manipulation tasks that
are composed of several, consecutively executed low-level sub-tasks, given as
input a few visual demonstrations of the tasks performed by a person. The
sub-tasks consist of moving the robot's end-effector until it reaches a
sub-goal region in the task space, performing an action, and triggering the
next sub-task when a pre-condition is met. Most prior work in this domain has
been concerned with learning only low-level tasks, such as hitting a ball or
reaching an object and grasping it. This paper describes a new neural
network-based framework for simultaneously learning low-level policies as well
as high-level policies, such as deciding which object to pick next or where to
place it relative to other objects in the scene. A key feature of the proposed
approach is that the policies are learned directly from raw videos of task
demonstrations, without any manual annotation or post-processing of the data.
Empirical results on object manipulation tasks with a robotic arm show that the
proposed network can efficiently learn from real visual demonstrations to
perform the tasks, and outperforms popular imitation learning algorithms.
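As a rough illustration of the control structure described in the abstract: a low-level policy servos the end-effector toward a sub-goal region, a discrete action is performed once a pre-condition is met, and a high-level policy then selects the next sub-task (e.g. which object to pick, or where to place it). The sketch below is a minimal, hypothetical rendering of that loop; the names and interfaces (SubTask, run_task, observe, execute) are illustrative placeholders, not the paper's actual implementation, which learns both policy levels end-to-end from raw demonstration videos.

```python
# Minimal, hypothetical sketch of the hierarchical execution loop described in
# the abstract. All interfaces here (policies, observation dict, robot API) are
# illustrative placeholders; the paper learns both policy levels from raw videos.
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class SubTask:
    """One low-level primitive: reach a sub-goal region, then perform an action."""
    name: str
    low_level_policy: Callable[[dict], Sequence[float]]  # observation -> end-effector motion
    action: Callable[[dict], None]                       # e.g. close gripper, release object
    precondition: Callable[[dict], bool]                 # True once the sub-goal region is reached


def run_task(observe: Callable[[], dict],
             execute: Callable[[Sequence[float]], None],
             high_level_policy: Callable[[dict], Optional[SubTask]],
             max_steps: int = 1000) -> None:
    """Alternate between high-level sub-task selection and low-level control."""
    obs = observe()
    subtask = high_level_policy(obs)            # e.g. "pick the red block"
    for _ in range(max_steps):
        if subtask is None:                     # high-level policy signals task completion
            return
        if subtask.precondition(obs):           # pre-condition met: trigger the next sub-task
            subtask.action(obs)                 # perform the discrete action (grasp, place, ...)
            obs = observe()
            subtask = high_level_policy(obs)    # decide which object to pick next, or where to place it
            continue
        execute(subtask.low_level_policy(obs))  # servo the end-effector toward the sub-goal region
        obs = observe()
```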
Related papers
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- Look-Ahead Selective Plasticity for Continual Learning of Visual Tasks [9.82510084910641]
We propose a new mechanism that takes place at task boundaries, i.e., when one task finishes and another starts.
We evaluate the proposed methods on benchmark computer vision datasets including CIFAR-10 and TinyImageNet.
arXiv Detail & Related papers (2023-11-02T22:00:23Z)
- Few-Shot In-Context Imitation Learning via Implicit Graph Alignment [15.215659641228655]
We formulate imitation learning as a conditional alignment problem between graph representations of objects.
We show that this conditioning allows for in-context learning, where a robot can perform a task on a set of new objects immediately after the demonstrations.
arXiv Detail & Related papers (2023-10-18T18:26:01Z)
- Continual Robot Learning using Self-Supervised Task Inference [19.635428830237842]
We propose a self-supervised task inference approach to continually learn new tasks.
We use a behavior-matching self-supervised learning objective to train a novel Task Inference Network (TINet).
A multi-task policy is built on top of the TINet and trained with reinforcement learning to optimize performance over tasks.
arXiv Detail & Related papers (2023-09-10T09:32:35Z)
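As a rough sketch of the task-inference idea summarized in the entry above: a network embeds recent behavior into a task vector, and a multi-task policy conditions on that vector. The architecture and layer sizes below are illustrative assumptions; the paper trains the TINet with a behavior-matching self-supervised objective and the policy with reinforcement learning, neither of which is shown here.

```python
# Hypothetical sketch of task-inference-conditioned control, loosely following
# the TINet idea above. Architecture details are assumptions, not the paper's.
import torch
import torch.nn as nn


class TaskInferenceNet(nn.Module):
    """Embed a short trajectory of (observation, action) pairs into a task vector."""
    def __init__(self, obs_dim: int, act_dim: int, task_dim: int = 32):
        super().__init__()
        self.encoder = nn.GRU(obs_dim + act_dim, task_dim, batch_first=True)

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # traj: (batch, time, obs_dim + act_dim); final hidden state is the task embedding
        _, hidden = self.encoder(traj)
        return hidden[-1]


class ConditionedPolicy(nn.Module):
    """Multi-task policy mapping (observation, task embedding) to an action."""
    def __init__(self, obs_dim: int, act_dim: int, task_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + task_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim),
        )

    def forward(self, obs: torch.Tensor, task: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, task], dim=-1))


# Usage: infer the task from observed behavior, then act on a new observation.
tinet = TaskInferenceNet(obs_dim=10, act_dim=4)
policy = ConditionedPolicy(obs_dim=10, act_dim=4)
task_embedding = tinet(torch.randn(1, 20, 14))       # 20 steps of concatenated obs + action
action = policy(torch.randn(1, 10), task_embedding)  # action for the inferred task
```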
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
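For intuition, a subtask graph of the kind MTSGI infers can be represented as a mapping from each subtask to its preconditions; a subtask becomes eligible once all of its preconditions are completed. The example task, subtask names, and greedy execution loop below are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of a subtask graph with precondition-based eligibility.
from typing import Dict, Set

# precondition graph: subtask -> set of subtasks that must be completed first
SUBTASK_GRAPH: Dict[str, Set[str]] = {
    "pick_block": set(),
    "open_box": set(),
    "place_block_in_box": {"pick_block", "open_box"},
    "close_box": {"place_block_in_box"},
}


def eligible_subtasks(done: Set[str]) -> Set[str]:
    """Subtasks whose preconditions are all satisfied and that are not yet done."""
    return {s for s, pre in SUBTASK_GRAPH.items() if pre <= done and s not in done}


# Greedy execution following the graph structure (a learned policy would choose here):
completed: Set[str] = set()
while len(completed) < len(SUBTASK_GRAPH):
    nxt = sorted(eligible_subtasks(completed))[0]
    completed.add(nxt)
    print(f"execute {nxt}")
```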
- Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation [8.939008609565368]
In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning.
We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning.
arXiv Detail & Related papers (2022-05-17T13:15:07Z)
- Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations [25.33452947179541]
We show the effectiveness of object-aware representation learning techniques for robotic tasks.
Our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object-agnostic techniques.
arXiv Detail & Related papers (2022-05-12T19:48:11Z)
- Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation [55.31301153979621]
We tackle real-world long-horizon robot manipulation tasks through skill discovery.
We present a bottom-up approach to learning a library of reusable skills from unsegmented demonstrations.
Our method has shown superior performance over state-of-the-art imitation learning methods in multi-stage manipulation tasks.
arXiv Detail & Related papers (2021-09-28T16:18:54Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Modeling Long-horizon Tasks as Sequential Interaction Landscapes [75.5824586200507]
We present a deep learning network that learns dependencies and transitions across subtasks solely from a set of demonstration videos.
We show that these symbols can be learned and predicted directly from image observations.
We evaluate our framework on two long horizon tasks: (1) block stacking of puzzle pieces being executed by humans, and (2) a robot manipulation task involving pick and place of objects and sliding a cabinet door with a 7-DoF robot arm.
arXiv Detail & Related papers (2020-06-08T18:07:18Z)
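As a rough sketch of the idea in the last entry (subtask dependencies and transitions learned from demonstration videos, with symbols predicted from images): given a hypothetical per-frame classifier that maps an image to a symbolic subtask label, transition statistics can be accumulated over demonstrations. The interface and labels below are illustrative assumptions, not the paper's actual model, which learns the symbols and transitions with a deep network.

```python
# Hypothetical sketch: estimate subtask-to-subtask transitions from demo videos,
# assuming a per-frame predictor that maps an image to a symbolic subtask label.
from collections import Counter, defaultdict
from typing import Any, Callable, Dict, Iterable, List


def estimate_transitions(videos: Iterable[List[Any]],
                         predict_symbol: Callable[[Any], str]) -> Dict[str, Counter]:
    """Count symbol -> next-symbol transitions across all demonstration videos."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for frames in videos:
        symbols = [predict_symbol(f) for f in frames]  # per-frame symbolic labels
        # collapse runs of identical symbols, then count consecutive pairs
        collapsed = [s for i, s in enumerate(symbols) if i == 0 or s != symbols[i - 1]]
        for a, b in zip(collapsed, collapsed[1:]):
            transitions[a][b] += 1
    return transitions


# Usage (with a hypothetical frame-level classifier):
# model = estimate_transitions(demo_videos, predict_symbol=frame_classifier)
# model["reach_block"].most_common(1)   # most likely subtask following "reach_block"
```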