Seeing All the Angles: Learning Multiview Manipulation Policies for
Contact-Rich Tasks from Demonstrations
- URL: http://arxiv.org/abs/2104.13907v1
- Date: Wed, 28 Apr 2021 17:43:29 GMT
- Title: Seeing All the Angles: Learning Multiview Manipulation Policies for
Contact-Rich Tasks from Demonstrations
- Authors: Trevor Ablett, Yifan Zhai, Jonathan Kelly
- Abstract summary: A successful multiview policy could be deployed on a mobile manipulation platform.
We demonstrate that a multiview policy can be found through imitation learning by collecting data from a variety of viewpoints.
We show that learning from multiview data has little, if any, penalty to performance for a fixed-view task compared to learning with an equivalent amount of fixed-view data.
- Score: 7.51557557629519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned visuomotor policies have shown considerable success as an alternative
to traditional, hand-crafted frameworks for robotic manipulation tasks.
Surprisingly, the extension of these methods to the multiview domain is
relatively unexplored. A successful multiview policy could be deployed on a
mobile manipulation platform, allowing it to complete a task regardless of its
view of the scene. In this work, we demonstrate that a multiview policy can be
found through imitation learning by collecting data from a variety of
viewpoints. We illustrate the general applicability of the method by learning
to complete several challenging multi-stage and contact-rich tasks, from
numerous viewpoints, both in a simulated environment and on a real mobile
manipulation platform. Furthermore, we analyze our policies to determine the
benefits of learning from multiview data compared to learning with data from a
fixed perspective. We show that learning from multiview data has little, if
any, penalty to performance for a fixed-view task compared to learning with an
equivalent amount of fixed-view data. Finally, we examine the visual features
learned by the multiview and fixed-view policies. Our results indicate that
multiview policies implicitly learn to identify spatially correlated features
with a degree of view-invariance.
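The recipe the abstract describes, pooling demonstrations collected from many viewpoints and fitting a single visuomotor policy by imitation, is essentially behavior cloning on view-diverse data. Below is a minimal PyTorch-style sketch of that idea; the network, tensor shapes, and dataset fields are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: behavior cloning on demonstrations pooled from many camera
# viewpoints. Architecture, shapes, and field names are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class VisuomotorPolicy(nn.Module):
    """Maps an RGB observation directly to a low-level action."""
    def __init__(self, action_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(64, 128), nn.ReLU(),
                                  nn.Linear(128, action_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(obs))

# Demonstrations from different viewpoints are simply pooled; the policy never
# sees the camera pose, so it must learn features that work from any view.
images = torch.randn(256, 3, 64, 64)   # stand-in for multiview demo frames
actions = torch.randn(256, 4)          # stand-in for demonstrated actions
loader = DataLoader(TensorDataset(images, actions), batch_size=32, shuffle=True)

policy = VisuomotorPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-4)
for obs, act in loader:                # one behavior-cloning epoch
    loss = nn.functional.mse_loss(policy(obs), act)
    optim.zero_grad(); loss.backward(); optim.step()
```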
Related papers
- Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks whose annotations overlap little, if at all.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Learning from Semantic Alignment between Unpaired Multiviews for
Egocentric Video Recognition [23.031934558964473]
We propose Semantics-based Unpaired Multiview Learning (SUM-L) to tackle this unpaired multiview learning problem.
The key idea is to build cross-view pseudo-pairs and perform view-invariant alignment by leveraging the semantic information of videos.
Our method also outperforms multiple existing view-alignment methods under a more challenging scenario.
arXiv Detail & Related papers (2023-08-22T15:10:42Z)
- Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has achieved great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
- Multi-View Masked World Models for Visual Robotic Manipulation [132.97980128530017]
We train a multi-view masked autoencoder which reconstructs pixels of randomly masked viewpoints.
We demonstrate the effectiveness of our method in a range of scenarios.
We also show that the multi-view masked autoencoder trained with multiple randomized viewpoints enables training a policy with strong viewpoint randomization.
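As a rough illustration of the viewpoint-masking idea (an assumed toy setup, not the paper's model), one can hide an entire camera view at random and train an autoencoder to reconstruct its pixels from the views that remain:

```python
# Toy sketch of viewpoint-level masked reconstruction; all sizes are assumptions.
import torch
import torch.nn as nn

B, V, C, H, W = 16, 3, 3, 32, 32               # batch, views, channels, height, width
encoder = nn.Sequential(nn.Flatten(), nn.Linear(V * C * H * W, 512), nn.ReLU())
decoder = nn.Linear(512, V * C * H * W)
optim = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

frames = torch.rand(B, V, C, H, W)             # a batch of multiview frames
hidden = torch.randint(0, V, (B,))             # which view to mask per sample

masked = frames.clone()
masked[torch.arange(B), hidden] = 0.0          # zero out the masked viewpoint

recon = decoder(encoder(masked)).view_as(frames)
# Penalize error only on the hidden view, forcing the model to infer it from
# the visible viewpoints.
loss = nn.functional.mse_loss(recon[torch.arange(B), hidden],
                              frames[torch.arange(B), hidden])
optim.zero_grad(); loss.backward(); optim.step()
```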
arXiv Detail & Related papers (2023-02-05T15:37:02Z)
- Cross-view Graph Contrastive Representation Learning on Partially
Aligned Multi-view Data [52.491074276133325]
Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields.
We propose a new cross-view graph contrastive learning framework, which integrates multi-view information to align data and learn latent representations.
Experiments conducted on several real datasets demonstrate the effectiveness of the proposed method on the clustering and classification tasks.
arXiv Detail & Related papers (2022-11-08T09:19:32Z)
- Multi-View representation learning in Multi-Task Scene [4.509968166110557]
We propose a novel semi-supervised algorithm, termed Multi-Task Multi-View learning based on Common and Special Features (MTMVCSF).
An anti-noise multi-task multi-view algorithm called AN-MTMVCSF is also proposed, which adapts strongly to noisy labels.
The effectiveness of these algorithms is demonstrated by a series of well-designed experiments on both real-world and synthetic data.
arXiv Detail & Related papers (2022-01-15T11:26:28Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment to learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- ASM2TV: An Adaptive Semi-Supervised Multi-Task Multi-View Learning
Framework [7.64589466094347]
Human activity recognition (HAR) in the Internet of Things can be formalized as a multi-task multi-view learning problem.
We introduce a novel framework ASM2TV for semi-supervised multi-task multi-view learning.
arXiv Detail & Related papers (2021-05-18T16:15:32Z)
- Generalized Multi-view Shared Subspace Learning using View Bootstrapping [43.027427742165095]
A key objective in multi-view learning is to model the information common to multiple parallel views of a class of objects/events in order to improve downstream learning tasks.
We present a neural method based on multi-view correlation to capture the information shared across a large number of views by subsampling them in a view-agnostic manner during training.
Experiments on spoken word recognition, 3D object classification and pose-invariant face recognition demonstrate the robustness of view bootstrapping to model a large number of views.
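A hypothetical sketch of the view-bootstrapping step follows; the subsampling pattern comes from the abstract, while the encoder and agreement loss are stand-in assumptions for the paper's multi-view correlation objective.

```python
# Subsample k of N views per step, view-agnostically, and train a shared
# encoder so embeddings of the same object agree across the sampled views.
import torch
import torch.nn as nn

N_VIEWS, K, FEAT, DIM = 12, 4, 128, 64
encoder = nn.Sequential(nn.Linear(FEAT, 256), nn.ReLU(), nn.Linear(256, DIM))
optim = torch.optim.Adam(encoder.parameters(), lr=1e-3)

views = torch.randn(8, N_VIEWS, FEAT)       # batch of objects x views x features
idx = torch.randperm(N_VIEWS)[:K]           # view-agnostic subsampling
z = nn.functional.normalize(encoder(views[:, idx]), dim=-1)  # (8, K, DIM)
# Pull each object's sampled views toward their mean embedding (a simple
# stand-in for a correlation-maximizing objective).
loss = ((z - z.mean(dim=1, keepdim=True)) ** 2).sum(dim=-1).mean()
optim.zero_grad(); loss.backward(); optim.step()
```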
arXiv Detail & Related papers (2020-05-12T20:35:14Z)
- Exploit Clues from Views: Self-Supervised and Regularized Learning for
Multiview Object Recognition [66.87417785210772]
This work investigates the problem of multiview self-supervised learning (MV-SSL).
A novel surrogate task for self-supervised learning is proposed by pursuing an "object invariant" representation.
Experiments show that the recognition and retrieval results using view invariant prototype embedding (VISPE) outperform other self-supervised learning methods.
arXiv Detail & Related papers (2020-03-28T07:06:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.