Curious Representation Learning for Embodied Intelligence
- URL: http://arxiv.org/abs/2105.01060v1
- Date: Mon, 3 May 2021 17:59:20 GMT
- Title: Curious Representation Learning for Embodied Intelligence
- Authors: Yilun Du, Chuang Gan, Phillip Isola
- Abstract summary: Self-supervised representation learning has achieved remarkable success in recent years.
Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn from environments.
We propose a framework, curious representation learning, which jointly learns a reinforcement learning policy and a visual representation model.
- Score: 81.21764276106924
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Self-supervised representation learning has achieved remarkable success in
recent years. By subverting the need for supervised labels, such approaches are
able to utilize the numerous unlabeled images that exist on the Internet and in
photographic datasets. Yet to build truly intelligent agents, we must construct
representation learning algorithms that can learn not only from datasets but
also learn from environments. An agent in a natural environment will not
typically be fed curated data. Instead, it must explore its environment to
acquire the data it will learn from. We propose a framework, curious
representation learning (CRL), which jointly learns a reinforcement learning
policy and a visual representation model. The policy is trained to maximize the
error of the representation learner, and in doing so is incentivized to explore
its environment. At the same time, the learned representation becomes stronger
and stronger as the policy feeds it ever harder data to learn from. Our learned
representations enable promising transfer to downstream navigation tasks,
performing better than or comparably to ImageNet pretraining without using any
supervision at all. In addition, despite being trained in simulation, our
learned representations can obtain interpretable results on real images.
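The core of the method is an adversarial coupling: a single scalar, the representation learner's loss, is minimized by the encoder and simultaneously paid out to the policy as intrinsic reward. Below is a minimal, runnable PyTorch sketch of that loop under stated assumptions; the tiny encoder, the brightness-jitter augmentation, and the random frames standing in for environment rollouts are illustrative placeholders, not the authors' released code.
```python
# Minimal sketch of the CRL loop described in the abstract. All names here
# (Encoder, augment, the random "rollout") are illustrative stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny stand-in for the visual representation model."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z1, z2, temperature=0.1):
    """SimCLR-style loss between two augmented views of the same frames."""
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

def augment(frames):
    """Crude augmentation stand-in: random per-image brightness jitter."""
    scale = 0.8 + 0.4 * torch.rand(frames.size(0), 1, 1, 1)
    return (frames * scale).clamp(0, 1)

encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

for episode in range(10):
    # Placeholder rollout: in CRL these frames come from the policy acting
    # in an embodied environment; random tensors keep the sketch runnable.
    frames = torch.rand(16, 3, 64, 64)
    loss = contrastive_loss(encoder(augment(frames)), encoder(augment(frames)))
    # Representation learner: minimize its own error on the collected frames.
    opt.zero_grad(); loss.backward(); opt.step()
    # Policy: rewarded by that same error, so it seeks views the encoder
    # has not yet mastered. The actual policy update is omitted here.
    intrinsic_reward = loss.detach().item()
```
In the full method, the policy update (omitted above) treats the detached representation loss as its reward signal, which is precisely what drives the agent toward ever harder observations.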
Related papers
- Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control [73.6361029556484]
Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs.
We consider pre-trained text-to-image diffusion models, which are explicitly optimized to generate images from text prompts.
We show that Stable Control Representations enable learning policies that exhibit state-of-the-art performance on OVMM, a difficult open-vocabulary navigation benchmark.
arXiv Detail & Related papers (2024-05-09T15:39:54Z)
- Reinforcement Learning from Passive Data via Latent Intentions [86.4969514480008]
We show that passive data can still be used to learn features that accelerate downstream RL.
Our approach learns from passive data by modeling intentions.
Our experiments demonstrate the ability to learn from many forms of passive data, including cross-embodiment video data and YouTube videos.
arXiv Detail & Related papers (2023-04-10T17:59:05Z)
- Palm up: Playing in the Latent Manifold for Unsupervised Pretraining [31.92145741769497]
We propose an algorithm that exhibits exploratory behavior while utilizing large, diverse datasets.
Our key idea is to leverage deep generative models that are pretrained on static datasets and introduce a dynamic model in the latent space.
We then employ an unsupervised reinforcement learning algorithm to explore in this environment and perform unsupervised representation learning on the collected data.
arXiv Detail & Related papers (2022-10-19T22:26:12Z)
- Self-supervised Learning for Sonar Image Classification [6.1947705963945845]
Self-supervised learning has proved to be a powerful approach for learning image representations without the need for large labeled datasets.
We present pre-training and transfer learning results on real-life sonar image datasets.
arXiv Detail & Related papers (2022-04-20T08:58:35Z)
- The Unsurprising Effectiveness of Pre-Trained Vision Models for Control [33.30717429522186]
We study the role of pre-trained visual representations for control, and in particular representations trained on large-scale computer vision datasets.
We find that pre-trained visual representations can be competitive with, or even better than, ground-truth state representations for training control policies.
arXiv Detail & Related papers (2022-03-07T18:26:14Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where the task is not purely opaque: something is often known about the underlying system, such as physical laws its observations must obey.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation [3.6790362352712873]
We propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.
Our experiments demonstrate that the method is able to represent images in a low-dimensional space.
Unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets.
arXiv Detail & Related papers (2021-06-11T09:02:30Z)
- Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision [57.031588264841]
We leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps.
A simple dual-encoder architecture learns to align visual and language representations of the image and text pairs using a contrastive loss (a minimal sketch of this objective appears after this list).
We show that the scale of our corpus can make up for its noise and leads to state-of-the-art representations even with such a simple learning scheme.
arXiv Detail & Related papers (2021-02-11T10:08:12Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
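To make the dual-encoder objective from the "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" entry concrete, here is a minimal sketch of a symmetric contrastive loss over paired image and text embeddings. The function name, temperature value, and embedding size are assumptions for illustration; the paper's actual towers are large image and text encoders trained on the billion-pair corpus.
```python
# A minimal sketch of a dual-encoder contrastive objective: matched
# image/text pairs (the diagonal) are pulled together, and every other
# in-batch pairing serves as a negative.
import torch
import torch.nn.functional as F

def alignment_loss(image_emb, text_emb, temperature=0.05):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(logits.size(0))
    # Average the image-to-text and text-to-image retrieval losses.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings standing in for encoder outputs.
img = torch.randn(8, 256)  # image tower output for a batch of 8 pairs
txt = torch.randn(8, 256)  # text tower output for the matching alt-texts
print(alignment_loss(img, txt))
```
The symmetry is the design choice worth noting: averaging the two cross-entropies trains both retrieval directions at once, and every non-matching pair in the batch acts as a free negative.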