Representation Matters: Improving Perception and Exploration for
Robotics
- URL: http://arxiv.org/abs/2011.01758v2
- Date: Sun, 21 Mar 2021 18:31:54 GMT
- Title: Representation Matters: Improving Perception and Exploration for
Robotics
- Authors: Markus Wulfmeier, Arunkumar Byravan, Tim Hertweck, Irina Higgins,
Ankush Gupta, Tejas Kulkarni, Malcolm Reynolds, Denis Teplyashin, Roland
Hafner, Thomas Lampe, Martin Riedmiller
- Abstract summary: We systematically evaluate a number of common learnt and hand-engineered representations in the context of three robotics tasks.
The value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement.
- Score: 16.864646988990547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Projecting high-dimensional environment observations into lower-dimensional
structured representations can considerably improve data-efficiency for
reinforcement learning in domains with limited data such as robotics. Can a
single generally useful representation be found? In order to answer this
question, it is important to understand how the representation will be used by
the agent and what properties such a 'good' representation should have. In this
paper we systematically evaluate a number of common learnt and hand-engineered
representations in the context of three robotics tasks: lifting, stacking and
pushing of 3D blocks. The representations are evaluated in two use-cases: as
input to the agent, or as a source of auxiliary tasks. Furthermore, the value
of each representation is evaluated in terms of three properties:
dimensionality, observability and disentanglement. We can significantly improve
performance in both use-cases and demonstrate that some representations can
perform on par with simulator states when used as agent inputs. Finally, our results
challenge common intuitions by demonstrating that: 1) dimensionality strongly
matters for task generation, but is negligible for inputs, 2) observability of
task-relevant aspects mostly affects the input representation use-case, and 3)
disentanglement leads to better auxiliary tasks, but has only limited benefits
for input representations. This work serves as a step towards a more systematic
understanding of what makes a 'good' representation for control in robotics,
enabling practitioners to make more informed choices for developing new learned
or hand-engineered representations.
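To make the two use-cases concrete, here is a minimal sketch in PyTorch of a representation used either as the agent's input or as the target of an auxiliary prediction task. The module sizes, names, and loss weighting are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes; the paper's encoders, tasks, and agents differ.
OBS_DIM, REPR_DIM, ACT_DIM = 64 * 64 * 3, 32, 7

# Stand-in for any learnt or hand-engineered representation.
encoder = nn.Linear(OBS_DIM, REPR_DIM)

# Use-case 1: the representation is the agent's input.
policy_on_repr = nn.Sequential(
    nn.Linear(REPR_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))

# Use-case 2: the agent consumes raw observations, and the
# representation defines an auxiliary prediction task on a shared trunk.
trunk = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU())
policy_head = nn.Linear(256, ACT_DIM)
aux_head = nn.Linear(256, REPR_DIM)

obs = torch.randn(8, OBS_DIM)             # dummy batch of flattened observations
with torch.no_grad():
    target_repr = encoder(obs)            # auxiliary targets, not policy inputs

features = trunk(obs)
action = policy_head(features)
aux_loss = F.mse_loss(aux_head(features), target_repr)
# total_loss = rl_loss + aux_weight * aux_loss   # aux_weight is a tuning choice
```

Note that in the auxiliary use-case the representation never enters the policy directly; it only shapes the shared trunk through the extra loss term, which is consistent with the finding that properties such as dimensionality matter differently across the two settings.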
Related papers
- Adaptive Language-Guided Abstraction from Contrastive Explanations [53.48583372522492]
It is necessary to determine which features of the environment are relevant before determining how these features should be used to compute reward.
End-to-end methods for joint feature and reward learning often yield brittle reward functions that are sensitive to spurious state features.
This paper describes a method named ALGAE, which alternates between using language models to iteratively identify human-meaningful features and learning how those features should be used to compute reward.
arXiv Detail & Related papers (2024-09-12T16:51:58Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z)
- State Representations as Incentives for Reinforcement Learning Agents: A Sim2Real Analysis on Robotic Grasping [3.4777703321218225]
This work examines the effect of various representations in incentivizing the agent to solve a specific robotic task.
A continuum of state representations is defined, starting from hand-crafted numerical states to encoded image-based representations.
The effects of each representation on the ability of the agent to solve the task in simulation and the transferability of the learned policy to the real robot are examined.
arXiv Detail & Related papers (2023-09-21T11:41:22Z)
- Learning in Factored Domains with Information-Constrained Visual Representations [14.674830543204317]
We present a model of human factored representation learning based on an altered form of a $\beta$-Variational Auto-encoder used in a visual learning task.
Results demonstrate a trade-off in the informational complexity of model latent dimension spaces, between the speed of learning and the accuracy of reconstructions.
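For context, a $\beta$-VAE reweights the KL term of the standard VAE objective; raising $\beta$ tightens the information bottleneck on the latent space, which is one way to realize the trade-off between learning speed and reconstruction accuracy that this entry describes. A minimal sketch of the loss follows (the shapes and the value of beta are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """beta-VAE objective: reconstruction plus beta-weighted KL term.

    beta > 1 tightens the information bottleneck on the latents,
    trading reconstruction accuracy for a more compressed code."""
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian.
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + beta * kl
```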
arXiv Detail & Related papers (2023-03-30T16:22:10Z)
- Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations [25.33452947179541]
We show the effectiveness of object-aware representation learning techniques for robotic tasks.
Our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object-agnostic techniques.
arXiv Detail & Related papers (2022-05-12T19:48:11Z)
- Desiderata for Representation Learning: A Causal Perspective [104.3711759578494]
We take a causal perspective on representation learning, formalizing non-spuriousness and efficiency (in supervised representation learning) and disentanglement (in unsupervised representation learning).
This yields computable metrics that can be used to assess the degree to which representations satisfy the desiderata of interest and learn non-spurious and disentangled representations from single observational datasets.
arXiv Detail & Related papers (2021-09-08T17:33:54Z)
- A Tutorial on Learning Disentangled Representations in the Imaging Domain [13.320565017546985]
Disentangled representation learning has been proposed as an approach to learning general representations.
A good general representation can be readily fine-tuned for new target tasks using modest amounts of data.
Disentangled representations can offer model explainability and can help us understand the underlying causal relations of the factors of variation.
arXiv Detail & Related papers (2021-08-26T21:44:10Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
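As a rough illustration of what corrupting data "in the gradient domain" can look like (not the paper's exact procedure, which operates on latent clean data), noise can be injected into finite-difference gradients and the signal reintegrated, producing a corruption pattern quite different from pixel-space noise:

```python
import numpy as np

def corrupt_in_gradient_domain(img: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """Add Gaussian noise to horizontal finite-difference gradients,
    then reintegrate by cumulative summation along the same axis.

    Illustrative only: the paper corrupts latent clean data and may
    use a different gradient operator."""
    grad_x = np.diff(img, axis=1)                        # horizontal gradients
    grad_x += np.random.normal(0.0, sigma, grad_x.shape)
    # Reintegrate: the first column anchors the cumulative sum.
    return np.concatenate(
        [img[:, :1], img[:, :1] + np.cumsum(grad_x, axis=1)], axis=1)

noisy = corrupt_in_gradient_domain(np.random.rand(64, 64))
```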
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
- Fairness by Learning Orthogonal Disentangled Representations [50.82638766862974]
We propose a novel disentanglement approach to the invariant representation problem.
We enforce the meaningful representation to be agnostic to sensitive information by entropy maximization.
The proposed approach is evaluated on five publicly available datasets.
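One common way to realize "agnostic to sensitive information by entropy maximization" is to penalize the negative entropy of a probe that predicts the sensitive attribute from the target representation, pushing that prediction toward uniform. The sketch below shows such a penalty; the probe setup and weighting are illustrative assumptions, not necessarily the paper's formulation:

```python
import torch
import torch.nn.functional as F

def entropy_maximization_penalty(logits: torch.Tensor) -> torch.Tensor:
    """Negative entropy of the sensitive-attribute prediction.

    Minimizing this term maximizes entropy, driving the prediction
    toward uniform so the representation carries as little sensitive
    information as possible."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * F.log_softmax(logits, dim=-1)).sum(dim=-1)
    return -entropy.mean()

# Usage sketch: sensitive_logits come from a probe head applied to the
# target (non-sensitive) representation.
sensitive_logits = torch.randn(8, 2)
penalty = entropy_maximization_penalty(sensitive_logits)
```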
arXiv Detail & Related papers (2020-03-12T11:09:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.