Mutual Information Maximization for Robust Plannable Representations
- URL: http://arxiv.org/abs/2005.08114v1
- Date: Sat, 16 May 2020 21:58:47 GMT
- Title: Mutual Information Maximization for Robust Plannable Representations
- Authors: Yiming Ding, Ignasi Clavera, Pieter Abbeel
- Abstract summary: We present MIRO, an information-theoretic representation learning algorithm for model-based reinforcement learning.
We show that our approach is more robust than reconstruction objectives in the presence of distractors and cluttered scenes.
- Score: 82.83676853746742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extending the capabilities of robotics to complex, unstructured real-world environments requires developing better perception systems while maintaining low sample complexity. When dealing with high-dimensional state spaces, current methods are either model-free or model-based with reconstruction objectives. The sample inefficiency of the former is a major barrier to real-world application. The latter, while sample efficient, learn latent spaces that must reconstruct every single detail of the scene. In real environments, the task typically occupies only a small fraction of the scene, and reconstruction objectives suffer in such settings because they capture all of its unnecessary components. In this work, we present MIRO, an information-theoretic representation learning algorithm for model-based reinforcement learning. We design a latent space that maximizes mutual information with future observations while capturing all the information needed for planning. We show that our approach is more robust than reconstruction objectives in the presence of distractors and cluttered scenes.
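The abstract does not specify which mutual-information estimator MIRO uses. As a rough illustration of the general technique, the sketch below implements an InfoNCE-style contrastive lower bound between current latents and latents from a future time step; the function name, shapes, and PyTorch implementation are assumptions of this note, not the authors' code.

```python
# Hypothetical sketch of MI maximization between current and future latents
# via the InfoNCE contrastive bound (CPC-style); not the MIRO implementation.
import torch
import torch.nn.functional as F

def info_nce_loss(z_t: torch.Tensor, z_future: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE lower bound on I(z_t; z_future).

    z_t:      (B, D) latents at time t
    z_future: (B, D) latents at time t+k from the same trajectories;
              other batch elements act as negatives.
    """
    z_t = F.normalize(z_t, dim=-1)
    z_future = F.normalize(z_future, dim=-1)
    logits = z_t @ z_future.t() / temperature          # (B, B) similarities
    labels = torch.arange(z_t.size(0), device=z_t.device)
    # Diagonal entries are the positive (matching) pairs; minimizing this
    # cross-entropy maximizes the MI lower bound.
    return F.cross_entropy(logits, labels)
```

In a model-based setup, z_t and z_future would come from an encoder applied to raw observations, and this loss would be minimized jointly with a latent dynamics model used for planning.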
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Reusable Architecture Growth for Continual Stereo Matching [92.36221737921274]
We introduce a Reusable Architecture Growth (RAG) framework to learn new scenes continually in both supervised and self-supervised manners.
RAG maintains high reusability during growth by reusing previous units, while still obtaining good performance.
We also present a Scene Router module to adaptively select the scene-specific architecture path at inference.
arXiv Detail & Related papers (2024-03-30T13:24:58Z)
- Learning with a Mole: Transferable latent spatial representations for navigation without reconstruction [12.845774297648736]
In most end-to-end learning approaches the representation is latent and usually does not have a clearly defined interpretation.
In this work we propose to learn an actionable representation of the scene independently of the targeted downstream task.
The learned representation is optimized by a blind auxiliary agent trained to navigate with it over multiple short sub-episodes branching out from a waypoint.
arXiv Detail & Related papers (2023-06-06T16:51:43Z)
- Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
- Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation [19.840186443344]
We propose to use structured world models to incorporate inductive biases in the control loop to achieve sample-efficient exploration.
Our method generates free-play behavior that starts to interact with objects early on and develops more complex behavior over time.
arXiv Detail & Related papers (2022-06-22T22:08:50Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Temporal Predictive Coding For Model-Based Planning In Latent Space [80.99554006174093]
We present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time.
We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task.
arXiv Detail & Related papers (2021-06-14T04:31:15Z)
- On the Transfer of Disentangled Representations in Realistic Settings [44.367245337475445]
We introduce a new high-resolution dataset with 1M simulated images and over 1,800 annotated real-world images.
We propose new architectures in order to scale disentangled representation learning to realistic high-resolution settings.
arXiv Detail & Related papers (2020-10-27T16:15:24Z)
- Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS) [13.141761152863868]
We show that our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone.
Our target representation is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern search capabilities.
arXiv Detail & Related papers (2020-04-27T15:01:54Z)