Curious Exploration via Structured World Models Yields Zero-Shot Object
Manipulation
- URL: http://arxiv.org/abs/2206.11403v1
- Date: Wed, 22 Jun 2022 22:08:50 GMT
- Title: Curious Exploration via Structured World Models Yields Zero-Shot Object
Manipulation
- Authors: Cansu Sancaktar, Sebastian Blaes, Georg Martius
- Abstract summary: We propose to use structured world models to incorporate inductive biases in the control loop to achieve sample-efficient exploration.
Our method generates free-play behavior that starts to interact with objects early on and develops more complex behavior over time.
- Score: 19.840186443344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been a long-standing dream to design artificial agents that explore
their environment efficiently via intrinsic motivation, similar to how children
perform curious free play. Despite recent advances in intrinsically motivated
reinforcement learning (RL), sample-efficient exploration in object
manipulation scenarios remains a significant challenge as most of the relevant
information lies in the sparse agent-object and object-object interactions. In
this paper, we propose to use structured world models to incorporate relational
inductive biases in the control loop to achieve sample-efficient and
interaction-rich exploration in compositional multi-object environments. By
planning for future novelty inside structured world models, our method
generates free-play behavior that starts to interact with objects early on and
develops more complex behavior over time. Instead of using models only to
compute intrinsic rewards, as commonly done, our method showcases that the
self-reinforcing cycle between good models and good exploration also opens up
another avenue: zero-shot generalization to downstream tasks via model-based
planning. After the entirely intrinsic task-agnostic exploration phase, our
method solves challenging downstream tasks such as stacking, flipping, pick &
place, and throwing, generalizing to unseen numbers and arrangements of
objects without any additional training.
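To make the recipe concrete, the sketch below illustrates the pattern the abstract describes: a structured (relational) world model rolled out over candidate action sequences, with an intrinsic novelty score selecting the action to execute. The ensemble-disagreement novelty signal, the random-shooting planner, and all architecture details are illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of planning-for-novelty with a structured world model.
# The paper's exact model and planner are not specified in this listing;
# this uses an ensemble of simple relational dynamics models (hypothetical)
# and random-shooting action selection, with ensemble disagreement as novelty.
import numpy as np

rng = np.random.default_rng(0)

N_OBJ, D_STATE, D_ACT = 3, 4, 2        # objects, per-object state dim, action dim
ENSEMBLE, HORIZON, N_CAND = 5, 10, 64  # ensemble size, planning horizon, candidates


def relational_step(state, action, params):
    """One dynamics step: per-object update plus pairwise (relational) messages.

    state:  (N_OBJ, D_STATE) per-object features
    action: (D_ACT,) agent action, broadcast to every object
    params: weight matrices of one ensemble member
    """
    # Pairwise messages: each object aggregates the features of all others.
    messages = state.sum(axis=0, keepdims=True) - state          # (N_OBJ, D_STATE)
    inp = np.concatenate(
        [state, messages, np.tile(action, (N_OBJ, 1))], axis=1)  # (N_OBJ, 2D+A)
    return state + np.tanh(inp @ params["W"]) @ params["V"]      # residual update


def make_params():
    return {"W": rng.normal(0, 0.3, (2 * D_STATE + D_ACT, 16)),
            "V": rng.normal(0, 0.3, (16, D_STATE))}


ensemble = [make_params() for _ in range(ENSEMBLE)]  # stand-in for trained models


def novelty_of_plan(state, actions):
    """Intrinsic value of an action sequence: ensemble disagreement
    (variance across members) accumulated over the imagined rollout."""
    total = 0.0
    states = [state.copy() for _ in ensemble]
    for a in actions:
        preds = [relational_step(s, a, p) for s, p in zip(states, ensemble)]
        total += np.var(np.stack(preds), axis=0).sum()  # disagreement = novelty
        states = preds
    return total


def plan_for_novelty(state):
    """Random-shooting planner: sample candidate action sequences and
    execute the first action of the most novel one."""
    candidates = rng.uniform(-1, 1, (N_CAND, HORIZON, D_ACT))
    scores = [novelty_of_plan(state, seq) for seq in candidates]
    return candidates[int(np.argmax(scores))][0]


state = rng.normal(size=(N_OBJ, D_STATE))
print("chosen action:", plan_for_novelty(state))
```

Per the abstract, the same learned model later serves downstream tasks: at planning time, a task reward simply replaces the novelty score, which is what enables zero-shot generalization without additional training.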
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding
Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
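As a toy illustration of the hypothesize-simulate-act-update loop, the sketch below maintains a belief over two candidate articulation models (revolute vs. prismatic), acts, and updates the belief by comparing observations against each hypothesis's simulated prediction. The joint models, noise level, and Gaussian likelihood are hypothetical stand-ins for the paper's physics-based hypotheses.

```python
# Minimal sketch of an H-SAUR-style loop (hypothesize / simulate / act /
# update / repeat). The toy joint models below are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(1)

# Hypotheses: the hidden joint is either revolute or prismatic.
HYPOTHESES = ["revolute", "prismatic"]
belief = np.array([0.5, 0.5])  # uniform prior over articulation types

TRUE_JOINT = "prismatic"       # ground truth, unknown to the agent


def simulate(hypothesis, action):
    """Predicted handle displacement if `hypothesis` were true (toy model):
    a prismatic joint moves along the push direction, a revolute joint
    mostly converts the push into sideways motion."""
    if hypothesis == "prismatic":
        return np.array([action, 0.0])
    return np.array([0.3 * action, 0.8 * action])


def observe(action):
    """The real outcome of acting, with sensor noise."""
    return simulate(TRUE_JOINT, action) + rng.normal(0, 0.05, 2)


for step in range(5):
    action = rng.uniform(0.5, 1.0)          # act: an exploratory push
    obs = observe(action)
    # Update: Gaussian likelihood of the observation under each hypothesis.
    likelihood = np.array([
        np.exp(-np.sum((obs - simulate(h, action)) ** 2) / (2 * 0.05 ** 2))
        for h in HYPOTHESES
    ])
    belief = belief * likelihood
    belief /= belief.sum()
    print(f"step {step}: belief = {dict(zip(HYPOTHESES, belief.round(3)))}")
```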
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data.
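The core training signal is easy to sketch: decode a set of object slots back into features from a frozen self-supervised encoder and regress those features instead of pixels. In the sketch below, a single cross-attention step stands in for the full iterative slot-attention grouping, and all shapes and module sizes are illustrative assumptions.

```python
# Minimal sketch of the DINOSAUR-style training signal: reconstruct frozen
# self-supervised (e.g. DINO) features from object slots, not raw pixels.
import torch
import torch.nn as nn

B, N_TOK, D_FEAT, N_SLOT, D_SLOT = 2, 196, 768, 6, 64

# Stand-in for frozen DINO features of 14x14 ViT tokens: (B, N_TOK, D_FEAT).
with torch.no_grad():
    target_feats = torch.randn(B, N_TOK, D_FEAT)

grouper = nn.MultiheadAttention(D_SLOT, num_heads=1, batch_first=True)
to_slot_dim = nn.Linear(D_FEAT, D_SLOT)
slots_init = nn.Parameter(torch.randn(1, N_SLOT, D_SLOT))

# Broadcast-style decoder: each slot predicts features for every token plus
# an alpha logit used to mix the per-slot reconstructions.
decoder = nn.Sequential(nn.Linear(D_SLOT, 256), nn.ReLU(),
                        nn.Linear(256, D_FEAT + 1))

inputs = to_slot_dim(target_feats)                         # (B, N_TOK, D_SLOT)
slots, _ = grouper(slots_init.expand(B, -1, -1), inputs, inputs)

pos = torch.randn(1, N_TOK, D_SLOT)                        # positional code (stand-in)
per_slot = decoder(slots.unsqueeze(2) + pos.unsqueeze(1))  # (B, N_SLOT, N_TOK, D_FEAT+1)
feats, alpha = per_slot.split([D_FEAT, 1], dim=-1)
weights = alpha.softmax(dim=1)                             # slots compete per token
recon = (weights * feats).sum(dim=1)                       # (B, N_TOK, D_FEAT)

loss = nn.functional.mse_loss(recon, target_feats)         # feature reconstruction
loss.backward()
print("feature-reconstruction loss:", float(loss))
```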
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
- Online reinforcement learning with sparse rewards through an active
inference capsule [62.997667081978825]
This paper introduces an active inference agent that minimizes a novel objective, the free energy of the expected future.
Our model solves sparse-reward problems with very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
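A minimal sketch of the underlying policy-selection rule, assuming the standard discrete-state active inference formulation: each policy is scored by its expected free energy, the sum of risk (divergence of predicted outcomes from preferred outcomes, where the preference prior can be derived from a reward function as the summary notes) and ambiguity (expected observation uncertainty). The two-state toy world below is illustrative.

```python
# Minimal sketch of policy selection by expected free energy (active inference).
import numpy as np

# Prior preferences over 2 outcomes, e.g. derived from a reward function:
# the agent prefers outcome 0.
preferences = np.array([0.9, 0.1])

# Predicted hidden-state distribution after following each candidate policy.
q_states = {"stay": np.array([0.5, 0.5]),
            "move": np.array([0.9, 0.1])}

# Likelihood model: P(outcome | hidden state), one column per state.
A = np.array([[0.8, 0.2],
              [0.2, 0.8]])


def expected_free_energy(qs):
    q_obs = A @ qs                                       # predicted outcomes
    risk = np.sum(q_obs * np.log(q_obs / preferences))   # KL to preferences
    # Ambiguity: expected entropy of the likelihood under q(s).
    ambiguity = -np.sum(qs * np.sum(A * np.log(A), axis=0))
    return risk + ambiguity


G = {pi: expected_free_energy(qs) for pi, qs in q_states.items()}
print(G)                                 # lower expected free energy is better
print("chosen policy:", min(G, key=G.get))
```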
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Visuomotor Mechanical Search: Learning to Retrieve Target Objects in
Clutter [43.668395529368354]
We present a novel Deep RL procedure that combines i) teacher-aided exploration, ii) a critic with privileged information, and iii) mid-level representations.
Our approach trains faster and converges to more efficient uncovering solutions than baselines and ablations, and its uncovering policies improve the average graspability of the target object.
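The "critic with privileged information" ingredient is commonly realized as an asymmetric actor-critic, sketched below under assumed dimensions and inputs (not necessarily the paper's exact architecture): the actor consumes only deployable observations, while the training-time critic also receives simulator-only state such as the occluded target's pose.

```python
# Minimal sketch of a critic with privileged information (asymmetric
# actor-critic). Network sizes and inputs are illustrative assumptions.
import torch
import torch.nn as nn

D_OBS, D_PRIV, D_ACT = 32, 8, 4

actor = nn.Sequential(nn.Linear(D_OBS, 64), nn.ReLU(),
                      nn.Linear(64, D_ACT), nn.Tanh())
# Critic input = observation + privileged state (available only in training).
critic = nn.Sequential(nn.Linear(D_OBS + D_PRIV, 64), nn.ReLU(),
                       nn.Linear(64, 1))

obs = torch.randn(16, D_OBS)        # what the robot can actually perceive
priv = torch.randn(16, D_PRIV)      # e.g. ground-truth target pose in clutter

action = actor(obs)                               # deployable: obs only
value = critic(torch.cat([obs, priv], dim=-1))    # training-time value estimate
print(action.shape, value.shape)
```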
arXiv Detail & Related papers (2020-08-13T18:23:00Z)
- Learning Long-term Visual Dynamics with Region Proposal Interaction
Networks [75.06423516419862]
We build object representations that can capture inter-object and object-environment interactions over a long range.
Thanks to the simple yet effective object representation, our approach outperforms prior methods by a significant margin.
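The kind of object representation described here can be sketched as an interaction-network-style step: pairwise effects between object features are aggregated before each object's residual update, and long-horizon prediction iterates this step. Module sizes below are assumptions, and generic feature vectors stand in for the paper's region-proposal features.

```python
# Minimal sketch of an interaction-network-style dynamics step: pairwise
# "effects" between objects are summed per receiver before each update.
import torch
import torch.nn as nn

N_OBJ, D = 4, 16

relation_mlp = nn.Sequential(nn.Linear(2 * D, 32), nn.ReLU(), nn.Linear(32, D))
update_mlp = nn.Sequential(nn.Linear(2 * D, 32), nn.ReLU(), nn.Linear(32, D))


def step(objs):
    """One predicted step for per-object features objs: (N_OBJ, D)."""
    # All ordered pairs (i, j) with i != j: effect of object j on object i.
    idx_i, idx_j = torch.where(~torch.eye(N_OBJ, dtype=torch.bool))
    effects = relation_mlp(torch.cat([objs[idx_i], objs[idx_j]], dim=-1))
    # Sum incoming effects per receiver object.
    agg = torch.zeros_like(objs).index_add_(0, idx_i, effects)
    return objs + update_mlp(torch.cat([objs, agg], dim=-1))  # residual update


objs = torch.randn(N_OBJ, D)
for _ in range(20):          # long-horizon rollout by iterating the model
    objs = step(objs)
print(objs.shape)
```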
arXiv Detail & Related papers (2020-08-05T17:48:00Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
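One simple way to realize such goal-directed prediction is sketched below: condition the latent dynamics on the goal and supervise the decoder only on the future state's deviation from that goal, so goal-irrelevant parts of the scene need not be modeled. This objective and all dimensions are illustrative assumptions, not necessarily the paper's exact loss.

```python
# Minimal sketch of a goal-aware forward model: predict only the
# goal-relevant residual rather than the full next state.
import torch
import torch.nn as nn

D_STATE, D_ACT, D_LAT = 16, 4, 32

encoder = nn.Linear(2 * D_STATE, D_LAT)        # encodes (state, goal) jointly
dynamics = nn.Linear(D_LAT + D_ACT, D_LAT)     # latent forward model
decoder = nn.Linear(D_LAT, D_STATE)            # decodes the goal discrepancy

s, a, s_next, goal = (torch.randn(8, d)
                      for d in (D_STATE, D_ACT, D_STATE, D_STATE))

z = torch.relu(encoder(torch.cat([s, goal], dim=-1)))
z_next = torch.relu(dynamics(torch.cat([z, a], dim=-1)))

# Supervise only what matters for the task: the future state's deviation
# from the goal, not the entire state of the scene.
loss = nn.functional.mse_loss(decoder(z_next), s_next - goal)
loss.backward()
print(float(loss))
```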
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Learning intuitive physics and one-shot imitation using
state-action-prediction self-organizing maps [0.0]
Humans learn by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks.
We suggest a simple but effective unsupervised model which develops such characteristics.
We demonstrate its performance on a set of several related but distinct one-shot imitation tasks, which the agent flexibly solves in an active inference style.
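A minimal sketch of a state-action-prediction self-organizing map, assuming the classic SOM update rule: each map unit stores a (state, action, next-state) prototype, the winner is matched on the state-action part, and its stored next-state part serves as the prediction. Map size, learning rates, and the toy dynamics are illustrative.

```python
# Minimal sketch of a state-action-prediction self-organizing map (SOM).
import numpy as np

rng = np.random.default_rng(2)

D_S, D_A, N_UNITS = 2, 1, 25
proto = rng.normal(0, 0.1, (N_UNITS, D_S + D_A + D_S))   # SOM prototypes
grid = np.array([(i, j) for i in range(5) for j in range(5)])  # 5x5 topology


def train_step(s, a, s_next, lr=0.2, sigma=1.0):
    key = np.concatenate([s, a])
    winner = np.argmin(np.sum((proto[:, :D_S + D_A] - key) ** 2, axis=1))
    # Neighborhood function on the 2-D grid pulls nearby units along too.
    dist2 = np.sum((grid - grid[winner]) ** 2, axis=1)
    h = np.exp(-dist2 / (2 * sigma ** 2))[:, None]
    target = np.concatenate([s, a, s_next])
    proto[:] += lr * h * (target - proto)


def predict(s, a):
    key = np.concatenate([s, a])
    winner = np.argmin(np.sum((proto[:, :D_S + D_A] - key) ** 2, axis=1))
    return proto[winner, D_S + D_A:]          # stored next-state part


# Toy dynamics to learn: next state = state + a small push scaled by action.
for _ in range(2000):
    s = rng.uniform(-1, 1, D_S)
    a = rng.uniform(-1, 1, D_A)
    train_step(s, a, s + 0.1 * a * np.array([1.0, -1.0]))

s, a = np.array([0.2, -0.3]), np.array([0.5])
print("predicted:", predict(s, a),
      "true:", s + 0.1 * a * np.array([1.0, -1.0]))
```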
arXiv Detail & Related papers (2020-07-03T12:29:11Z)