Prototypical context-aware dynamics generalization for high-dimensional
model-based reinforcement learning
- URL: http://arxiv.org/abs/2211.12774v1
- Date: Wed, 23 Nov 2022 08:42:59 GMT
- Title: Prototypical context-aware dynamics generalization for high-dimensional
model-based reinforcement learning
- Authors: Junjie Wang, Yao Mu, Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng
Zhuang, Ping Luo, Bin Wang, Jianye Hao
- Abstract summary: We propose a Prototypical Context-Aware Dynamics (ProtoCAD) model, which captures local dynamics through a time-consistent latent context.
Compared with the recurrent-based model RSSM, ProtoCAD delivers 13.2% and 26.7% better mean and median performance across all dynamics generalization tasks.
- Score: 40.88574224514982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The latent world model provides a promising way to learn policies in a
compact latent space for tasks with high-dimensional observations; however, its
generalization across diverse environments with unseen dynamics remains
challenging. Although the recurrent structure utilized in current advances
helps to capture local dynamics, modeling only state transitions without an
explicit understanding of environmental context limits the generalization
ability of the dynamics model. To address this issue, we propose a Prototypical
Context-Aware Dynamics (ProtoCAD) model, which captures the local dynamics by
time-consistent latent context and enables dynamics generalization in
high-dimensional control tasks. ProtoCAD extracts useful contextual information
with the help of prototypes clustered over the batch and benefits model-based
RL in two ways: 1) it uses a temporally consistent prototypical regularizer
that encourages the prototype assignments produced for different time parts of
the same latent trajectory to be temporally consistent, instead of comparing
the features directly; 2) it builds a context representation that combines the
projection embedding of latent states with the aggregated prototypes, which
can significantly improve the dynamics generalization ability. Extensive
experiments show that ProtoCAD surpasses existing methods in terms of dynamics
generalization. Compared with the recurrent-based model RSSM, ProtoCAD delivers
13.2% and 26.7% better mean and median performance across all dynamics
generalization tasks.
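The abstract gives no equations, so the following is only a minimal sketch of the two ideas it describes, not the authors' implementation. It assumes a SwAV-style swapped-assignment loss with Sinkhorn-Knopp balancing for the temporally consistent regularizer, and a plain concatenation for the context representation. All function names, the temperature, and the epsilon value are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(row, temperature=0.1):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp((s - m) / temperature) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def scores(latents, prototypes):
    # Similarity of each latent (B x D) to each prototype (K x D).
    return [[dot(z, c) for c in prototypes] for z in latents]

def sinkhorn(score_rows, epsilon=0.5, iters=3):
    # Assumed Sinkhorn-Knopp step: soft assignments balanced over the batch.
    m = max(max(r) for r in score_rows)
    q = [[math.exp((s - m) / epsilon) for s in r] for r in score_rows]
    B, K = len(q), len(q[0])
    for _ in range(iters):
        # Each prototype (column) receives total mass 1/K.
        col = [sum(q[i][k] for i in range(B)) for k in range(K)]
        q = [[q[i][k] / (col[k] * K) for k in range(K)] for i in range(B)]
        # Each sample (row) receives total mass 1/B.
        row = [sum(r) for r in q]
        q = [[v / (row[i] * B) for v in q[i]] for i in range(B)]
    # Rescale so every row is a distribution over prototypes.
    return [[v * B for v in r] for r in q]

def cross_entropy(targets, probs):
    total = sum(-t * math.log(p + 1e-12)
                for t_row, p_row in zip(targets, probs)
                for t, p in zip(t_row, p_row))
    return total / len(targets)

def temporal_consistency_loss(z_early, z_late, prototypes, temperature=0.1):
    # z_early and z_late are latents from two time parts of the same trajectories.
    s_early, s_late = scores(z_early, prototypes), scores(z_late, prototypes)
    p_early = [softmax(r, temperature) for r in s_early]
    p_late = [softmax(r, temperature) for r in s_late]
    q_early, q_late = sinkhorn(s_early), sinkhorn(s_late)
    # Swapped prediction: each part predicts the other part's assignment,
    # so assignments (not raw features) are compared.
    return 0.5 * (cross_entropy(q_late, p_early) + cross_entropy(q_early, p_late))

def context_vector(projection, prototypes, temperature=0.1):
    # Concatenate the projection embedding with an assignment-weighted
    # mixture of prototypes (one assumed form of "aggregated prototypes").
    weights = softmax([dot(projection, c) for c in prototypes], temperature)
    dim = len(prototypes[0])
    aggregated = [sum(w * c[d] for w, c in zip(weights, prototypes))
                  for d in range(dim)]
    return list(projection) + aggregated

# Toy data: two prototypes, two trajectories, two time parts each.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
z_early = [[0.9, 0.1], [0.1, 0.9]]
z_late = [[0.8, 0.2], [0.2, 0.8]]
loss_consistent = temporal_consistency_loss(z_early, z_late, prototypes)
loss_mismatched = temporal_consistency_loss(z_early, z_late[::-1], prototypes)
context = context_vector(z_early[0], prototypes)
```

On this toy data the loss is lower when the two time parts of each trajectory fall near the same prototype than when the late parts are swapped between trajectories, which is the behavior the regularizer is meant to encourage.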
Related papers
- Persistent Topological Features in Large Language Models [0.6597195879147556]
We introduce persistence similarity, a new metric that quantifies the persistence and transformation of topological features.
Unlike traditional similarity measures, our approach captures the entire evolutionary trajectory of these features.
As a practical application, we leverage persistence similarity to identify and prune redundant layers.
arXiv Detail & Related papers (2024-10-14T19:46:23Z)
- Generalization of Auto-Regressive Hidden Markov Models to Non-Linear
Dynamics and Unit Quaternion Observation Space [2.055949720959582]
We propose two generalizations of the Auto-Regressive Hidden Markov Model.
Although this extension is proposed for the ARHMM, it can be easily extended to other latent variable models with AR dynamics in the observed space.
arXiv Detail & Related papers (2023-02-23T07:46:24Z)
- Causal Dynamics Learning for Task-Independent State Abstraction [61.707048209272884]
We introduce Causal Dynamics Learning for Task-Independent State Abstraction (CDL).
CDL learns a theoretically proven causal dynamics model that removes unnecessary dependencies between state variables and the action.
A state abstraction can then be derived from the learned dynamics.
arXiv Detail & Related papers (2022-06-27T17:02:53Z)
- Temporal Predictive Coding For Model-Based Planning In Latent Space [80.99554006174093]
We present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time.
We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task.
arXiv Detail & Related papers (2021-06-14T04:31:15Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in
Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic models that are capable of simultaneously exploiting both modular and temporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
- Context-aware Dynamics Model for Generalization in Model-Based
Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
arXiv Detail & Related papers (2020-05-14T08:10:54Z)
- Relational State-Space Model for Stochastic Multi-Object Systems [24.234120525358456]
This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model.
R-SSM makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects.
The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets.
arXiv Detail & Related papers (2020-01-13T03:45:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.