Learning Dynamics Models for Model Predictive Agents
- URL: http://arxiv.org/abs/2109.14311v1
- Date: Wed, 29 Sep 2021 09:50:25 GMT
- Title: Learning Dynamics Models for Model Predictive Agents
- Authors: Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel
Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa
- Abstract summary: Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour.
This paper sets out to disambiguate the role of different design choices for learning dynamics models, by comparing their performance to planning with a ground-truth model.
- Score: 28.063080817465934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-Based Reinforcement Learning involves learning a \textit{dynamics
model} from data, and then using this model to optimise behaviour, most often
with an online \textit{planner}. Much of the recent research along these lines
presents a particular set of design choices, involving problem definition,
model learning and planning. Given the multiple contributions, it is difficult
to evaluate the effects of each. This paper sets out to disambiguate the role
of different design choices for learning dynamics models, by comparing their
performance to planning with a ground-truth model -- the simulator. First, we
collect a rich dataset from the training sequence of a model-free agent on 5
domains of the DeepMind Control Suite. Second, we train feed-forward dynamics
models in a supervised fashion, and evaluate planner performance while varying
and analysing different model design choices, including ensembling,
stochasticity, multi-step training and timestep size. Besides the quantitative
analysis, we describe a set of qualitative findings, rules of thumb, and future
research directions for planning with learned dynamics models. Videos of the
results are available at https://sites.google.com/view/learning-better-models.
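The pipeline the abstract describes (collect transitions, train a feed-forward dynamics model in a supervised fashion, then plan online against the learned model) can be sketched as follows. This is a minimal illustration on a toy 1-D point mass, not the paper's setup: the environment, the linear least-squares model, and the random-shooting planner are all simplified stand-ins for the DeepMind Control Suite domains, neural-network models, and planners studied in the paper.

```python
import numpy as np

# Toy ground-truth "simulator": 1-D point mass, state = [pos, vel], action = force.
DT = 0.1

def true_step(state, action):
    pos, vel = state
    vel = vel + DT * action
    pos = pos + DT * vel
    return np.array([pos, vel])

# 1) Collect a dataset of transitions with a random policy.
rng = np.random.default_rng(0)
states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(2000):
    a = rng.uniform(-1.0, 1.0)
    s2 = true_step(s, a)
    states.append(s); actions.append([a]); next_states.append(s2)
    s = s2 if abs(s2[0]) < 5 else np.zeros(2)   # crude episode reset

X = np.hstack([np.array(states), np.array(actions)])   # model input: (s, a)
Y = np.array(next_states) - np.array(states)           # target: state delta

# 2) Supervised model learning: here a linear least-squares fit stands in
#    for the feed-forward network trained in the paper.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def model_step(state, action):
    return state + np.concatenate([state, [action]]) @ W

# 3) Online planning with the learned model: random-shooting MPC that
#    rolls out candidate action sequences and executes only the first action.
def plan(state, goal, horizon=10, n_candidates=256):
    best_a, best_cost = 0.0, np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=horizon)
        s, cost = state.copy(), 0.0
        for a in seq:
            s = model_step(s, a)
            cost += (s[0] - goal) ** 2            # running position cost
        if cost < best_cost:
            best_cost, best_a = cost, seq[0]
    return best_a

# Run MPC against the ground-truth simulator.
s, goal = np.zeros(2), 1.0
for _ in range(40):
    s = true_step(s, plan(s, goal))
```

Because the toy dynamics are linear in (state, action), the least-squares model here is essentially exact; the paper's interest is precisely in what happens when the learned model is imperfect and the planner exploits its errors.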
Related papers
- Learning-based Models for Vulnerability Detection: An Extensive Study [3.1317409221921144]
We extensively investigate two types of state-of-the-art learning-based approaches.
We experimentally demonstrate the superiority of sequence-based models and the limited abilities of graph-based models.
arXiv Detail & Related papers (2024-08-14T13:01:30Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
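Merging models in parameter space can be illustrated in its simplest form as weighted averaging of matching parameter tensors. This is only a sketch of the general idea: the paper proposes a more sophisticated dataless merge, and the dict-of-arrays model representation below is an illustrative assumption, not the paper's interface.

```python
import numpy as np

def merge_weights(models, weights=None):
    """Merge same-architecture models by (weighted) averaging each tensor.

    A minimal parameter-space fusion baseline; the paper's method is a
    more refined dataless merge, not a plain average.
    """
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return {name: sum(w * m[name] for w, m in zip(weights, models))
            for name in models[0]}

# Two toy "models" with identical parameter names and shapes.
model_a = {"W": np.eye(2), "b": np.array([1.0, 1.0])}
model_b = {"W": 3.0 * np.eye(2), "b": np.array([3.0, 3.0])}
fused = merge_weights([model_a, model_b])
```

Requiring identical architectures (same names and shapes) is what makes this "dataless": no training examples are needed, only the checkpoints themselves.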
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- An Empirical Study of Deep Learning Models for Vulnerability Detection [4.243592852049963]
We surveyed and reproduced 9 state-of-the-art deep learning models on 2 widely used vulnerability detection datasets.
We investigated model capabilities, training data, and model interpretation.
Our findings can help better understand model results, provide guidance on preparing training data, and improve the robustness of the models.
arXiv Detail & Related papers (2022-12-15T19:49:34Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structure information learning constraint to improve the quality of generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks [53.09649785009528]
In this paper, we explore a paradigm that does not require training to obtain new models.
Similar to the birth of CNN inspired by receptive fields in the biological visual system, we propose Model Disassembling and Assembling.
For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task.
arXiv Detail & Related papers (2022-03-25T05:27:28Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An Information-Theoretic Framework [20.28063653485698]
We provide a comprehensive framework for deep learning model design and information-theoretic analysis.
First, we develop and demonstrate a novel interactively-connected deep recurrent neural network (I$^2$DRNN) model.
Second, we provide an information-theoretic analysis to theoretically prove that our designed model can learn multi-scale spatio-temporal dependency in PSTA tasks.
arXiv Detail & Related papers (2020-09-14T10:05:14Z)
- Reinforcement Learning based dynamic weighing of Ensemble Models for Time Series Forecasting [0.8399688944263843]
It is known that if the models selected for data modelling are distinct (linear/non-linear, static/dynamic) and independent (minimally correlated), the accuracy of the predictions improves.
Various approaches suggested in the literature to weigh the ensemble models use a static set of weights.
To address this issue, a Reinforcement Learning (RL) approach is proposed to dynamically assign and update the weights of each model at different time instants.
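Dynamic ensemble weighting can be illustrated with a simple exponential-weights update that downweights models with larger recent error. This is a baseline sketch of the idea only: the paper instead trains an RL agent to assign the weights, and the two-forecaster setup below is an illustrative assumption.

```python
import numpy as np

def update_weights(weights, errors, eta=1.0):
    """Exponential-weights update: shrink each model's weight by its
    recent error, then renormalise. A simple dynamic-weighting baseline;
    the paper learns the weighting policy with RL instead.
    """
    w = weights * np.exp(-eta * np.asarray(errors))
    return w / w.sum()

# Two forecasters: one accurate, one systematically biased.
rng = np.random.default_rng(1)
weights = np.array([0.5, 0.5])
for t in range(50):
    truth = np.sin(0.1 * t)
    preds = np.array([truth + 0.05 * rng.standard_normal(),  # good model
                      truth + 1.0])                          # biased model
    ensemble = weights @ preds                 # weighted ensemble forecast
    weights = update_weights(weights, np.abs(preds - truth))
```

After a few steps the weights concentrate on the accurate model, which is the behaviour any dynamic weighting scheme, learned or hand-coded, is meant to achieve.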
arXiv Detail & Related papers (2020-08-20T10:40:42Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Learning Predictive Representations for Deformable Objects Using Contrastive Estimation [83.16948429592621]
We propose a new learning framework that jointly optimizes both the visual representation model and the dynamics model.
We show substantial improvements over standard model-based learning techniques across our rope and cloth manipulation suite.
arXiv Detail & Related papers (2020-03-11T17:55:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.