Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning
- URL: http://arxiv.org/abs/2407.02217v1
- Date: Tue, 2 Jul 2024 12:32:57 GMT
- Title: Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning
- Authors: Zakariae El Asri, Olivier Sigaud, Nicolas Thome
- Abstract summary: Applying reinforcement learning to real-world applications requires addressing a trade-off between performance, sample efficiency, and inference time.
In this work, we demonstrate how to address this triple challenge by leveraging partial physical knowledge about the system dynamics.
- Score: 20.938465516348177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Applying reinforcement learning (RL) to real-world applications requires addressing a trade-off between asymptotic performance, sample efficiency, and inference time. In this work, we demonstrate how to address this triple challenge by leveraging partial physical knowledge about the system dynamics. Our approach involves learning a physics-informed model to boost sample efficiency and generating imaginary trajectories from this model to learn a model-free policy and Q-function. Furthermore, we propose a hybrid planning strategy, combining the learned policy and Q-function with the learned model to enhance time efficiency in planning. Through practical demonstrations, we illustrate that our method improves the compromise between sample efficiency, time efficiency, and performance over state-of-the-art methods.
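Read as a recipe, the abstract combines three pieces: a dynamics model that adds a learned correction to partial physical knowledge, Dyna-style imagined rollouts used to train a model-free policy and Q-function, and a hybrid planner that queries the policy, Q-function, and model together at decision time. The Python sketch below shows one plausible arrangement of these pieces; it is not the paper's code, and the residual parameterization, the shooting-style planner, and names such as PhysicsResidualModel, imagine_trajectories, hybrid_plan, and reward_fn are illustrative assumptions.
```python
# Minimal sketch, not the authors' implementation. It assumes the physics-informed
# model has the form "analytic physics step + learned residual", that the reward
# function is known, and that the policy and Q-function are trained elsewhere from
# the imagined transitions. All names are illustrative.
import torch
import torch.nn as nn


class PhysicsResidualModel(nn.Module):
    """Dynamics model: partial physical knowledge plus a learned correction."""

    def __init__(self, state_dim, action_dim, physics_step, hidden=128):
        super().__init__()
        self.physics_step = physics_step  # assumed analytic (partial) physics
        self.residual = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        prior = self.physics_step(state, action)              # physics prediction
        correction = self.residual(torch.cat([state, action], dim=-1))
        return prior + correction                             # physics-informed next state


def imagine_trajectories(model, policy, start_states, horizon):
    """Dyna-style imagination: roll the learned model forward under the policy.

    The returned transitions would be used to train the model-free policy and
    Q-function (e.g. with an off-policy actor-critic update)."""
    transitions = []
    state = start_states
    with torch.no_grad():
        for _ in range(horizon):
            action = policy(state)
            next_state = model(state, action)
            transitions.append((state, action, next_state))
            state = next_state
    return transitions


def hybrid_plan(model, policy, q_fn, reward_fn, state,
                horizon=5, n_candidates=64, noise_std=0.1):
    """Hybrid planning: policy-seeded candidate rollouts through the model,
    scored by accumulated reward plus a Q-function bootstrap at the horizon."""
    best_action, best_score = None, float("-inf")
    with torch.no_grad():
        for _ in range(n_candidates):
            s, total, first_action = state, 0.0, None
            for t in range(horizon):
                mean_action = policy(s)
                action = mean_action + noise_std * torch.randn_like(mean_action)
                if t == 0:
                    first_action = action
                total += float(reward_fn(s, action))          # assumed known reward
                s = model(s, action)
            total += float(q_fn(s, policy(s)))                # terminal value estimate
            if total > best_score:
                best_score, best_action = total, first_action
    return best_action
```
In such a setup the planner can stay cheap at inference time because the policy proposes only a handful of short candidate rollouts and the Q-function truncates them, which is one way the trade-off between performance, sample efficiency, and inference time described above could be balanced.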
Related papers
- Reward-free World Models for Online Imitation Learning [25.304836126280424]
We propose a novel approach to online imitation learning that leverages reward-free world models.
Our method learns environmental dynamics entirely in latent spaces without reconstruction, enabling efficient and accurate modeling.
We evaluate our method on a diverse set of benchmarks, including DMControl, MyoSuite, and ManiSkill2, demonstrating superior empirical performance compared to existing approaches.
arXiv Detail & Related papers (2024-10-17T23:13:32Z)
- Meta-Gradient Search Control: A Method for Improving the Efficiency of Dyna-style Planning [8.552540426753]
This paper introduces an online, meta-gradient algorithm that tunes a probability with which states are queried during Dyna-style planning.
Results indicate that our method improves the efficiency of the planning process.
arXiv Detail & Related papers (2024-06-27T22:24:46Z)
- Simplified Temporal Consistency Reinforcement Learning [19.814047499837084]
We show that a simple representation learning approach relying on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL.
Our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4 times faster.
arXiv Detail & Related papers (2023-06-15T19:37:43Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Model-Based Reinforcement Learning with SINDy [0.0]
We propose a novel method for discovering the governing non-linear dynamics of physical systems in reinforcement learning (RL).
We establish that this method is capable of discovering the underlying dynamics using significantly fewer trajectories than state-of-the-art model learning algorithms.
arXiv Detail & Related papers (2022-08-30T19:03:48Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- Planning from Pixels using Inverse Dynamics Models [44.16528631970381]
We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion.
We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
arXiv Detail & Related papers (2020-12-04T06:07:36Z)
- Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning [72.18725551199842]
We propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD).
It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.
We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.
arXiv Detail & Related papers (2020-10-23T03:22:01Z)
- Model-Augmented Actor-Critic: Backpropagating through Paths [81.86992776864729]
Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator.
We show how to make more effective use of the model by exploiting its differentiability.
arXiv Detail & Related papers (2020-05-16T19:18:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.