Isolating and Leveraging Controllable and Noncontrollable Visual
Dynamics in World Models
- URL: http://arxiv.org/abs/2205.13817v1
- Date: Fri, 27 May 2022 08:07:39 GMT
- Title: Isolating and Leveraging Controllable and Noncontrollable Visual
Dynamics in World Models
- Authors: Minting Pan, Xiangming Zhu, Yunbo Wang, Xiaokang Yang
- Abstract summary: We present Iso-Dream, which improves the Dream-to-Control framework in two aspects.
First, by optimizing the inverse dynamics, we encourage the world model to learn controllable and noncontrollable sources of dynamics.
Second, we optimize the behavior of the agent on the decoupled latent imaginations of the world model.
- Score: 65.97707691164558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: World models learn the consequences of actions in vision-based interactive
systems. However, in practical scenarios such as autonomous driving, there
commonly exists noncontrollable dynamics independent of the action signals,
making it difficult to learn effective world models. To tackle this problem, we
present a novel reinforcement learning approach named Iso-Dream, which improves
the Dream-to-Control framework in two aspects. First, by optimizing the inverse
dynamics, we encourage the world model to learn controllable and
noncontrollable sources of spatiotemporal changes on isolated state transition
branches. Second, we optimize the behavior of the agent on the decoupled latent
imaginations of the world model. Specifically, to estimate state values, we
roll out the noncontrollable states into the future and associate them with the
current controllable state. In this way, the isolation of dynamics sources can
greatly benefit long-horizon decision-making of the agent, such as a
self-driving car that can avoid potential risks by anticipating the movement of
other vehicles. Experiments show that Iso-Dream is effective in decoupling the
mixed dynamics and remarkably outperforms existing approaches in a wide range
of visual control and prediction domains.
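As a rough illustration of the two mechanisms above, the following is a minimal PyTorch sketch: a latent transition model with isolated controllable and noncontrollable branches, an inverse-dynamics loss that pushes action-dependent information into the controllable branch, and a value estimate that rolls only the noncontrollable branch into the future while holding the current controllable state fixed. All module names, layer sizes, and the deterministic transitions are illustrative assumptions, not the paper's implementation; the actual model builds on the recurrent stochastic states of the Dream-to-Control (Dreamer) framework.

```python
import torch
import torch.nn as nn


class TwoBranchWorldModel(nn.Module):
    """Hypothetical latent dynamics split into an action-conditioned
    (controllable) branch and an action-free (noncontrollable) branch."""

    def __init__(self, state_dim=32, action_dim=4, hidden_dim=128):
        super().__init__()
        # Controllable branch: the transition depends on the action.
        self.ctrl_cell = nn.GRUCell(action_dim, state_dim)
        # Noncontrollable branch: the transition ignores the action.
        self.free_dynamics = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, state_dim),
        )
        # Inverse-dynamics head: predict a_t from consecutive controllable
        # states, so gradients push action information into that branch.
        self.inverse_dynamics = nn.Sequential(
            nn.Linear(2 * state_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, action_dim),
        )

    def step(self, s_ctrl, s_free, action):
        """One transition on each isolated branch."""
        return self.ctrl_cell(action, s_ctrl), self.free_dynamics(s_free)

    def inverse_loss(self, s_ctrl, next_ctrl, action):
        pred = self.inverse_dynamics(torch.cat([s_ctrl, next_ctrl], dim=-1))
        return ((pred - action) ** 2).mean()


def estimate_value(model, critic, s_ctrl, s_free, horizon=5):
    """Roll only the noncontrollable branch into the future and pair each
    future noncontrollable state with the *current* controllable state."""
    values = []
    for _ in range(horizon):
        s_free = model.free_dynamics(s_free)
        values.append(critic(torch.cat([s_ctrl, s_free], dim=-1)))
    return torch.stack(values).mean(dim=0)


# Hypothetical usage with random tensors:
model = TwoBranchWorldModel()
critic = nn.Sequential(nn.Linear(64, 64), nn.ELU(), nn.Linear(64, 1))
s_ctrl, s_free = torch.zeros(8, 32), torch.zeros(8, 32)
action = torch.randn(8, 4)
next_ctrl, next_free = model.step(s_ctrl, s_free, action)
loss = model.inverse_loss(s_ctrl, next_ctrl, action)
value = estimate_value(model, critic, s_ctrl, s_free)  # shape (8, 1)
```

Keeping the noncontrollable rollout action-free is the point of the design: future traffic, for instance, can be simulated without committing to a future action sequence.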
Related papers
- Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey [61.39993881402787]
World models and video generation are pivotal technologies in the domain of autonomous driving.
This paper investigates the relationship between these two technologies.
By analyzing the interplay between video generation and world models, this survey identifies critical challenges and future research directions.
arXiv Detail & Related papers (2024-11-05T08:58:35Z)
- Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models [60.87795376541144]
A world model is a neural network capable of predicting an agent's next state given past states and actions (a minimal sketch of this definition appears after this list).
During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations.
We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing.
arXiv Detail & Related papers (2024-09-25T06:48:25Z)
- Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving.
Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z)
- Simplifying Latent Dynamics with Softly State-Invariant World Models [10.722955763425228]
We introduce the Parsimonious Latent Space Model (PLSM), a world model that regularizes the latent dynamics to make the effect of the agent's actions more predictable.
We find that our regularization improves accuracy, generalization, and performance in downstream tasks.
arXiv Detail & Related papers (2024-01-31T13:52:11Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- GAIA-1: A Generative World Model for Autonomous Driving [9.578453700755318]
We introduce GAIA-1 ('Generative AI for Autonomy'), a generative world model that generates realistic driving scenarios.
Emerging properties from our model include learning high-level structures and scene dynamics, contextual awareness, generalization, and understanding of geometry.
arXiv Detail & Related papers (2023-09-29T09:20:37Z)
- Model-Based Reinforcement Learning with Isolated Imaginations [61.67183143982074]
We propose Iso-Dream++, a model-based reinforcement learning approach.
We perform policy optimization based on the decoupled latent imaginations.
This enables long-horizon visuomotor control tasks to benefit from isolating mixed dynamics sources in the wild.
arXiv Detail & Related papers (2023-03-27T02:55:56Z)
- Cycle-Consistent World Models for Domain Independent Latent Imagination [0.0]
High costs and risks make it hard to train autonomous cars in the real world.
We propose a novel model-based reinforcement learning approach called Cycle-Consistent World Models.
arXiv Detail & Related papers (2021-10-02T13:55:50Z)
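The covariate-shift entry above defines a world model as a network that predicts the agent's next state from past states and actions. Below is a minimal one-step sketch of that definition; all names and sizes are hypothetical and not taken from any paper listed here.

```python
import torch
import torch.nn as nn


class NextStatePredictor(nn.Module):
    """Hypothetical one-step world model: s_{t+1} ~ f(s_t, a_t)."""

    def __init__(self, state_dim=16, action_dim=2, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state, action):
        # Predict the next latent state from the current state and action.
        return self.net(torch.cat([state, action], dim=-1))


model = NextStatePredictor()
next_state = model(torch.zeros(1, 16), torch.zeros(1, 2))  # shape (1, 16)
```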