Related papers: Steadily Learn to Drive with Virtual Memory

URL: http://arxiv.org/abs/2102.08072v1
Date: Tue, 16 Feb 2021 10:46:52 GMT
Title: Steadily Learn to Drive with Virtual Memory
Authors: Yuhang Zhang, Yao Mu, Yujie Yang, Yang Guan, Shengbo Eben Li, Qi Sun and Jianyu Chen
Abstract summary: This paper proposes an algorithm called Learn to drive with Virtual Memory (LVM) to overcome the low data efficiency and training oscillation of current RL methods on high-dimensional tasks. LVM compresses the high-dimensional information into compact latent states and learns a latent dynamic model to summarize the agent's experience. The effectiveness of LVM is demonstrated by an image-input autonomous driving task.
Score: 11.67256846037979
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reinforcement learning has shown great potential in developing high-level autonomous driving. However, for high-dimensional tasks, current RL methods suffer from low data efficiency and oscillation in the training process. This paper proposes an algorithm called Learn to drive with Virtual Memory (LVM) to overcome these problems. LVM compresses the high-dimensional information into compact latent states and learns a latent dynamic model to summarize the agent's experience. Various imagined latent trajectories are generated as virtual memory by the latent dynamic model. The policy is learned by propagating gradient through the learned latent model with the imagined latent trajectories and thus leads to high data efficiency. Furthermore, a double critic structure is designed to reduce the oscillation during the training process. The effectiveness of LVM is demonstrated by an image-input autonomous driving task, in which LVM outperforms the existing method in terms of data efficiency, learning stability, and control performance.
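The pipeline described in the abstract (encode, imagine latent trajectories, improve the policy through the model, evaluate with two critics) can be illustrated with a toy sketch. This is not the authors' implementation: the linear encoder, linear latent model, reward, and all numbers are hypothetical; central finite differences stand in for backpropagation through the latent model; and combining the two critics with a minimum is an assumption borrowed from clipped double-Q methods, since the abstract does not say how the double critic is used.

```python
# Toy sketch only. Every function, matrix, and constant here is hypothetical
# and chosen for illustration; none is taken from the LVM paper.

def encode(observation, W):
    """Compress an observation into a compact latent state (linear encoder)."""
    return [sum(w * x for w, x in zip(row, observation)) for row in W]

def latent_step(z, a, A, B):
    """Assumed latent dynamic model: z' = A z + B a."""
    return [sum(A[i][j] * z[j] for j in range(len(z))) + B[i] * a
            for i in range(len(z))]

def policy(z, theta):
    """Linear policy acting on the latent state."""
    return sum(t * zi for t, zi in zip(theta, z))

def reward(z, a):
    """Hypothetical reward: stay near the latent origin with small actions."""
    return -(sum(zi * zi for zi in z) + 0.1 * a * a)

def imagine(z0, theta, A, B, horizon):
    """Roll the latent model forward; the transitions act as virtual memory."""
    trajectory, z = [], z0
    for _ in range(horizon):
        a = policy(z, theta)
        z_next = latent_step(z, a, A, B)
        trajectory.append((z, a, reward(z, a), z_next))
        z = z_next
    return trajectory

def imagined_return(z0, theta, A, B, horizon, gamma=0.99):
    """Discounted return of one imagined latent trajectory."""
    rollout = imagine(z0, theta, A, B, horizon)
    return sum((gamma ** t) * r for t, (_, _, r, _) in enumerate(rollout))

def double_critic_target(r, z_next, critic_1, critic_2, gamma=0.99):
    """Assumption: take the smaller of two critic estimates for the target."""
    return r + gamma * min(critic_1(z_next), critic_2(z_next))

def policy_gradient_step(z0, theta, A, B, horizon, lr=0.01, eps=1e-5):
    """Ascend the imagined return; central differences stand in for backprop."""
    grad = []
    for i in range(len(theta)):
        up, down = theta[:], theta[:]
        up[i] += eps
        down[i] -= eps
        diff = (imagined_return(z0, up, A, B, horizon)
                - imagined_return(z0, down, A, B, horizon))
        grad.append(diff / (2 * eps))
    return [t + lr * g for t, g in zip(theta, grad)]

W = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]  # 4-D observation -> 2-D latent
A = [[1.0, 0.1], [0.0, 1.0]]                      # latent transition matrix
B = [0.0, 0.1]                                    # effect of the action
z0 = encode([1.0, 0.0, 1.0, 0.0], W)
theta = [0.0, 0.0]

before = imagined_return(z0, theta, A, B, horizon=10)
for _ in range(50):
    theta = policy_gradient_step(z0, theta, A, B, horizon=10)
after = imagined_return(z0, theta, A, B, horizon=10)
print(before, after)
```

In a real system the encoder, latent model, policy, and critics would be neural networks trained with automatic differentiation; the sketch only shows how policy updates can be computed entirely from imagined latent rollouts rather than from new environment interaction.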

Related papers

Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies [66.83950068218033]
Scaling laws demonstrate that scaling model parameters and training data enhances learning performance. Despite its potential to improve performance, the integration of scaling laws into deep reinforcement learning has not been fully realized. This review addresses this gap by systematically analyzing scaling strategies in three dimensions: data, network, and training budget.
arXiv Detail & Related papers (2025-08-05T08:03:12Z)
Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining. This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning). We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking [13.898774643126174]
LeapVAD implements a human-attentional mechanism to identify and focus on critical traffic elements that influence driving decisions. The system consists of an Analytic Process (System-II) that accumulates driving experience through logical reasoning and a Heuristic Process (System-I) that refines this knowledge via fine-tuning and few-shot learning.
arXiv Detail & Related papers (2025-01-14T14:49:45Z)
Efficient Training of Large Vision Models via Advanced Automated Progressive Learning [96.71646528053651]
We present an advanced automated progressive learning (AutoProg) framework for efficient training of Large Vision Models (LVMs). We introduce AutoProg-Zero by enhancing the AutoProg framework with a novel zero-shot unfreezing schedule search. Experiments show that AutoProg accelerates ViT pre-training by up to 1.85x on ImageNet and accelerates fine-tuning of diffusion models by up to 2.86x, with comparable or even higher performance.
arXiv Detail & Related papers (2024-09-06T16:24:24Z)
Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control [1.5361702135159845]
This paper introduces a knowledge-informed model-based residual reinforcement learning framework. It integrates traffic expert knowledge into a virtual environment model, employing the Intelligent Driver Model (IDM) for basic dynamics and neural networks for residual dynamics. We propose a novel strategy that combines traditional control methods with residual RL, facilitating efficient learning and policy optimization without the need to learn from scratch.
arXiv Detail & Related papers (2024-08-30T16:16:57Z)
Simplified Temporal Consistency Reinforcement Learning [19.814047499837084]
We show that a simple representation learning approach relying on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. Our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4 times faster.
arXiv Detail & Related papers (2023-06-15T19:37:43Z)
Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task. The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance. We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data. We show that a neural network can model highly nonlinear behaviors accurately for large time horizons. In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data. We first train a scale-aware disparity network using both monocular real images and stereo virtual data. The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning [84.30765628008207]
We propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning. Our method outperforms the current state-of-the-art methods by a large margin on both benchmarks.
arXiv Detail & Related papers (2021-06-08T07:37:37Z)
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation [91.05073136215886]
"Actor-Learner Distillation" transfers learning progress from a large capacity learner model to a small capacity actor model. We demonstrate in several challenging memory environments that using Actor-Learner Distillation recovers the clear sample-efficiency gains of the transformer learner model.
arXiv Detail & Related papers (2021-04-04T17:56:34Z)
Learning hierarchical behavior and motion planning for autonomous driving [32.78069835190924]
We introduce hierarchical behavior and motion planning (HBMP) to explicitly model the behavior in a learning-based solution. We transform the HBMP problem by integrating a classical sampling-based motion planner. In addition, we propose a sharable representation for input sensory data across simulation platforms and real-world environments.
arXiv Detail & Related papers (2020-05-08T05:34:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.