Related papers: An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving

An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving

URL: http://arxiv.org/abs/2409.10554v1
Date: Mon, 2 Sep 2024 14:16:23 GMT
Title: An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving
Authors: Shawan Mohammed, Alp Argun, Nicolas Bonnotte, Gerd Ascheid,
Abstract summary: Research investigates the challenges Deep Reinforcement Learning (DRL) faces in Partially Observable Markov Decision Processes (POMDP) Our research adopts an offline-trained encoder to leverage large video datasets through self-supervised learning to learn generalizable representations. We show that the features learned by watching BDD100K driving videos can be directly transferred to achieve lane following and collision avoidance in CARLA simulator.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Our research investigates the challenges Deep Reinforcement Learning (DRL) faces in complex, Partially Observable Markov Decision Processes (POMDP) such as autonomous driving (AD), and proposes a solution for vision-based navigation in these environments. Partial observability reduces RL performance significantly, and this can be mitigated by augmenting sensor information and data fusion to reflect a more Markovian environment. However, this necessitates an increasingly complex perception module, whose training via RL is complicated due to inherent limitations. As the neural network architecture becomes more complex, the reward function's effectiveness as an error signal diminishes since the only source of supervision is the reward, which is often noisy, sparse, and delayed. Task-irrelevant elements in images, such as the sky or certain objects, pose additional complexities. Our research adopts an offline-trained encoder to leverage large video datasets through self-supervised learning to learn generalizable representations. Then, we train a head network on top of these representations through DRL to learn to control an ego vehicle in the CARLA AD simulator. This study presents a broad investigation of the impact of different learning schemes for offline-training of encoders on the performance of DRL agents in challenging AD tasks. Furthermore, we show that the features learned by watching BDD100K driving videos can be directly transferred to achieve lane following and collision avoidance in CARLA simulator, in a zero-shot learning fashion. Finally, we explore the impact of various architectural decisions for the RL networks to utilize the transferred representations efficiently. Therefore, in this work, we introduce and validate an optimal way for obtaining suitable representations of the environment, and transferring them to RL networks.

Related papers

From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning [59.88543114325153]
We introduce the Seeing-to-Experiencing framework to scale the capability of navigation foundation models with reinforcement learning.<n>S2E combines the strengths of pre-training on videos and post-training through RL.<n>We establish a comprehensive end-to-end evaluation benchmark, NavBench-GS, built on photorealistic 3DGS reconstructions of real-world scenes.
arXiv Detail & Related papers (2025-07-29T17:26:10Z)
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 [53.894789613838654]
We introduce SEED-Bench-R1, a benchmark designed to evaluate post-training methods for MLLMs in video understanding. It includes intricate real-world videos and complex everyday planning tasks in the format of multiple-choice questions. Using Qwen2-VL-Instruct-7B as a base model, we compare RL with supervised fine-tuning (SFT) Our detailed analysis reveals that RL enhances visual perception but often produces less coherent reasoning chains.
arXiv Detail & Related papers (2025-03-31T17:55:23Z)
SFO: Piloting VLM Feedback for Offline RL [1.3597551064547502]
Vision-Language Models (VLMs) are limited in their ability to solve control tasks due to their lack of action-conditioned training data. A key challenge in Reinforcement Learning from AI Feedback is determining how best to integrate VLM-derived signals into the learning process. We propose a simple yet effective approach--filtered and weighted behavior cloning--consistently outperforms more complex reinforcement learning from human feedback-based methods.
arXiv Detail & Related papers (2025-03-02T23:52:46Z)
Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning [3.8309622155866583]
We introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that transforms the classic 8-tile puzzle into a visual reinforcement learning task with images drawn from arbitrarily large datasets.<n>SPGym's key innovation lies in its ability to precisely control representation learning complexity through adjustable grid sizes and image pools.
arXiv Detail & Related papers (2024-10-17T21:23:03Z)
In-context Learning for Automated Driving Scenarios [15.325910109153616]
One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively. This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way.
arXiv Detail & Related papers (2024-05-07T09:04:52Z)
M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL) Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms. We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z)
Privileged to Predicted: Towards Sensorimotor Reinforcement Learning for Urban Driving [0.0]
Reinforcement Learning (RL) has the potential to surpass human performance in driving without needing any expert supervision. We propose vision-based deep learning models to approximate the privileged representations from sensor data. We shed light on the significance of the state representations in RL for autonomous driving and point to unresolved challenges for future research.
arXiv Detail & Related papers (2023-09-18T13:34:41Z)
Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset. We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information [49.06422815335159]
Learning to control an agent from data collected offline is vital for real-world applications of reinforcement learning (RL) This paper introduces offline RL benchmarks offering the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time dependent process.
arXiv Detail & Related papers (2022-10-31T22:12:48Z)
Architecting and Visualizing Deep Reinforcement Learning Models [77.34726150561087]
Deep Reinforcement Learning (DRL) is a theory that aims to teach computers how to communicate with each other. In this paper, we present a new Atari Pong game environment, a policy gradient based DRL model, a real-time network visualization, and an interactive display to help build intuition and awareness of the mechanics of DRL inference.
arXiv Detail & Related papers (2021-12-02T17:48:26Z)
Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel. On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions. We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces. Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
Deep Surrogate Q-Learning for Autonomous Driving [17.30342128504405]
We propose Surrogate Q-learning for learning lane-change behavior for autonomous driving. We show that the architecture leads to a novel replay sampling technique we call Scene-centric Experience Replay. We also show that our methods enhance real-world applicability of RL systems by learning policies on the real highD dataset.
arXiv Detail & Related papers (2020-10-21T19:49:06Z)
Decoupling Representation Learning from Reinforcement Learning [89.82834016009461]
We introduce an unsupervised learning task called Augmented Temporal Contrast (ATC) ATC trains a convolutional encoder to associate pairs of observations separated by a short time difference. In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL.
arXiv Detail & Related papers (2020-09-14T19:11:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.