V-Max: A Reinforcement Learning Framework for Autonomous Driving
- URL: http://arxiv.org/abs/2503.08388v3
- Date: Thu, 17 Jul 2025 15:30:27 GMT
- Title: V-Max: A Reinforcement Learning Framework for Autonomous Driving
- Authors: Valentin Charraut, Waël Doulazmi, Thomas Tournaire, Thibault Buhet,
- Abstract summary: V-Max is an open research framework providing all the necessary tools to make Reinforcement Learning practical for Autonomous Driving. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning-based decision-making has the potential to enable generalizable Autonomous Driving (AD) policies, reducing the engineering overhead of rule-based approaches. Imitation Learning (IL) remains the dominant paradigm, benefiting from large-scale human demonstration datasets, but it suffers from inherent limitations such as distribution shift and imitation gaps. Reinforcement Learning (RL) presents a promising alternative, yet its adoption in AD remains limited due to the lack of standardized and efficient research frameworks. To this end, we introduce V-Max, an open research framework providing all the necessary tools to make RL practical for AD. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation. We extend it using ScenarioNet's approach, enabling the fast simulation of diverse AD datasets.
Related papers
- ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving [35.493857028919685]
We propose ReCogDrive, an autonomous driving system that integrates Vision-Language Models with a diffusion planner. In this paper, we use a large-scale driving question-answering dataset to train the VLMs, mitigating the domain discrepancy between generic content and real-world driving scenarios. In the second stage, we employ a diffusion-based planner to perform imitation learning, mapping representations from the latent language space to continuous driving actions.
arXiv Detail & Related papers (2025-06-09T03:14:04Z) - High-Performance Reinforcement Learning on Spot: Optimizing Simulation Parameters with Distributional Measures [8.437187555622167]
This work presents an overview of the technical details behind a high-performance reinforcement learning policy deployment with the Spot RL Researcher Development Kit for low-level motor access on Boston Dynamics Spot.
We deploy policies capable of over 5.2 m/s locomotion, more than triple Spot's default controller maximum speed, along with robustness to slippery surfaces, disturbance rejection, and overall agility previously unseen on Spot.
arXiv Detail & Related papers (2025-04-24T18:01:36Z) - Crossing the Reward Bridge: Expanding RL with Verifiable Rewards Across Diverse Domains [92.36624674516553]
Reinforcement learning with verifiable rewards (RLVR) has demonstrated significant success in enhancing the mathematical reasoning and coding performance of large language models (LLMs).
We investigate the effectiveness and scalability of RLVR across diverse real-world domains including medicine, chemistry, psychology, economics, and education.
We utilize a generative scoring technique that yields soft, model-based reward signals to overcome limitations posed by binary verifications.
arXiv Detail & Related papers (2025-03-31T08:22:49Z) - TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z) - Vintix: Action Model via In-Context Reinforcement Learning [72.65703565352769]
We present the first steps toward scaling ICRL by introducing a fixed, cross-domain model capable of learning behaviors through in-context reinforcement learning. Our results demonstrate that Algorithm Distillation, a framework designed to facilitate ICRL, offers a compelling and competitive alternative to expert distillation to construct versatile action models.
arXiv Detail & Related papers (2025-01-31T18:57:08Z) - Application of Multimodal Large Language Models in Autonomous Driving [1.8181868280594944]
We conduct an in-depth study on implementing the Multi-modal Large Language Model (MLLM). We address problems with the poor performance of MLLMs on Autonomous Driving. We then break down the AD decision-making process into scene understanding, prediction, and decision-making.
arXiv Detail & Related papers (2024-12-21T00:09:52Z) - Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes [7.028778922533688]
Average-reward Markov decision processes (MDPs) provide a foundational framework for sequential decision-making under uncertainty. We introduce Reward-Extended Differential (or RED) reinforcement learning: a novel RL framework that can be used to effectively and efficiently solve various learning objectives, or subtasks, simultaneously in the average-reward setting.
arXiv Detail & Related papers (2024-10-14T14:52:23Z) - Sample-efficient Imitative Multi-token Decision Transformer for Real-world Driving [18.34685506480288]
We propose the Sample-efficient Imitative Multi-token Decision Transformer (SimDT).
SimDT introduces multi-token prediction, an online imitative learning pipeline, and prioritized experience replay to sequence-modelling reinforcement learning.
Results exceed popular imitation and reinforcement learning algorithms in both open-loop and closed-loop settings on the Waymax benchmark.
arXiv Detail & Related papers (2024-06-18T14:27:14Z) - Machine Unlearning of Pre-trained Large Language Models [17.40601262379265]
This study investigates the concept of the 'right to be forgotten' within the context of large language models (LLMs).
We explore machine unlearning as a pivotal solution, with a focus on pre-trained models.
arXiv Detail & Related papers (2024-02-23T07:43:26Z) - Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems.
LLMs are intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning.
We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
arXiv Detail & Related papers (2023-11-28T03:13:09Z) - Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z) - When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z) - Multi-fidelity reinforcement learning framework for shape optimization [0.8258451067861933]
We introduce a controlled transfer learning framework that leverages a multi-fidelity simulation setting.
Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers.
Our results demonstrate this framework's applicability to other scientific DRL scenarios.
arXiv Detail & Related papers (2022-02-22T20:44:04Z) - Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting, from the storage, promising trajectories that solved prior tasks.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
Until today, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z) - Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelop value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)