LORD: Large Models based Opposite Reward Design for Autonomous Driving
- URL: http://arxiv.org/abs/2403.18965v1
- Date: Wed, 27 Mar 2024 19:30:06 GMT
- Title: LORD: Large Models based Opposite Reward Design for Autonomous Driving
- Authors: Xin Ye, Feng Tao, Abhirup Mallik, Burhaneddin Yaman, Liu Ren
- Abstract summary: We introduce LORD, a novel large-models-based opposite reward design that works through undesired linguistic goals.
Our proposed framework efficiently leverages large pretrained models to achieve safe and enhanced autonomous driving.
- Score: 11.717821043996352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) based autonomous driving has emerged as a promising alternative to data-driven imitation learning approaches. However, crafting effective reward functions for RL poses challenges due to the complexity of defining and quantifying good driving behaviors across diverse scenarios. Recently, large pretrained models have gained significant attention as zero-shot reward models for tasks specified with desired linguistic goals. However, desired linguistic goals for autonomous driving such as "drive safely" are ambiguous and incomprehensible to pretrained models. On the other hand, undesired linguistic goals like "collision" are more concrete and tractable. In this work, we introduce LORD, a novel large-models-based opposite reward design that works through undesired linguistic goals to enable the efficient use of large pretrained models as zero-shot reward models. Through extensive experiments, our proposed framework efficiently leverages the power of large pretrained models to achieve safe and enhanced autonomous driving. Moreover, the proposed approach shows improved generalization capabilities, outperforming counterpart methods across diverse and challenging driving scenarios.
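The mechanism lends itself to a compact sketch: embed the current observation and the undesired goal text with a pretrained encoder, then negate their similarity to obtain the reward. The snippet below is a minimal illustration under that reading; the encoder interface, the embedding size, and the plain negation rule are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def opposite_reward(obs_embedding: np.ndarray, undesired_embedding: np.ndarray) -> float:
    """Reward is the negated similarity to an undesired goal like 'collision'.

    The closer the current observation looks to the undesired concept,
    the lower the reward. No hand-crafted notion of 'good driving' is needed.
    """
    return -cosine_similarity(obs_embedding, undesired_embedding)

# Toy usage with random stand-ins for encoder outputs; in practice both
# vectors would come from a pretrained image/text encoder such as CLIP.
rng = np.random.default_rng(0)
obs_emb = rng.normal(size=512)
collision_emb = rng.normal(size=512)
print(opposite_reward(obs_emb, collision_emb))
```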
Related papers
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models show increasingly strong performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly, by serving as reward models.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
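As a rough illustration of the "direct" route, an LLM can be prompted to pick one of a small set of driving actions. Everything below, including the prompt wording, the action vocabulary, and the `generate` callable, is hypothetical scaffolding rather than the paper's evaluation protocol.

```python
from typing import Callable

ACTIONS = ["accelerate", "brake", "keep_lane", "change_lane_left", "change_lane_right"]

def llm_policy(state_description: str, generate: Callable[[str], str]) -> str:
    """Query an LLM for an action given a textual state description.

    `generate` is any text-completion callable (API client, local model, ...).
    """
    prompt = (
        "You control an autonomous car.\n"
        f"Situation: {state_description}\n"
        f"Choose exactly one action from {ACTIONS} and reply with that word only."
    )
    reply = generate(prompt).strip().lower()
    # Fall back to a safe default if the model replies out of vocabulary.
    return reply if reply in ACTIONS else "brake"

# Toy usage with a stub generator standing in for a real LLM.
print(llm_policy("slow truck 20 m ahead in your lane", lambda p: "brake"))
```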
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- RACER: Rational Artificial Intelligence Car-following-model Enhanced by Reality [51.244807332133696]
This paper introduces RACER, a cutting-edge deep learning car-following model to predict Adaptive Cruise Control (ACC) driving behavior.
Unlike conventional models, RACER effectively integrates Rational Driving Constraints (RDCs), crucial tenets of actual driving.
RACER excels across key metrics such as acceleration, velocity, and spacing, registering zero violations of these constraints.
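The abstract does not spell the Rational Driving Constraints out, so the sketch below is only a guess at their flavor: a projection step that clips a raw acceleration prediction into a physically and behaviorally plausible range. The bounds and thresholds are invented for illustration and are not RACER's actual mechanism.

```python
def enforce_rational_constraints(accel: float, v: float, gap: float,
                                 a_min: float = -3.0, a_max: float = 2.0,
                                 dt: float = 0.1) -> float:
    """Project a raw model output onto a 'rational' acceleration.

    Illustrative constraints: bounded comfort range, no reversing,
    and no further acceleration into an already-closed gap.
    """
    a = max(a_min, min(a_max, accel))   # comfort/actuation bounds
    a = max(a, -v / dt)                 # speed must not go negative
    if gap < 2.0 and a > 0.0:           # too close: stop accelerating
        a = 0.0
    return a

print(enforce_rational_constraints(accel=3.5, v=0.2, gap=1.5))
```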
arXiv Detail & Related papers (2023-12-12T06:21:30Z)
- Beyond One Model Fits All: Ensemble Deep Learning for Autonomous Vehicles [16.398646583844286]
This study introduces three distinct neural network models corresponding to Mediated Perception, Behavior Reflex, and Direct Perception approaches.
Our architecture fuses information from the base, future latent vector prediction, and auxiliary task networks, using global routing commands to select appropriate action sub-networks.
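The routing mechanism reads like command-conditional branching, where a navigation command selects which action head processes the fused features. Below is a minimal PyTorch-style sketch under that assumption; the class name, command set, and layer sizes are made up.

```python
import torch
import torch.nn as nn

class CommandRoutedPolicy(nn.Module):
    """Select an action sub-network by global routing command (illustrative)."""

    def __init__(self, feat_dim: int = 256, action_dim: int = 2):
        super().__init__()
        self.heads = nn.ModuleDict({
            cmd: nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                               nn.Linear(64, action_dim))
            for cmd in ("follow", "left", "right", "straight")
        })

    def forward(self, fused_features: torch.Tensor, command: str) -> torch.Tensor:
        # Only the head matching the routing command produces the action.
        return self.heads[command](fused_features)

policy = CommandRoutedPolicy()
action = policy(torch.randn(1, 256), command="left")
print(action.shape)  # torch.Size([1, 2])
```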
arXiv Detail & Related papers (2023-12-10T04:40:02Z)
- Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present a physics-inspired approach named PED to address the limitations of these correlation-based measures.
arXiv Detail & Related papers (2023-08-29T07:15:57Z)
- Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z)
- Formulation and validation of a car-following model based on deep reinforcement learning [0.0]
We propose and validate a novel car-following model based on deep reinforcement learning.
Our model is trained to maximize externally given reward functions for the free-driving and car-following regimes.
The parameters of these reward functions resemble those of traditional models such as the Intelligent Driver Model.
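For reference, the Intelligent Driver Model mentioned here maps speed, gap, and approach rate to an acceleration. Below is the standard IDM in Python, with typical textbook parameter values rather than the paper's fitted ones.

```python
import math

def idm_acceleration(v: float, v_lead: float, gap: float,
                     v0: float = 30.0,   # desired speed [m/s]
                     T: float = 1.5,     # desired time headway [s]
                     a: float = 1.0,     # maximum acceleration [m/s^2]
                     b: float = 1.5,     # comfortable deceleration [m/s^2]
                     s0: float = 2.0,    # minimum gap [m]
                     delta: float = 4.0) -> float:
    """Standard Intelligent Driver Model (Treiber et al., 2000)."""
    dv = v - v_lead                                   # approach rate
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a * b)))
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Closing in on a slower leader 40 m ahead: IDM yields a gentle deceleration.
print(idm_acceleration(v=25.0, v_lead=20.0, gap=40.0))
```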
arXiv Detail & Related papers (2021-09-29T08:27:12Z)
- Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars [46.64253693115981]
This paper investigates how model-based deep reinforcement learning agents generalize to real-world autonomous vehicle control tasks.
We show that model-based agents capable of learning in imagination substantially outperform model-free agents in performance, sample efficiency, successful task completion, and generalization.
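"Learning in imagination" refers to rolling out a learned world model instead of the real environment to generate training trajectories, as in Dreamer-style agents. Here is a bare-bones sketch with all learned components stubbed out as plain callables; the interfaces are assumptions, not this paper's architecture.

```python
import numpy as np

def imagine_rollout(state: np.ndarray, policy, dynamics, reward_model,
                    horizon: int = 15):
    """Generate a synthetic trajectory from a learned world model.

    `policy`, `dynamics`, and `reward_model` are placeholders for learned
    functions; the agent is then trained on this imagined data without
    touching the real environment during these steps.
    """
    trajectory = []
    for _ in range(horizon):
        action = policy(state)
        trajectory.append((state, action, reward_model(state, action)))
        state = dynamics(state, action)
    return trajectory

# Toy usage with linear stand-ins for the learned components.
traj = imagine_rollout(
    np.zeros(4),
    policy=lambda s: -0.1 * s,
    dynamics=lambda s, a: s + a,
    reward_model=lambda s, a: -float(s @ s),
)
print(len(traj))
```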
arXiv Detail & Related papers (2021-03-08T17:15:23Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task-relevant information, enabling the model to be aware of the current task and encouraging it to model only the relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
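One concrete way to read "model only the relevant quantities" is a prediction loss weighted toward goal-relevant state dimensions. The per-dimension weighting below is a toy assumption, not the paper's actual objective.

```python
import numpy as np

def goal_aware_loss(pred_next: np.ndarray, true_next: np.ndarray,
                    relevance: np.ndarray) -> float:
    """Squared prediction error weighted by per-dimension task relevance in [0, 1].

    Dimensions irrelevant to the current goal contribute nothing,
    so the model is free to ignore them.
    """
    return float(np.sum(relevance * (pred_next - true_next) ** 2))

pred = np.array([1.0, 2.0, 3.0])
true = np.array([1.1, 0.0, 3.0])
relevance = np.array([1.0, 0.0, 1.0])  # middle dimension irrelevant to the goal
print(goal_aware_loss(pred, true, relevance))  # only dim 0 contributes: ~0.01
```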
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- A Probabilistic Framework for Imitating Human Race Driver Behavior [31.524303667746643]
We propose Probabilistic Modeling of Driver behavior (ProMoD), a modular framework which splits the task of driver behavior modeling into multiple modules.
A global target trajectory distribution is learned with Probabilistic Movement Primitives, clothoids are utilized for local path generation, and the corresponding choice of actions is performed by a neural network.
Experiments in a simulated car racing setting show considerable advantages in imitation accuracy and robustness compared to other imitation learning algorithms.
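Clothoids (Euler spirals) suit local path generation because their curvature grows linearly with arc length, keeping steering transitions smooth. Below is a small sketch that samples clothoid points via SciPy's Fresnel integrals; the scaling and parameter range are chosen purely for illustration and are not ProMoD's path generator.

```python
import numpy as np
from scipy.special import fresnel

def clothoid_points(scale: float = 10.0, n: int = 50) -> np.ndarray:
    """Sample points along a unit clothoid, scaled to a rough metric size.

    Curvature grows linearly with arc length, which is the defining
    property of the Euler spiral.
    """
    t = np.linspace(0.0, 1.5, n)
    S, C = fresnel(t)                          # SciPy returns (S, C) in that order
    return scale * np.column_stack([C, S])     # x from C(t), y from S(t)

pts = clothoid_points()
print(pts[0], pts[-1])  # starts at the origin, curls away smoothly
```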
arXiv Detail & Related papers (2020-01-22T20:06:38Z)