In-context Learning for Automated Driving Scenarios
- URL: http://arxiv.org/abs/2405.04135v1
- Date: Tue, 7 May 2024 09:04:52 GMT
- Title: In-context Learning for Automated Driving Scenarios
- Authors: Ziqi Zhou, Jingyue Zhang, Jingyuan Zhang, Boyue Wang, Tianyu Shi, Alaa Khamis
- Abstract summary: One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively.
This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way.
- Score: 15.325910109153616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively. This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way. We developed a framework where instructions and dynamic environment descriptions are input into the LLM. The LLM then utilizes this information to assist in generating rewards, thereby steering the behavior of RL agents towards patterns that more closely resemble human driving. The experimental results demonstrate that this approach not only makes RL agents more anthropomorphic but also achieves better performance. Additionally, various strategies for reward-proxy and reward-shaping are investigated, revealing the significant impact of prompt design on shaping an AD vehicle's behavior. These findings offer a promising direction for the development of more advanced and human-like automated driving systems. Our experimental data and source code can be found here.
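To make the framework described in the abstract concrete, the sketch below shows one way an LLM could supply a reward-shaping term from a driving instruction and a textual state description. This is a minimal illustration, not the authors' implementation: the names `describe_state` and `llm_shaping_reward`, the prompt wording, and the 0.5 shaping weight are assumptions, and the LLM is stubbed out as a plain callable.

```python
# Minimal sketch of LLM-assisted reward shaping for an RL driving agent.
# All names, prompts, and weights are illustrative assumptions.
import re
from typing import Callable

def describe_state(speed_mps: float, lane_offset_m: float, gap_to_lead_m: float) -> str:
    """Turn a few ego-vehicle observations into a natural-language state description."""
    return (f"Ego speed: {speed_mps:.1f} m/s. "
            f"Lateral offset from lane center: {lane_offset_m:.2f} m. "
            f"Gap to lead vehicle: {gap_to_lead_m:.1f} m.")

def llm_shaping_reward(llm: Callable[[str], str], instruction: str, state_text: str) -> float:
    """Ask the LLM for a bounded human-likeness score; fall back to 0.0 if the reply is unusable."""
    prompt = (
        "You are scoring how human-like the current driving state is.\n"
        f"Instruction: {instruction}\n"
        f"State: {state_text}\n"
        "Reply with a single number between -1 and 1."
    )
    match = re.search(r"-?\d+(\.\d+)?", llm(prompt))
    return 0.0 if match is None else max(-1.0, min(1.0, float(match.group())))

if __name__ == "__main__":
    fake_llm = lambda prompt: "0.6"  # stand-in for a real LLM call
    state = describe_state(speed_mps=22.0, lane_offset_m=0.1, gap_to_lead_m=35.0)
    base_reward = 1.0  # reward returned by the driving environment
    total = base_reward + 0.5 * llm_shaping_reward(fake_llm, "Drive smoothly and keep a safe gap.", state)
    print(total)
```

In practice the callable would wrap a real LLM endpoint, and the prompt and shaping weight would be tuned; the paper itself investigates several reward-proxy and reward-shaping strategies beyond this simple additive form.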
Related papers
- Automating Traffic Model Enhancement with AI Research Agent [4.420199777075044]
Traffic Research Agent (TR-Agent) is an AI-driven system designed to autonomously develop and refine traffic models.
TR-Agent achieves significant performance improvements across multiple traffic models.
To further support research and collaboration, we have open-sourced both the code and data used in our experiments.
arXiv Detail & Related papers (2024-09-25T12:42:25Z)
- Generating and Evolving Reward Functions for Highway Driving with Large Language Models [18.464822261908562]
Reinforcement Learning (RL) plays a crucial role in advancing autonomous driving technologies.
We introduce a novel framework that integrates Large Language Models (LLMs) with RL to improve reward function design in autonomous driving.
arXiv Detail & Related papers (2024-06-15T07:50:10Z)
- REvolve: Reward Evolution with Large Language Models using Human Feedback [6.4550546442058225]
Large language models (LLMs) have been used for reward generation from natural language task descriptions.
LLMs, guided by human feedback, can be used to formulate reward functions that reflect human implicit knowledge.
We introduce REvolve, a truly evolutionary framework that uses LLMs for reward design in reinforcement learning.
arXiv Detail & Related papers (2024-06-03T13:23:27Z)
- Ego-Foresight: Agent Visuomotor Prediction as Regularization for RL [34.6883445484835]
Ego-Foresight is a self-supervised method for disentangling agent and environment based on motion and prediction.
We show that visuomotor prediction of the agent provides regularization to the RL algorithm, by encouraging the actions to stay within predictable bounds.
We integrate Ego-Foresight with a model-free RL algorithm to solve simulated robotic manipulation tasks, showing an average improvement of 23% in efficiency and 8% in performance.
arXiv Detail & Related papers (2024-05-27T13:32:43Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- HAIM-DRL: Enhanced Human-in-the-loop Reinforcement Learning for Safe and Efficient Autonomous Driving [2.807187711407621]
We propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework.
We first introduce an innovative learning paradigm that effectively injects human intelligence into AI, termed Human as AI mentor (HAIM).
In this paradigm, the human expert serves as a mentor to the AI agent, and the agent is guided to minimize traffic flow disturbance.
arXiv Detail & Related papers (2024-01-06T08:30:14Z)
- REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world.
Recent methods aim to mitigate misalignment by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision.
We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z)
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward that measures novelty based on the learned reward.
Our experiments show that an exploration bonus derived from uncertainty in the learned reward improves both the feedback- and sample-efficiency of preference-based RL algorithms; a rough sketch of this idea follows this entry.
arXiv Detail & Related papers (2022-05-24T23:22:10Z)
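As an illustration of the entry above, the sketch below computes an intrinsic bonus as the disagreement (standard deviation) across an ensemble of learned reward models and adds it to their mean prediction. This is an assumed rendering of the idea, not the paper's code; the class and function names, network sizes, and the beta weight are made up for the example.

```python
# Illustrative sketch: intrinsic exploration bonus from uncertainty (ensemble
# disagreement) over learned reward models. Names and sizes are assumptions.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Tiny MLP mapping a state-action vector to a scalar reward estimate."""
    def __init__(self, input_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def reward_with_uncertainty_bonus(ensemble, state_action, beta=0.1):
    """Mean learned reward plus beta times the ensemble's std as an exploration bonus."""
    with torch.no_grad():
        preds = torch.stack([m(state_action) for m in ensemble])  # (n_models, batch)
    return preds.mean(dim=0) + beta * preds.std(dim=0)

if __name__ == "__main__":
    models = [RewardModel(input_dim=4) for _ in range(3)]
    batch = torch.randn(8, 4)  # hypothetical state-action features
    print(reward_with_uncertainty_bonus(models, batch))
```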
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference [71.11416263370823]
We propose a generative inverse reinforcement learning approach for user behavioral preference modelling.
Our model can automatically learn rewards from the user's actions based on a discriminative actor-critic network and a Wasserstein GAN.
arXiv Detail & Related papers (2021-05-03T13:14:25Z)