Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL
- URL: http://arxiv.org/abs/2507.19146v1
- Date: Fri, 25 Jul 2025 10:35:30 GMT
- Title: Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL
- Authors: Ahmed Abouelazm, Johannes Ratz, Philip Schörner, J. Marius Zöllner
- Abstract summary: This work introduces a novel student-teacher framework for automatic curriculum learning. The teacher, a graph-based multi-agent RL component, adaptively generates traffic behaviors across diverse difficulty levels. Results demonstrate the teacher's ability to generate diverse traffic behaviors.
- Score: 11.198097218885191
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous driving faces challenges in navigating complex real-world traffic, requiring safe handling of both common and critical scenarios. Reinforcement learning (RL), a prominent method in end-to-end driving, enables agents to learn through trial and error in simulation. However, RL training often relies on rule-based traffic scenarios, limiting generalization. Additionally, current scenario generation methods focus heavily on critical scenarios, neglecting a balance with routine driving behaviors. Curriculum learning, which progressively trains agents on increasingly complex tasks, is a promising approach to improving the robustness and coverage of RL driving policies. However, existing research mainly emphasizes manually designed curricula, focusing on scenery and actor placement rather than traffic behavior dynamics. This work introduces a novel student-teacher framework for automatic curriculum learning. The teacher, a graph-based multi-agent RL component, adaptively generates traffic behaviors across diverse difficulty levels. An adaptive mechanism adjusts task difficulty based on student performance, ensuring exposure to behaviors ranging from common to critical. The student, though exchangeable, is realized as a deep RL agent with partial observability, reflecting real-world perception constraints. Results demonstrate the teacher's ability to generate diverse traffic behaviors. The student, trained with automatic curricula, outperformed agents trained on rule-based traffic, achieving higher rewards and exhibiting balanced, assertive driving.
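The abstract mentions an adaptive mechanism that adjusts task difficulty based on student performance but does not say how it works. The following is a minimal illustrative sketch of one common way to build such a controller, not the authors' method: the class name `DifficultyScheduler`, the success-rate thresholds, the step size, and the window length are all assumptions made for this example.

```python
from collections import deque


class DifficultyScheduler:
    """Hypothetical curriculum controller (illustrative, not from the paper).

    Difficulty is a scalar in [0, 1]: 0 stands for routine traffic
    behavior, 1 for the most critical behavior the teacher can produce.
    """

    def __init__(self, step=0.1, low=0.4, high=0.7, window=20):
        self.difficulty = 0.0
        self.step = step      # assumed increment per adjustment
        self.low = low        # assumed success rate below which tasks get easier
        self.high = high      # assumed success rate above which tasks get harder
        self.results = deque(maxlen=window)

    def update(self, success: bool) -> float:
        """Record one student episode outcome and return the new difficulty."""
        self.results.append(1.0 if success else 0.0)
        # Only adjust once a full window of episodes has been observed.
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate > self.high:
                self.difficulty = min(1.0, self.difficulty + self.step)
                self.results.clear()  # re-evaluate at the new level
            elif rate < self.low:
                self.difficulty = max(0.0, self.difficulty - self.step)
                self.results.clear()
        return self.difficulty
```

In a training loop, the value returned by `update` would be passed to whatever component generates traffic behavior for the next episode. Keeping the student's success rate inside a target band is only one possible design; the paper may use a different signal, such as reward.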
Related papers
- Automatic Curriculum Learning for Driving Scenarios: Towards Robust and Efficient Reinforcement Learning [11.602831593017427]
This paper addresses the challenges of training end-to-end autonomous driving agents using Reinforcement Learning (RL). RL agents are typically trained in a fixed set of scenarios and nominal behavior of surrounding road users in simulations. We propose an automatic curriculum learning framework that dynamically generates driving scenarios with adaptive complexity based on the agent's evolving capabilities.
arXiv Detail & Related papers (2025-05-13T06:26:57Z)
- TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
- CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving [1.188383832081829]
Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards. We propose a method that combines DRL with Curriculum Learning for autonomous driving.
arXiv Detail & Related papers (2025-01-09T05:45:03Z)
- GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching [81.82487256783674]
This paper introduces GARLIC: a framework of GPT-Augmented Reinforcement Learning with Intelligent Control for vehicle dispatching.
arXiv Detail & Related papers (2024-08-19T08:23:38Z)
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
- Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616]
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy.
arXiv Detail & Related papers (2023-07-19T17:42:36Z)
- Driver Dojo: A Benchmark for Generalizable Reinforcement Learning for Autonomous Driving [1.496194593196997]
We propose a benchmark for generalizable reinforcement learning for autonomous driving.
Our application-oriented benchmark enables a better understanding of the impact of design decisions.
Our benchmark aims to encourage researchers to propose solutions that are able to successfully generalize across scenarios.
arXiv Detail & Related papers (2022-07-23T06:29:43Z)
- Learning energy-efficient driving behaviors by imitating experts [75.12960180185105]
This paper examines the role of imitation learning in bridging the gap between control strategies and realistic limitations in communication and sensing.
We show that imitation learning can succeed in deriving policies that, if adopted by 5% of vehicles, may boost the energy-efficiency of networks with varying traffic conditions by 15% using only local observations.
arXiv Detail & Related papers (2022-06-28T17:08:31Z)
- Building Safer Autonomous Agents by Leveraging Risky Driving Behavior Knowledge [1.52292571922932]
This study focuses on creating risk-prone scenarios with heavy traffic and unexpected random behavior in order to build better model-free learning agents.
We generate multiple autonomous driving scenarios by creating new custom Markov Decision Process (MDP) environment iterations in the highway-env simulation package.
We train model-free learning agents with supplementary information about risk-prone driving scenarios and compare their performance with baseline agents.
arXiv Detail & Related papers (2021-03-16T23:39:33Z)
- Investigating Value of Curriculum Reinforcement Learning in Autonomous Driving Under Diverse Road and Weather Conditions [0.0]
This paper focuses on investigating the value of curriculum reinforcement learning in autonomous driving applications.
We set up several different driving scenarios in a realistic driving simulator, with varying road complexity and weather conditions.
Results show that curriculum RL can yield significant gains in complex driving tasks, both in terms of driving performance and sample complexity.
arXiv Detail & Related papers (2021-03-14T12:05:05Z)
- MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control [54.162449208797334]
Traffic signal control aims to coordinate traffic signals across intersections to improve the traffic efficiency of a district or a city.
Deep reinforcement learning (RL) has been applied to traffic signal control recently and demonstrated promising performance where each traffic signal is regarded as an agent.
We propose a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
arXiv Detail & Related papers (2021-01-04T03:06:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.