StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving
- URL: http://arxiv.org/abs/2506.23982v2
- Date: Sun, 03 Aug 2025 09:28:43 GMT
- Title: StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving
- Authors: Ruiyang Hao, Bowen Jing, Haibao Yu, Zaiqing Nie
- Abstract summary: Personalization has been largely overlooked in the context of end-to-end autonomous driving (E2EAD). We introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD. We introduce the first standardized benchmark for systematically evaluating personalized E2EAD models.
- Score: 7.525510086747996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalization, while extensively studied in conventional autonomous driving pipelines, has been largely overlooked in the context of end-to-end autonomous driving (E2EAD), despite its critical role in fostering user trust, safety perception, and real-world adoption. A primary bottleneck is the absence of large-scale real-world datasets that systematically capture driving preferences, severely limiting the development and evaluation of personalized E2EAD models. In this work, we introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD, integrating comprehensive scene topology with rich dynamic context derived from agent dynamics and semantics inferred via a fine-tuned vision-language model (VLM). We propose a hybrid annotation pipeline that combines behavioral analysis, rule-and-distribution-based heuristics, and subjective semantic modeling guided by VLM reasoning, with final refinement through human-in-the-loop verification. Building upon this dataset, we introduce the first standardized benchmark for systematically evaluating personalized E2EAD models. Empirical evaluations on state-of-the-art architectures demonstrate that incorporating personalized driving preferences significantly improves behavioral alignment with human demonstrations.
Related papers
- Pedestrian Intention Prediction via Vision-Language Foundation Models [10.351342371371675]
This study explores the potential of vision-language foundation models (VLFMs) for predicting pedestrian crossing intentions. The methodology incorporates contextual information, including visual frames, observations of physical cues, and ego-vehicle dynamics, into systematically refined prompts. Results demonstrate that incorporating vehicle speed, its variations over time, and time-conscious prompts improves prediction accuracy by up to 19.8%.
arXiv Detail & Related papers (2025-07-05T19:39:00Z)
- Towards Human-Like Trajectory Prediction for Autonomous Driving: A Behavior-Centric Approach [22.81464823797471]
HiT (Human-like Trajectory Prediction) is a novel model designed to enhance trajectory prediction by incorporating behavior-aware modules and dynamic centrality measures. To evaluate HiT's performance, we conducted extensive experiments using diverse and challenging real-world datasets.
arXiv Detail & Related papers (2025-05-27T05:04:01Z)
- SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving [51.47621083057114]
SOLVE is an innovative framework that synergizes Vision-Language Models with end-to-end (E2E) models to enhance autonomous vehicle planning. Our approach emphasizes knowledge sharing at the feature level through a shared visual encoder, enabling comprehensive interaction between VLM and E2E components.
arXiv Detail & Related papers (2025-05-22T15:44:30Z)
- A Knowledge-Informed Deep Learning Paradigm for Generalizable and Stability-Optimized Car-Following Models [15.34704164931383]
Car-following models (CFMs) are fundamental to traffic flow analysis and autonomous driving. We propose a Knowledge-Informed Deep Learning (KIDL) paradigm that distills the generalization capabilities of pre-trained Large Language Models (LLMs) into a lightweight and stability-aware neural architecture. We evaluate KIDL on the real-world NGSIM and HighD datasets, comparing its performance with representative physics-based, data-driven, and hybrid CFMs.
arXiv Detail & Related papers (2025-04-19T09:33:02Z)
- GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving [12.889523014369884]
We propose a geometric and semantic self-supervised pre-training method, GASP, that learns a unified representation by predicting, at any queried future point in spacetime. By modeling geometric and semantic 4D occupancy fields instead of raw sensor measurements, the model learns a structured, general representation of the environment and its evolution through time.
arXiv Detail & Related papers (2025-03-19T20:00:27Z)
- A Survey of World Models for Autonomous Driving [63.33363128964687]
Recent breakthroughs in autonomous driving have been propelled by advances in robust world modeling. World models offer high-fidelity representations of the driving environment that integrate multi-sensor data, semantic cues, and temporal dynamics. This paper systematically reviews recent advances in world models for autonomous driving.
arXiv Detail & Related papers (2025-01-20T04:00:02Z)
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction, and an iterative motion planner. Experiments conducted on the nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z)
- Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment [104.18002641195442]
We introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing paired data.
Building on the self-play concept, which autonomously generates negative responses, we further incorporate an off-policy learning pipeline to enhance data exploration and exploitation.
arXiv Detail & Related papers (2024-05-31T14:21:04Z)
- D2E-An Autonomous Decision-making Dataset involving Driver States and Human Evaluation [6.890077875318333]
Driver to Evaluation dataset (D2E) is an autonomous decision-making dataset.
It contains data on driver states, vehicle states, environmental situations, and evaluation scores from human reviewers.
D2E contains over 1100 segments of interactive driving case data, covering everything from human driver factors to evaluation results.
arXiv Detail & Related papers (2024-04-12T21:29:18Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving [24.123577277806135]
We pioneer a novel behavior-aware trajectory prediction model (BAT).
Our model consists of behavior-aware, interaction-aware, priority-aware, and position-aware modules.
We evaluate BAT's performance across the Next Generation Simulation (NGSIM), Highway Drone (HighD), Roundabout Drone (RounD), and Macao Connected Autonomous Driving (MoCAD) datasets.
arXiv Detail & Related papers (2023-12-11T13:27:51Z)
- Interaction-Aware Personalized Vehicle Trajectory Prediction Using Temporal Graph Neural Networks [8.209194305630229]
Existing methods mainly rely on generic trajectory predictions from large datasets.
We propose an approach for interaction-aware personalized vehicle trajectory prediction that incorporates temporal graph neural networks.
arXiv Detail & Related papers (2023-08-14T20:20:26Z)
- Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling [96.31941517446859]
We propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving.
We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos.
In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
arXiv Detail & Related papers (2023-01-03T08:52:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.