HetroD: A High-Fidelity Drone Dataset and Benchmark for Autonomous Driving in Heterogeneous Traffic
- URL: http://arxiv.org/abs/2602.03447v1
- Date: Tue, 03 Feb 2026 12:12:47 GMT
- Title: HetroD: A High-Fidelity Drone Dataset and Benchmark for Autonomous Driving in Heterogeneous Traffic
- Authors: Yu-Hsiang Chen, Wei-Jer Chang, Christian Kotulla, Thomas Keutgens, Steffen Runde, Tobias Moers, Christoph Klas, Wei Zhan, Masayoshi Tomizuka, Yi-Ting Chen,
- Abstract summary: HetroD is a dataset and benchmark for developing autonomous driving systems in heterogeneous environments.<n>HetroD targets the critical challenge of navi- gating real-world heterogeneous traffic dominated by vulner- able road users (VRUs)
- Score: 49.31491001465465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present HetroD, a dataset and benchmark for developing autonomous driving systems in heterogeneous environments. HetroD targets the critical challenge of navi- gating real-world heterogeneous traffic dominated by vulner- able road users (VRUs), including pedestrians, cyclists, and motorcyclists that interact with vehicles. These mixed agent types exhibit complex behaviors such as hook turns, lane splitting, and informal right-of-way negotiation. Such behaviors pose significant challenges for autonomous vehicles but remain underrepresented in existing datasets focused on structured, lane-disciplined traffic. To bridge the gap, we collect a large- scale drone-based dataset to provide a holistic observation of traffic scenes with centimeter-accurate annotations, HD maps, and traffic signal states. We further develop a modular toolkit for extracting per-agent scenarios to support downstream task development. In total, the dataset comprises over 65.4k high- fidelity agent trajectories, 70% of which are from VRUs. HetroD supports modeling of VRU behaviors in dense, het- erogeneous traffic and provides standardized benchmarks for forecasting, planning, and simulation tasks. Evaluation results reveal that state-of-the-art prediction and planning models struggle with the challenges presented by our dataset: they fail to predict lateral VRU movements, cannot handle unstructured maneuvers, and exhibit limited performance in dense and multi-agent scenarios, highlighting the need for more robust approaches to heterogeneous traffic. See our project page for more examples: https://hetroddata.github.io/HetroD/
Related papers
- Pedestrian Intention and Trajectory Prediction in Unstructured Traffic Using IDD-PeD [26.011293248078797]
We introduce an Indian driving pedestrian dataset designed to address the complexities of modeling pedestrian behavior in unstructured environments.<n>The dataset provides high-level and detailed low-level comprehensive annotations focused on pedestrians requiring the ego-vehicle's attention.
arXiv Detail & Related papers (2025-06-27T10:41:18Z) - OnSiteVRU: A High-Resolution Trajectory Dataset for High-Density Vulnerable Road Users [41.63444034391952]
This study developed the OnSiteVRU datasets, which cover a variety of scenarios, including intersections, road segments, and urban villages.<n>The datasets provide trajectory data for motor vehicles, electric bicycles, and human-powered bicycles, totaling approximately 17,429 trajectories with a precision of 0.04 seconds.<n>The results demonstrate that VRU_Data outperforms traditional datasets in terms of VRU density and scene coverage, offering a more comprehensive representation of VRU behavioral characteristics.
arXiv Detail & Related papers (2025-03-30T08:44:55Z) - Roundabout Dilemma Zone Data Mining and Forecasting with Trajectory Prediction and Graph Neural Networks [23.098974945647683]
This paper presents an automated system that leverages trajectory forecasting to predict DZ events, specifically at traffic roundabouts.
Our system aims to enhance safety standards in both autonomous and manual transportation.
arXiv Detail & Related papers (2024-09-01T05:47:58Z) - Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.<n>To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z) - Generative AI-empowered Simulation for Autonomous Driving in Vehicular
Mixed Reality Metaverses [130.15554653948897]
In vehicular mixed reality (MR) Metaverse, distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
arXiv Detail & Related papers (2023-02-16T16:54:10Z) - Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers [126.81938540470847]
We propose Euro-PVI, a dataset of pedestrian and bicyclist trajectories.
In this work, we develop a joint inference model that learns an expressive multi-modal shared latent space across agents in the urban scene.
We achieve state of the art results on the nuScenes and Euro-PVI datasets demonstrating the importance of capturing interactions between ego-vehicle and pedestrians (bicyclists) for accurate predictions.
arXiv Detail & Related papers (2021-06-22T15:40:21Z) - Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z) - Trajectron++: Dynamically-Feasible Trajectory Forecasting With
Heterogeneous Data [37.176411554794214]
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation.
We present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents.
We demonstrate its performance on several challenging real-world trajectory forecasting datasets.
arXiv Detail & Related papers (2020-01-09T16:47:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.