BikeActions: An Open Platform and Benchmark for Cyclist-Centric VRU Action Recognition
- URL: http://arxiv.org/abs/2601.10521v2
- Date: Tue, 20 Jan 2026 12:44:36 GMT
- Title: BikeActions: An Open Platform and Benchmark for Cyclist-Centric VRU Action Recognition
- Authors: Max A. Buettner, Kanak Mazumder, Luca Koecher, Mario Finkbeiner, Sebastian Niebler, Fabian B. Flohr
- Abstract summary: FUSE-Bike is the first fully open perception platform of its kind. BikeActions is a novel multi-modal dataset comprising 852 annotated samples across 5 distinct action classes. We establish a rigorous benchmark by evaluating state-of-the-art graph convolution and transformer-based models on our publicly released data splits.
- Score: 0.2339805471804333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anticipating the intentions of Vulnerable Road Users (VRUs) is a critical challenge for safe autonomous driving (AD) and mobile robotics. While current research predominantly focuses on pedestrian crossing behaviors from a vehicle's perspective, interactions within dense shared spaces remain underexplored. To bridge this gap, we introduce FUSE-Bike, the first fully open perception platform of its kind. Equipped with two LiDARs, a camera, and GNSS, it facilitates high-fidelity, close-range data capture directly from a cyclist's viewpoint. Leveraging this platform, we present BikeActions, a novel multi-modal dataset comprising 852 annotated samples across 5 distinct action classes, specifically tailored to improve VRU behavior modeling. We establish a rigorous benchmark by evaluating state-of-the-art graph convolution and transformer-based models on our publicly released data splits, providing the first performance baselines for this challenging task. We release the full dataset together with data curation tools, the open hardware design, and the benchmark code to foster future research in VRU action understanding at https://iv.ee.hm.edu/bikeactions/.
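The benchmark described above comes down to multi-class action classification on fixed, publicly released splits. A minimal sketch of the core accuracy computation follows; the split format, the placeholder predictions, and all names are hypothetical illustrations, not the actual BikeActions tooling or API:

```python
# Hypothetical sketch: scoring an action classifier on a fixed test split
# for a 5-class task like BikeActions. The "model" and split are toy
# placeholders; the real benchmark ships its own data loaders and splits.

def top1_accuracy(predictions, labels):
    """Fraction of samples where the predicted class matches the label."""
    assert len(predictions) == len(labels) and labels, "non-empty, equal-length"
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Toy stand-in for a released test split: (sample_id, true_class) pairs
# drawn from 5 action classes (0..4).
test_split = [("s0", 0), ("s1", 3), ("s2", 3), ("s3", 1)]

# Trivial placeholder "model" that always predicts class 3.
preds = [3 for _sid, _label in test_split]
labels = [y for _sid, y in test_split]

print(top1_accuracy(preds, labels))  # 2 of 4 correct -> 0.5
```

Reporting per-class accuracy or macro-averaged F1 alongside top-1 accuracy is common for small, class-imbalanced action datasets, but which metrics the benchmark uses is defined by its released evaluation code.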
Related papers
- HetroD: A High-Fidelity Drone Dataset and Benchmark for Autonomous Driving in Heterogeneous Traffic [49.31491001465465]
HetroD is a dataset and benchmark for developing autonomous driving systems in heterogeneous environments. HetroD targets the critical challenge of navigating real-world heterogeneous traffic dominated by vulnerable road users (VRUs).
arXiv Detail & Related papers (2026-02-03T12:12:47Z) - Conformal Trajectory Prediction with Multi-View Data Integration in Cooperative Driving [4.628774934971078]
Current research on trajectory prediction primarily relies on data collected by onboard sensors of an ego vehicle. We introduce V2INet, a novel trajectory prediction framework designed to model multi-view data by extending existing single-view models. Our results demonstrate superior performance in terms of Final Displacement Error (FDE) and Miss Rate (MR) using a single GPU.
arXiv Detail & Related papers (2024-08-01T08:32:03Z) - Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z) - A Benchmark for Cycling Close Pass Detection from Video Streams [31.962089421160055]
We introduce a novel benchmark, called Cyc-CP, towards close pass (CP) event detection from video streams. Scene-level detection ascertains the presence of a CP event within the provided video clip. Instance-level detection identifies the specific vehicle within the scene that precipitates a CP event.
arXiv Detail & Related papers (2023-04-24T07:30:01Z) - DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z) - V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [26.961213523096948]
Vehicle-to-everything (V2X) denotes the collaboration between a vehicle and any entity in its surrounding.
We present the V2X-Sim dataset, the first public large-scale collaborative perception dataset in autonomous driving.
arXiv Detail & Related papers (2022-02-17T05:14:02Z) - OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication [13.633468133727]
We present the first large-scale open simulated dataset for Vehicle-to-Vehicle perception.
It contains over 70 interesting scenes, 11,464 frames, and 232,913 annotated 3D vehicle bounding boxes.
arXiv Detail & Related papers (2021-09-16T00:52:41Z) - Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers [126.81938540470847]
We propose Euro-PVI, a dataset of pedestrian and bicyclist trajectories.
In this work, we develop a joint inference model that learns an expressive multi-modal shared latent space across agents in the urban scene.
We achieve state-of-the-art results on the nuScenes and Euro-PVI datasets, demonstrating the importance of capturing interactions between the ego-vehicle and pedestrians (bicyclists) for accurate predictions.
arXiv Detail & Related papers (2021-06-22T15:40:21Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z) - VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets.
We design a simple yet effective two-stage progressive approach to learning more robust visual representation from VehicleNet.
We achieve state-of-the-art accuracy of 86.07% mAP on the private test set of the AICity Challenge.
arXiv Detail & Related papers (2020-04-14T05:06:38Z) - Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a framework on graph to uncover relationships in different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.