V2X-ReaLO: An Open Online Framework and Dataset for Cooperative Perception in Reality
- URL: http://arxiv.org/abs/2503.10034v1
- Date: Thu, 13 Mar 2025 04:31:20 GMT
- Title: V2X-ReaLO: An Open Online Framework and Dataset for Cooperative Perception in Reality
- Authors: Hao Xiang, Zhaoliang Zheng, Xin Xia, Seth Z. Zhao, Letian Gao, Zewei Zhou, Tianhui Cai, Yun Zhang, Jiaqi Ma
- Abstract summary: We introduce V2X-ReaLO, an open online cooperative perception framework deployed on real vehicles and smart infrastructure. We present an open benchmark dataset designed to assess the performance of online cooperative perception systems.
- Score: 13.68645389910716
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Cooperative perception enabled by Vehicle-to-Everything (V2X) communication holds significant promise for enhancing the perception capabilities of autonomous vehicles, allowing them to overcome occlusions and extend their field of view. However, existing research predominantly relies on simulated environments or static datasets, leaving the feasibility and effectiveness of V2X cooperative perception, especially intermediate fusion, in real-world scenarios largely unexplored. In this work, we introduce V2X-ReaLO, an open online cooperative perception framework deployed on real vehicles and smart infrastructure. It integrates early, late, and intermediate fusion methods within a unified pipeline and provides the first practical demonstration of online intermediate fusion's feasibility and performance under genuine real-world conditions. Additionally, we present an open benchmark dataset specifically designed to assess the performance of online cooperative perception systems. This new dataset extends the V2X-Real dataset to dynamic, synchronized ROS bags and provides 25,028 test frames with 6,850 annotated key frames in challenging urban scenarios. By enabling real-time assessment of perception accuracy and communication latency under dynamic conditions, V2X-ReaLO sets a new benchmark for advancing and optimizing cooperative perception systems in real-world applications. The code and datasets will be released to further advance the field.
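As a rough illustration of the abstract's unified-pipeline idea, the Python sketch below dispatches one perception step by fusion mode: early fusion merges raw sensor data, intermediate fusion exchanges latent features, and late fusion merges per-agent detections. This is a minimal sketch under assumed interfaces, not the released V2X-ReaLO code; every name here (`CooperativePipeline`, `V2XMessage`, the detector's `encode`/`fuse`/`head`/`detect` methods) is a hypothetical placeholder.

```python
# Illustrative sketch only: one pipeline that supports early, late, and
# intermediate fusion behind a single interface. All names are hypothetical
# placeholders, not the actual V2X-ReaLO API.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, List


class FusionMode(Enum):
    EARLY = auto()         # agents share raw sensor data (e.g., point clouds)
    INTERMEDIATE = auto()  # agents share latent features from the backbone
    LATE = auto()          # agents share final detections (bounding boxes)


@dataclass
class V2XMessage:
    agent_id: str
    timestamp: float  # sender time; an online system must handle latency/sync
    payload: Any      # raw points, features, or boxes, depending on the mode


class CooperativePipeline:
    """Runs one cooperative perception step in the configured fusion mode."""

    def __init__(self, mode: FusionMode, detector: Any):
        self.mode = mode
        # Hypothetical single-agent 3D detector exposing encode/fuse/head/detect.
        self.detector = detector

    def step(self, ego_points: Any, messages: List[V2XMessage]) -> Any:
        if self.mode is FusionMode.EARLY:
            # Early fusion: merge raw data from all agents, then detect once.
            merged = [ego_points] + [m.payload for m in messages]
            return self.detector.detect(merged)
        if self.mode is FusionMode.INTERMEDIATE:
            # Intermediate fusion: encode ego data locally, fuse latent
            # features from cooperating agents, then decode detections.
            features = [self.detector.encode(ego_points)]
            features += [m.payload for m in messages]
            return self.detector.head(self.detector.fuse(features))
        # Late fusion: detect locally and concatenate per-agent box lists
        # (a real system would deduplicate overlapping boxes, e.g., via NMS).
        boxes = list(self.detector.detect(ego_points))
        for m in messages:
            boxes.extend(m.payload)
        return boxes
```

The practical trade-off these three modes expose is bandwidth versus accuracy: early fusion transmits the most data, late fusion the least, and intermediate fusion sits in between, which is why demonstrating its online feasibility over real V2X links is the contribution the abstract highlights.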
Related papers
- V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction [43.506717060709136]
Vehicle-to-everything (V2X) technologies offer a promising paradigm to mitigate the limitations of constrained observability in single-vehicle systems. Prior work primarily focuses on single-frame cooperative perception. In this paper, we focus on temporal perception and prediction tasks in V2X scenarios, together with one-step and multi-step communication strategies.
arXiv Detail & Related papers (2024-12-02T18:55:34Z) - Conformal Trajectory Prediction with Multi-View Data Integration in Cooperative Driving [4.628774934971078]
Current research on trajectory prediction primarily relies on data collected by onboard sensors of an ego vehicle.
We introduce V2INet, a novel trajectory prediction framework designed to model multi-view data by extending existing single-view models.
Our results demonstrate superior performance in terms of Final Displacement Error (FDE) and Miss Rate (MR) using a single GPU.
arXiv Detail & Related papers (2024-08-01T08:32:03Z) - End-to-End Autonomous Driving through V2X Cooperation [23.44597411612664]
We introduce UniV2X, a pioneering cooperative autonomous driving framework. UniV2X seamlessly integrates all key driving modules across diverse views into a unified network.
arXiv Detail & Related papers (2024-03-31T15:22:11Z) - V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception [22.3955949838171]
We present V2X-Real, a large-scale dataset that includes a mixture of multiple vehicles and smart infrastructure.
Our dataset contains 33K LiDAR frames and 171K camera data with over 1.2M annotated bounding boxes of 10 categories in very challenging urban scenarios.
arXiv Detail & Related papers (2024-03-24T06:30:02Z) - FSD V2: Improving Fully Sparse 3D Object Detection with Virtual Voxels [57.05834683261658]
We present FSDv2, an evolution that aims to simplify the previous FSDv1 while eliminating the inductive bias introduced by its handcrafted instance-level representation.
We develop a suite of components to complement the virtual voxel concept, including a virtual voxel encoder, a virtual voxel mixer, and a virtual voxel assignment strategy.
arXiv Detail & Related papers (2023-08-07T17:59:48Z) - A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild [72.0226493284814]
We propose a unified framework for event-based frame interpolation that performs deblurring ad-hoc. Our network consistently outperforms previous state-of-the-art methods on frame interpolation, single-image deblurring, and the joint task of both.
arXiv Detail & Related papers (2023-01-12T18:19:00Z) - V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer [58.71845618090022]
We build a holistic attention model, namely V2X-ViT, to fuse information across on-road agents.
V2X-ViT consists of alternating layers of heterogeneous multi-agent self-attention and multi-scale window self-attention.
To validate our approach, we create a large-scale V2X perception dataset.
arXiv Detail & Related papers (2022-03-20T20:18:25Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [26.961213523096948]
Vehicle-to-everything (V2X) denotes the collaboration between a vehicle and any entity in its surroundings.
We present the V2X-Sim dataset, the first public large-scale collaborative perception dataset in autonomous driving.
arXiv Detail & Related papers (2022-02-17T05:14:02Z) - Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms [57.21078336887961]
Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day.
We propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks.
arXiv Detail & Related papers (2021-05-18T19:22:24Z)