StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection
- URL: http://arxiv.org/abs/2407.03825v1
- Date: Thu, 4 Jul 2024 10:56:10 GMT
- Title: StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection
- Authors: Yunshuang Yuan, Monika Sester,
- Abstract summary: We propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt widely used dataset OPV2V and DairV2X.
Experiment results confirmed the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models.
- Score: 0.552480439325792
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cooperative perception via communication among intelligent traffic agents has great potential to improve the safety of autonomous driving. However, limited communication bandwidth, localization errors and asynchronized capturing time of sensor data, all introduce difficulties to the data fusion of different agents. To some extend, previous works have attempted to reduce the shared data size, mitigate the spatial feature misalignment caused by localization errors and communication delay. However, none of them have considered the asynchronized sensor ticking times, which can lead to dynamic object misplacement of more than one meter during data fusion. In this work, we propose Time-Aligned COoperative Object Detection (TA-COOD), for which we adapt widely used dataset OPV2V and DairV2X with considering asynchronous LiDAR sensor ticking times and build an efficient fully sparse framework with modeling the temporal information of individual objects with query-based techniques. The experiment results confirmed the superior efficiency of our fully sparse framework compared to the state-of-the-art dense models. More importantly, they show that the point-wise observation timestamps of the dynamic objects are crucial for accurate modeling the object temporal context and the predictability of their time-related locations.
Related papers
- CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions [13.981748780317329]
Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs)
This study introduces a novel accident anticipation framework for AVs, termed CRASH.
It seamlessly integrates five components: object detector, feature extractor, object-aware module, context-aware module, and multi-layer fusion.
Our model surpasses existing top baselines in critical evaluation metrics like Average Precision (AP) and mean Time-To-Accident (mTTA)
arXiv Detail & Related papers (2024-07-25T04:12:49Z) - Robust Collaborative Perception without External Localization and Clock Devices [52.32342059286222]
A consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception.
Traditional methods depend on external devices to provide localization and clock signals.
We propose a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents.
arXiv Detail & Related papers (2024-05-05T15:20:36Z) - Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and
Novel Outliers Detection [1.0334138809056097]
We propose a novel framework for real-time LiDAR odometry and mapping based on LOAM architecture for fast moving platforms.
Our framework utilizes semantic information produced by a deep learning model to improve point-to-line and point-to-plane matching.
We study the effect of improving the matching process on the robustness of LiDAR odometry against high speed motion.
arXiv Detail & Related papers (2024-03-05T16:53:24Z) - TimePillars: Temporally-Recurrent 3D LiDAR Object Detection [8.955064958311517]
TimePillars is a temporally-recurrent object detection pipeline.
It exploits the pillar representation of LiDAR data across time.
We show how basic building blocks are enough to achieve robust and efficient results.
arXiv Detail & Related papers (2023-12-22T10:25:27Z) - Correlating sparse sensing for large-scale traffic speed estimation: A
Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both lowrankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and
Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both times-series dynamics of the data as well as the relatedness of the features from different sensors.
We propose a model, termed as DynImp, to handle different time point's missingness with nearest neighbors along feature axis.
We show that the method can exploit the multi-modality features from related sensors and also learn from history time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z) - Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in
Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed as Ret3D.
At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves the state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z) - DetectorNet: Transformer-enhanced Spatial Temporal Graph Neural Network
for Traffic Prediction [4.302265301004301]
Detectors with high coverage have direct and far-reaching benefits for road users in route planning and avoiding traffic congestion.
utilizing these data presents unique challenges including: the dynamic temporal correlation, and the dynamic spatial correlation caused by changes in road conditions.
We propose DetectorNet enhanced by Transformer to address these challenges.
arXiv Detail & Related papers (2021-10-19T03:47:38Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic temporal-temporal network (DSNet) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.