RV-FuseNet: Range View Based Fusion of Time-Series LiDAR Data for Joint
3D Object Detection and Motion Forecasting
- URL: http://arxiv.org/abs/2005.10863v3
- Date: Tue, 23 Mar 2021 02:43:33 GMT
- Authors: Ankit Laddha, Shivam Gautam, Gregory P. Meyer, Carlos
Vallespi-Gonzalez, Carl K. Wellington
- Abstract summary: We present RV-FuseNet, a novel end-to-end approach for joint detection and trajectory estimation.
Instead of the widely used bird's eye view (BEV) representation, we utilize the native range view (RV) representation of LiDAR data.
We show that our approach significantly improves motion forecasting performance over the existing state-of-the-art.
- Score: 13.544498422625448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust real-time detection and motion forecasting of traffic participants are
necessary for autonomous vehicles to safely navigate urban environments. In
this paper, we present RV-FuseNet, a novel end-to-end approach for joint
detection and trajectory estimation directly from time-series LiDAR data.
Instead of the widely used bird's eye view (BEV) representation, we utilize the
native range view (RV) representation of LiDAR data. The RV preserves the full
resolution of the sensor by avoiding the voxelization used in the BEV.
Furthermore, the RV can be processed efficiently due to its compactness. Previous
approaches project time-series data to a common viewpoint for temporal fusion,
and this viewpoint is often different from the one at which the data was
captured. This is sufficient for BEV methods, but for RV methods it can lead to
information loss and data distortion, which adversely affect performance. To
address this challenge, we propose a simple yet effective novel architecture,
Incremental Fusion, that minimizes the information loss by
sequentially projecting each RV sweep into the viewpoint of the next sweep in
time. We show that our approach significantly improves motion forecasting
performance over the existing state-of-the-art. Furthermore, we demonstrate
that our sequential fusion approach is superior to alternative RV-based fusion
methods on multiple datasets.
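To make the Incremental Fusion idea above concrete, here is a minimal geometric sketch in Python/NumPy. It is an illustration only: the function names, image resolution, and the use of raw points instead of learned feature maps are assumptions, not the authors' implementation. The key step it shows is that each past sweep is carried forward one pose step at a time into the next sweep's sensor frame and re-rasterized into that sweep's range view, instead of being projected once into a distant common viewpoint.

```python
import numpy as np

def to_range_view(points, h=64, w=2048, fov_up=3.0, fov_down=-25.0):
    """Spherical (range view) projection of an (N, 3) cloud to an (h, w) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-8
    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1.0, 1.0))    # elevation
    up, down = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (yaw / np.pi + 1.0) / 2.0) * w).astype(int) % w
    v = np.clip(((up - pitch) / (up - down) * h).astype(int), 0, h - 1)
    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r        # last point written to a cell wins (a real pipeline keeps the nearest)
    return img

def transform(points, T):
    """Apply a 4x4 rigid transform T to (N, 3) points."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (homo @ T.T)[:, :3]

def incremental_fusion(sweeps, poses):
    """sweeps: list of (N_i, 3) clouds, oldest first, each in its own sensor frame.
    poses: matching list of 4x4 sensor-to-world transforms.
    Returns a fused range image in the newest sweep's viewpoint."""
    carried = sweeps[0]
    for t in range(1, len(sweeps)):
        # Step only into the *next* viewpoint, never a single distant common one,
        # so each re-projection distorts the range view as little as possible.
        T_step = np.linalg.inv(poses[t]) @ poses[t - 1]
        carried = np.vstack([transform(carried, T_step), sweeps[t]])
    return to_range_view(carried)
```

In the paper, what is carried forward are learned RV feature maps rather than raw points, and the fusion itself is done by the network; the stepwise re-projection into the next sweep's viewpoint is the part this sketch is meant to illustrate.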
Related papers
- Streamlining Forest Wildfire Surveillance: AI-Enhanced UAVs Utilizing the FLAME Aerial Video Dataset for Lightweight and Efficient Monitoring [4.303063757163241]
This study recognizes the imperative for real-time data processing in disaster response scenarios and introduces a lightweight and efficient approach for aerial video understanding.
Our methodology identifies redundant portions within the video through policy networks and eliminates this excess information using frame compression techniques.
Compared to the baseline, our approach reduces computation costs by more than 13 times while boosting accuracy by 3%.
arXiv Detail & Related papers (2024-08-31T17:26:53Z)
- BEVCar: Camera-Radar Fusion for BEV Map and Object Segmentation [22.870994478494566]
We introduce BEVCar, a novel approach for joint BEV object and map segmentation.
The core novelty of our approach lies in first learning a point-based encoding of raw radar data.
We show that incorporating radar information significantly enhances robustness in challenging environmental conditions.
arXiv Detail & Related papers (2024-03-18T13:14:46Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- A Novel Deep Neural Network for Trajectory Prediction in Automated Vehicles Using Velocity Vector Field [12.067838086415833]
This paper proposes a novel technique for trajectory prediction that combines a data-driven learning-based method with a velocity vector field (VVF) generated from a nature-inspired concept.
The accuracy remains consistent as the observation window shrinks, which alleviates the requirement of a long history of past observations for accurate trajectory prediction.
arXiv Detail & Related papers (2023-09-19T22:14:52Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z)
- Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
arXiv Detail & Related papers (2021-09-16T13:10:27Z)
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- MVFuseNet: Improving End-to-End Object Detection and Motion Forecasting through Multi-View Fusion of LiDAR Data [4.8061970432391785]
We propose MVFuseNet, a novel end-to-end method for joint object detection and motion forecasting from a temporal sequence of LiDAR data.
We show the benefits of our multi-view approach for the tasks of detection and motion forecasting on two large-scale self-driving data sets.
arXiv Detail & Related papers (2021-04-21T21:29:08Z)
- Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving [11.312620949473938]
We present an end-to-end method for object detection and trajectory prediction utilizing multi-view representations of LiDAR and camera images.
Our model builds on a state-of-the-art Bird's-Eye View (BEV) network that fuses voxelized features from a sequence of historical LiDAR data.
We extend this model with additional LiDAR Range-View (RV) features that use the raw LiDAR information in its native, non-quantized representation.
arXiv Detail & Related papers (2020-08-27T03:32:25Z)
- Data Freshness and Energy-Efficient UAV Navigation Optimization: A Deep Reinforcement Learning Approach [88.45509934702913]
We design a navigation policy for multiple unmanned aerial vehicles (UAVs) where mobile base stations (BSs) are deployed.
We incorporate contextual information such as energy and age-of-information (AoI) constraints to ensure data freshness at the ground BS.
By applying the proposed trained model, an effective real-time trajectory policy for the UAV-BSs captures the observable network states over time.
arXiv Detail & Related papers (2020-02-21T07:29:15Z)