Accurate and Real-time Pseudo Lidar Detection: Is Stereo Neural Network Really Necessary?
- URL: http://arxiv.org/abs/2206.13858v1
- Date: Tue, 28 Jun 2022 09:53:00 GMT
- Title: Accurate and Real-time Pseudo Lidar Detection: Is Stereo Neural Network Really Necessary?
- Authors: Haitao Meng, Changcai Li, Gang Chen and Alois Knoll
- Abstract summary: We develop a system with a less powerful stereo matching predictor and adopt the proposed refinement schemes to improve its accuracy.
The presented system achieves accuracy competitive with state-of-the-art approaches at only 23 ms of computation, showing it is a suitable candidate for deployment in real on-vehicle applications.
- Score: 6.8067583993953775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proposal of the Pseudo-Lidar representation has significantly narrowed the gap between visual-based and active Lidar-based 3D object detection. However, current research focuses exclusively on pushing the accuracy of Pseudo-Lidar by taking advantage of complex and time-consuming neural networks, and seldom explores the deeper characteristics of the Pseudo-Lidar representation for opportunities to improve it. In this paper, we dive deep into the Pseudo-Lidar representation and argue that the performance of 3D object detection does not fully depend on high-precision stereo depth estimation. We demonstrate that even an unreliable depth estimate, with proper data processing and refinement, can achieve comparable 3D object detection accuracy. Based on this finding, we further show that fast but inaccurate stereo matching algorithms can be used in a Pseudo-Lidar system to achieve low-latency responsiveness. In the experiments, we develop a system with a less powerful stereo matching predictor and adopt the proposed refinement schemes to improve its accuracy. The evaluation on the KITTI benchmark shows that the presented system achieves accuracy competitive with state-of-the-art approaches at only 23 ms of computation, making it a suitable candidate for deployment in real on-vehicle applications.
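Although the refinement schemes themselves are not spelled out in this abstract, the step that makes any Pseudo-Lidar system work is fixed: the estimated depth map is back-projected into a 3D point cloud with the camera intrinsics and then handed to an off-the-shelf Lidar-style detector. The snippet below is only a minimal NumPy sketch of that generic back-projection; the intrinsics, image size, and function name are assumed illustrative values, not taken from the paper.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy, max_depth=80.0):
    """Back-project an H x W depth map (metres) into an N x 3 pseudo-Lidar
    point cloud in the camera frame using pinhole intrinsics fx, fy, cx, cy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # per-pixel coordinates
    x = (u - cx) * depth / fx                        # right
    y = (v - cy) * depth / fy                        # down
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    # Discard invalid and very distant pixels, where cheap stereo matching
    # is least reliable and contributes little to 3D detection.
    return points[(points[:, 2] > 0) & (points[:, 2] < max_depth)]

# Dummy depth map at KITTI-like resolution with assumed intrinsics.
cloud = depth_to_pseudo_lidar(np.full((375, 1242), 10.0),
                              fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```

Any point-cloud 3D detector can consume the result; the paper's argument is that the depth map feeding this step does not need to come from a heavy stereo network, provided the resulting cloud is properly processed and refined afterwards.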
Related papers
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning [17.99904937160487]
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for the camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
- Rethinking Voxelization and Classification for 3D Object Detection [68.8204255655161]
The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network.
We present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer.
In addition, we propose a lightweight detection sub-head model that classifies predicted objects and filters out falsely detected ones.
arXiv Detail & Related papers (2023-01-10T16:22:04Z)
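The entry above mentions a fast dynamic voxelizer without further detail, so the following is only a hedged NumPy sketch of what dynamic voxelization usually means: every in-range point is mapped to a voxel index and grouped, with no fixed per-voxel point budget and no points dropped. The voxel size, point-cloud range, and mean-pooled feature are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def dynamic_voxelize(points, voxel_size=(0.2, 0.2, 4.0),
                     pc_range=(0.0, -40.0, -3.0, 70.4, 40.0, 1.0)):
    """Group an N x 3 point cloud into voxels dynamically: unlike 'hard'
    voxelization, no maximum number of points per voxel is imposed and no
    in-range point is discarded.  Returns voxel coordinates and a mean
    feature per voxel."""
    lo, hi = np.array(pc_range[:3]), np.array(pc_range[3:])
    pts = points[np.all((points >= lo) & (points < hi), axis=1)]
    coords = np.floor((pts - lo) / np.array(voxel_size)).astype(np.int64)
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    feats = np.zeros((len(voxels), 3))
    np.add.at(feats, inverse, pts)               # sum points per voxel
    feats /= np.bincount(inverse)[:, None]       # mean per voxel
    return voxels, feats

# Random points spread over an assumed KITTI-like detection range.
voxels, feats = dynamic_voxelize(
    np.random.uniform(low=(0.0, -40.0, -3.0), high=(70.0, 40.0, 1.0), size=(1000, 3)))
```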
- 3D Harmonic Loss: Towards Task-consistent and Time-friendly 3D Object Detection on Edge for Intelligent Transportation System [28.55894241049706]
We propose a 3D harmonic loss function to relieve inconsistent predictions in point cloud-based detection.
Our proposed method considerably improves performance over benchmark models.
Our code is open-source and publicly available.
arXiv Detail & Related papers (2022-11-07T10:11:48Z)
- Self-Configurable Stabilized Real-Time Detection Learning for Autonomous Driving Applications [15.689145350449737]
We improve the performance of an object detection neural network by utilizing optical flow estimation.
It adaptively determines whether to use optical flow to suit the dynamic vehicle environment.
In the demonstration, our proposed framework improves accuracy by 3.02%, the number of detected objects by 59.6%, and the stability of the computing queue.
arXiv Detail & Related papers (2022-09-29T03:11:33Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
We present PDC-Net+, an Enhanced Probabilistic Dense Correspondence Network capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z)
- Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z)
- Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved a great speed-accuracy trade-off.
We propose a sort of lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion.
Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z)
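Lite-FPN's exact architecture is not given in this summary, so the PyTorch snippet below only illustrates the general pattern the entry refers to: project each backbone level with a cheap 1x1 convolution, resample everything to one resolution, and fuse by summation. Channel counts, strides, and the class name are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightweightFusion(nn.Module):
    """Generic lightweight multi-scale fusion: 1x1 projections, bilinear
    upsampling to the finest level, and element-wise summation."""
    def __init__(self, in_channels=(64, 128, 256), out_channels=64):
        super().__init__()
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])

    def forward(self, feats):                       # feats: fine -> coarse
        target = feats[0].shape[-2:]
        fused = 0
        for f, proj in zip(feats, self.proj):
            fused = fused + F.interpolate(proj(f), size=target,
                                          mode="bilinear", align_corners=False)
        return fused

# Dummy backbone outputs at strides 8/16/32 of a 384x1280 image.
feats = [torch.randn(1, 64, 48, 160),
         torch.randn(1, 128, 24, 80),
         torch.randn(1, 256, 12, 40)]
out = LightweightFusion()(feats)                    # -> (1, 64, 48, 160)
```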
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- A Smooth Representation of Belief over SO(3) for Deep Rotation Learning with Uncertainty [33.627068152037815]
We present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models.
We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities.
This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models.
arXiv Detail & Related papers (2020-06-01T15:57:45Z)
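The symmetric-matrix rotation representation that entry describes is, to our understanding, one where the network regresses the ten free entries of a 4x4 symmetric matrix A and the predicted rotation is the unit quaternion given by the eigenvector of A with the smallest eigenvalue, with the eigenvalue spread carrying the uncertainty. The NumPy sketch below shows that decoding step under this assumption; it is not lifted from the paper's code.

```python
import numpy as np

def quaternion_from_symmetric_matrix(theta):
    """Decode a 10-vector into a symmetric 4x4 matrix A and return the unit
    quaternion given by the eigenvector of A with the smallest eigenvalue.
    q and -q describe the same rotation, so the map is antipodally symmetric."""
    A = np.zeros((4, 4))
    A[np.triu_indices(4)] = theta                # fill the upper triangle
    A = A + A.T - np.diag(np.diag(A))            # symmetrize
    eigvals, eigvecs = np.linalg.eigh(A)         # eigenvalues in ascending order
    return eigvecs[:, 0]                         # already unit length

q = quaternion_from_symmetric_matrix(np.random.randn(10))
```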
- Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
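The separable 3D convolution that entry mentions can be pictured as replacing one 3x3x3 kernel with a 1x3x3 kernel over the two spatial axes followed by a 3x1x1 kernel over the band axis. The PyTorch sketch below illustrates that factorization under the assumption that the spectral dimension is treated as the depth axis of Conv3d; channel counts and the class name are illustrative, not SSRNet's actual block.

```python
import torch
import torch.nn as nn

class SeparableConv3d(nn.Module):
    """Factorized 3D convolution: spatial (1x3x3) followed by spectral
    (3x1x1), cheaper in memory and compute than a full 3x3x3 kernel."""
    def __init__(self, channels=32):
        super().__init__()
        self.spatial = nn.Conv3d(channels, channels,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.spectral = nn.Conv3d(channels, channels,
                                  kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):       # x: (batch, channels, bands, height, width)
        return self.act(self.spectral(self.act(self.spatial(x))))

# A feature volume for a hyperspectral cube with 31 bands at 32x32 pixels.
y = SeparableConv3d()(torch.randn(1, 32, 31, 32, 32))   # shape preserved
```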
This list is automatically generated from the titles and abstracts of the papers on this site.