Bi3D: Stereo Depth Estimation via Binary Classifications
- URL: http://arxiv.org/abs/2005.07274v2
- Date: Mon, 1 Jun 2020 16:44:34 GMT
- Title: Bi3D: Stereo Depth Estimation via Binary Classifications
- Authors: Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo
- Abstract summary: We present Bi3D, a method that estimates depth via a series of binary classifications.
Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds.
For standard stereo (i.e., continuous depth on the whole range), our method is close to or on par with state-of-the-art, finely tuned stereo methods.
- Score: 129.1582262491508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo-based depth estimation is a cornerstone of computer vision, with
state-of-the-art methods delivering accurate results in real time. For several
applications such as autonomous navigation, however, it may be useful to trade
accuracy for lower latency. We present Bi3D, a method that estimates depth via
a series of binary classifications. Rather than testing if objects are at a
particular depth $D$, as existing stereo methods do, it classifies them as
being closer or farther than $D$. This property offers a powerful mechanism to
balance accuracy and latency. Given a strict time budget, Bi3D can detect
objects closer than a given distance in as little as a few milliseconds, or
estimate depth with arbitrarily coarse quantization, with complexity linear
with the number of quantization levels. Bi3D can also use the allotted
quantization levels to get continuous depth, but in a specific depth range. For
standard stereo (i.e., continuous depth on the whole range), our method is
close to or on par with state-of-the-art, finely tuned stereo methods.
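
The mechanism described in the abstract, a cascade of "closer or farther than depth $D$" decisions with cost linear in the number of depth planes, can be made concrete with a small aggregation sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: it presumes a hypothetical binary-classification network has already produced a per-plane "closer than plane $i$" probability map, and only shows one simple way to turn those binary decisions into a quantized depth estimate.

```python
import numpy as np

def depth_from_binary_planes(prob_closer, plane_depths):
    """Turn per-plane binary classifications into a coarse depth map.

    prob_closer : (K, H, W) array; prob_closer[i] is the estimated probability
                  that a pixel lies closer than plane_depths[i].
    plane_depths: (K,) strictly increasing candidate depths.

    Reading the per-plane probabilities as a CDF over depth, differencing
    adjacent planes gives the probability mass of each depth bin; the expected
    bin centre is a simple quantized depth estimate. Cost is linear in K.
    """
    cdf = np.clip(prob_closer, 0.0, 1.0)
    # Probability mass of each of the K-1 bins between consecutive planes.
    bin_mass = np.clip(cdf[1:] - cdf[:-1], 0.0, None)           # (K-1, H, W)
    bin_mass /= bin_mass.sum(axis=0, keepdims=True) + 1e-8
    bin_centres = 0.5 * (plane_depths[1:] + plane_depths[:-1])  # (K-1,)
    return np.tensordot(bin_centres, bin_mass, axes=(0, 0))     # (H, W)

# Toy usage: 4 planes and a 2x2 image whose pixels sit between the 2nd and
# 3rd plane, so the estimate comes out near that bin centre (~3.0 here).
planes = np.array([1.0, 2.0, 4.0, 8.0])
probs = np.stack([np.full((2, 2), p) for p in (0.0, 0.05, 0.95, 1.0)])
print(depth_from_binary_planes(probs, planes))
```

Reading the per-plane outputs as a CDF over depth makes the trade-off explicit: fewer planes means fewer network evaluations and coarser quantization, which mirrors the accuracy/latency balance the abstract emphasizes.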
Related papers
- MonoCD: Monocular 3D Object Detection with Complementary Depths [9.186673054867866]
Depth estimation is an essential but challenging subtask of monocular 3D object detection.
We propose to increase the complementarity of depths with two novel designs.
Experiments on the KITTI benchmark demonstrate that our method achieves state-of-the-art performance without introducing extra data.
arXiv Detail & Related papers (2024-04-04T03:30:49Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features into 3D space and detect 3D objects there (the generic back-projection behind such lifting is sketched after this entry).
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
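
The "lift 2D image features to the 3D space" step mentioned above rests on ordinary pinhole back-projection. The sketch below shows only that generic geometry, assuming known intrinsics (fx, fy, cx, cy) and a per-pixel depth map; it is not the DfM architecture itself.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift every pixel (u, v) with depth Z to a 3D point in the camera frame.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    depth: (H, W) array of per-pixel depths in metres.
    Returns an (H, W, 3) array of XYZ coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Illustrative usage with a flat 2 m depth map and made-up intrinsics.
pts = backproject(np.full((4, 4), 2.0), fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(pts.shape)  # (4, 4, 3)
```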
- 360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network [35.03201732370496]
Single-view depth estimation from omnidirectional images has gained popularity owing to its wide range of applications such as autonomous driving and scene reconstruction.
In this work, we first establish a large-scale dataset with varied settings called Depth360 to tackle the training data problem.
We then propose an end-to-end two-branch multi-task learning network, SegFuse, that mimics the human eye to effectively learn from the dataset.
arXiv Detail & Related papers (2022-02-16T11:56:31Z)
- Depth Refinement for Improved Stereo Reconstruction [13.941756438712382]
Current techniques for depth estimation from stereoscopic images still suffer from a built-in drawback.
A simple analysis shows that the depth error grows quadratically with the object's distance (a worked example of this relation follows this entry).
We propose a simple but effective method that uses a refinement network for depth estimation.
arXiv Detail & Related papers (2021-12-15T12:21:08Z)
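
The quadratic-error claim above follows from the stereo triangulation relation Z = f * b / d: differentiating with respect to disparity gives |dZ| ≈ Z^2 * Δd / (f * b), so a fixed disparity error produces a depth error that grows quadratically with depth. A small numeric sketch; the rig parameters are illustrative (roughly KITTI-like) and not taken from the paper.

```python
def stereo_depth_error(depth_m, focal_px, baseline_m, disparity_err_px=0.25):
    """First-order depth error of a stereo rig.

    From Z = f * b / d, a disparity error dd maps to a depth error of roughly
    |dZ| = Z**2 * dd / (f * b), i.e. quadratic in the distance Z.
    """
    return depth_m ** 2 * disparity_err_px / (focal_px * baseline_m)

# Doubling the distance roughly quadruples the error.
for z in (10.0, 20.0, 40.0):
    print(z, "m ->", round(stereo_depth_error(z, focal_px=720.0, baseline_m=0.54), 3), "m")
```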
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- Stereo Object Matching Network [78.35697025102334]
This paper presents a stereo object matching method that exploits both 2D contextual information from images and 3D object-level information.
We present two novel strategies to handle 3D objectness in the cost volume space: selective sampling (RoISelect) and 2D-3D fusion.
arXiv Detail & Related papers (2021-03-23T12:54:43Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: depth estimation is performed first, a pseudo-LiDAR point cloud representation is computed from the depth estimates, and object detection is then performed in 3D metric space (a generic sketch of that metric-space rasterisation follows this entry).
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
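
The two-step baseline described above ends with detection in a metric 3D space. One common way to make that space concrete is to rasterise a (pseudo-)LiDAR point cloud into a bird's-eye-view grid; the sketch below is only that generic rasterisation, with illustrative ranges and cell size, and is not PLUME's unified feature volume.

```python
import numpy as np

def bev_occupancy(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), cell=0.5):
    """Rasterise an (N, 3) point cloud into a bird's-eye-view occupancy grid.

    Points are assumed to be in a metric frame with x forward and y lateral;
    every occupied cell is set to 1. Ranges and cell size are illustrative
    defaults, not values from the paper.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid = np.zeros((nx, ny), dtype=np.float32)
    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    grid[ix[keep], iy[keep]] = 1.0
    return grid
```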
- Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding [3.4806267677524896]
Center3D is a one-stage anchor-free approach to efficiently estimate 3D location and depth.
By exploiting the difference between 2D and 3D centers, we are able to estimate depth consistently.
Compared with state-of-the-art detectors, Center3D has achieved the best speed-accuracy trade-off in realtime monocular object detection.
arXiv Detail & Related papers (2020-05-27T15:29:09Z)
- DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points [14.254472131009653]
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation.
Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems.
We propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs (a textbook triangulation routine for step (b) is sketched after this entry).
arXiv Detail & Related papers (2020-03-19T17:56:41Z)
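
Step (b) above, triangulating a small set of matched interest points, is classical multi-view geometry. For reference, the textbook direct linear transform (DLT) triangulation of a single correspondence is sketched below; DELTAS learns this step end to end rather than using this closed-form routine.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one matched point pair via the standard DLT method.

    P1, P2: (3, 4) camera projection matrices.
    x1, x2: (2,) pixel coordinates of the same 3D point in the two views.
    Returns the (3,) 3D point minimising the algebraic reprojection error.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                 # homogeneous solution: right singular vector
    return X[:3] / X[3]        # dehomogenise to metric coordinates
```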
- Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [92.84498980104424]
We put three different types of depth estimation into a common framework.
Our method produces a time series of depth maps.
It can be applied to monocular videos only or be combined with different types of sparse depth patterns.
arXiv Detail & Related papers (2020-01-08T16:50:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.