Bi3D: Stereo Depth Estimation via Binary Classifications
- URL: http://arxiv.org/abs/2005.07274v2
- Date: Mon, 1 Jun 2020 16:44:34 GMT
- Title: Bi3D: Stereo Depth Estimation via Binary Classifications
- Authors: Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo
- Abstract summary: We present Bi3D, a method that estimates depth via a series of binary classifications.
Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds.
For standard stereo (i.e., continuous depth on the whole range), our method is close to or on par with state-of-the-art, finely tuned stereo methods.
- Score: 129.1582262491508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo-based depth estimation is a cornerstone of computer vision, with
state-of-the-art methods delivering accurate results in real time. For several
applications such as autonomous navigation, however, it may be useful to trade
accuracy for lower latency. We present Bi3D, a method that estimates depth via
a series of binary classifications. Rather than testing if objects are at a
particular depth $D$, as existing stereo methods do, it classifies them as
being closer or farther than $D$. This property offers a powerful mechanism to
balance accuracy and latency. Given a strict time budget, Bi3D can detect
objects closer than a given distance in as little as a few milliseconds, or
estimate depth with arbitrarily coarse quantization, with complexity linear
with the number of quantization levels. Bi3D can also use the allotted
quantization levels to get continuous depth, but in a specific depth range. For
standard stereo (i.e., continuous depth on the whole range), our method is
close to or on par with state-of-the-art, finely tuned stereo methods.
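
The mechanism described in the abstract, a cascade of "closer or farther than depth $D$" decisions with cost linear in the number of depth planes, can be made concrete with a small aggregation sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: it presumes a hypothetical binary-classification network has already produced a per-plane "closer than plane $i$" probability map, and only shows one simple way to turn those binary decisions into a quantized depth estimate.

```python
import numpy as np

def depth_from_binary_planes(prob_closer, plane_depths):
    """Turn per-plane binary classifications into a coarse depth map.

    prob_closer : (K, H, W) array; prob_closer[i] is the estimated probability
                  that a pixel lies closer than plane_depths[i].
    plane_depths: (K,) strictly increasing candidate depths.

    Reading the per-plane probabilities as a CDF over depth, differencing
    adjacent planes gives the probability mass of each depth bin; the expected
    bin centre is a simple quantized depth estimate. Cost is linear in K.
    """
    cdf = np.clip(prob_closer, 0.0, 1.0)
    # Probability mass of each of the K-1 bins between consecutive planes.
    bin_mass = np.clip(cdf[1:] - cdf[:-1], 0.0, None)           # (K-1, H, W)
    bin_mass /= bin_mass.sum(axis=0, keepdims=True) + 1e-8
    bin_centres = 0.5 * (plane_depths[1:] + plane_depths[:-1])  # (K-1,)
    return np.tensordot(bin_centres, bin_mass, axes=(0, 0))     # (H, W)

# Toy usage: 4 planes and a 2x2 image whose pixels sit between the 2nd and
# 3rd plane, so the estimate comes out near that bin centre (~3.0 here).
planes = np.array([1.0, 2.0, 4.0, 8.0])
probs = np.stack([np.full((2, 2), p) for p in (0.0, 0.05, 0.95, 1.0)])
print(depth_from_binary_planes(probs, planes))
```

Reading the per-plane outputs as a CDF over depth makes the trade-off explicit: fewer planes means fewer network evaluations and coarser quantization, which mirrors the accuracy/latency balance the abstract emphasizes.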
Related papers
- MonoCD: Monocular 3D Object Detection with Complementary Depths [9.186673054867866]
Depth estimation is an essential but challenging subtask of monocular 3D object detection.
We propose to increase the complementarity of depths with two novel designs.
Experiments on the KITTI benchmark demonstrate that our method achieves state-of-the-art performance without introducing extra data.
arXiv Detail & Related papers (2024-04-04T03:30:49Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features into 3D space and detect 3D objects there (the generic back-projection behind such lifting is sketched after this entry).
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
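
The "lift 2D image features to the 3D space" step mentioned above rests on ordinary pinhole back-projection. The sketch below shows only that generic geometry, assuming known intrinsics (fx, fy, cx, cy) and a per-pixel depth map; it is not the DfM architecture itself.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift every pixel (u, v) with depth Z to a 3D point in the camera frame.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    depth: (H, W) array of per-pixel depths in metres.
    Returns an (H, W, 3) array of XYZ coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Illustrative usage with a flat 2 m depth map and made-up intrinsics.
pts = backproject(np.full((4, 4), 2.0), fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(pts.shape)  # (4, 4, 3)
```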
- 360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network [35.03201732370496]
Single-view depth estimation from omnidirectional images has gained popularity owing to its wide range of applications such as autonomous driving and scene reconstruction.
In this work, we first establish a large-scale dataset with varied settings called Depth360 to tackle the training data problem.
We then propose an end-to-end two-branch multi-task learning network, SegFuse, that mimics the human eye to effectively learn from the dataset.
arXiv Detail & Related papers (2022-02-16T11:56:31Z)
- Depth Refinement for Improved Stereo Reconstruction [13.941756438712382]
Current techniques for depth estimation from stereoscopic images still suffer from a built-in drawback.
A simple analysis shows that the depth error grows quadratically with the object's distance (a worked example of this relation follows this entry).
We propose a simple but effective method that uses a refinement network for depth estimation.
arXiv Detail & Related papers (2021-12-15T12:21:08Z)
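
The quadratic-error claim above follows from the stereo triangulation relation Z = f * b / d: differentiating with respect to disparity gives |dZ| ≈ Z^2 * Δd / (f * b), so a fixed disparity error produces a depth error that grows quadratically with depth. A small numeric sketch; the rig parameters are illustrative (roughly KITTI-like) and not taken from the paper.

```python
def stereo_depth_error(depth_m, focal_px, baseline_m, disparity_err_px=0.25):
    """First-order depth error of a stereo rig.

    From Z = f * b / d, a disparity error dd maps to a depth error of roughly
    |dZ| = Z**2 * dd / (f * b), i.e. quadratic in the distance Z.
    """
    return depth_m ** 2 * disparity_err_px / (focal_px * baseline_m)

# Doubling the distance roughly quadruples the error.
for z in (10.0, 20.0, 40.0):
    print(z, "m ->", round(stereo_depth_error(z, focal_px=720.0, baseline_m=0.54), 3), "m")
```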
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- Stereo Object Matching Network [78.35697025102334]
This paper presents a stereo object matching method that exploits both 2D contextual information from images and 3D object-level information.
We present two novel strategies to handle 3D objectness in the cost volume space: selective sampling (RoISelect) and 2D-3D fusion.
arXiv Detail & Related papers (2021-03-23T12:54:43Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: depth estimation is performed first, a pseudo-LiDAR point cloud representation is computed from the depth estimates, and object detection is then performed in 3D metric space (a generic sketch of that metric-space rasterisation follows this entry).
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
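
The two-step baseline described above ends with detection in a metric 3D space. One common way to make that space concrete is to rasterise a (pseudo-)LiDAR point cloud into a bird's-eye-view grid; the sketch below is only that generic rasterisation, with illustrative ranges and cell size, and is not PLUME's unified feature volume.

```python
import numpy as np

def bev_occupancy(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), cell=0.5):
    """Rasterise an (N, 3) point cloud into a bird's-eye-view occupancy grid.

    Points are assumed to be in a metric frame with x forward and y lateral;
    every occupied cell is set to 1. Ranges and cell size are illustrative
    defaults, not values from the paper.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid = np.zeros((nx, ny), dtype=np.float32)
    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    grid[ix[keep], iy[keep]] = 1.0
    return grid
```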
- Center3D: Center-based Monocular 3D Object Detection with Joint Depth Understanding [3.4806267677524896]
Center3D is a one-stage anchor-free approach to efficiently estimate 3D location and depth.
By exploiting the difference between 2D and 3D centers, we are able to estimate depth consistently.
Compared with state-of-the-art detectors, Center3D has achieved the best speed-accuracy trade-off in realtime monocular object detection.
arXiv Detail & Related papers (2020-05-27T15:29:09Z)
- DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points [14.254472131009653]
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation.
Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems.
We propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs (a textbook triangulation routine for step (b) is sketched after this entry).
arXiv Detail & Related papers (2020-03-19T17:56:41Z)
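
Step (b) above, triangulating a small set of matched interest points, is classical multi-view geometry. For reference, the textbook direct linear transform (DLT) triangulation of a single correspondence is sketched below; DELTAS learns this step end to end rather than using this closed-form routine.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one matched point pair via the standard DLT method.

    P1, P2: (3, 4) camera projection matrices.
    x1, x2: (2,) pixel coordinates of the same 3D point in the two views.
    Returns the (3,) 3D point minimising the algebraic reprojection error.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                 # homogeneous solution: right singular vector
    return X[:3] / X[3]        # dehomogenise to metric coordinates
```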
- Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [92.84498980104424]
We put three different types of depth estimation into a common framework.
Our method produces a time series of depth maps.
It can be applied to monocular videos only or be combined with different types of sparse depth patterns.
arXiv Detail & Related papers (2020-01-08T16:50:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.