Does depth estimation help object detection?
- URL: http://arxiv.org/abs/2204.06512v1
- Date: Wed, 13 Apr 2022 17:03:25 GMT
- Title: Does depth estimation help object detection?
- Authors: Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas
- Abstract summary: Many factors affect the performance of object detection when estimated depth is used.
We propose an early concatenation strategy for depth, which yields higher mAP than previous works while using significantly fewer parameters.
- Score: 16.904673709059622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ground-truth depth, when combined with color data, helps improve object
detection accuracy over baseline models that only use color. However, estimated
depth does not always yield improvements. Many factors affect the performance
of object detection when estimated depth is used. In this paper, we
comprehensively investigate these factors with detailed experiments, such as
using ground-truth vs. estimated depth, effects of different state-of-the-art
depth estimation networks, effects of using different indoor and outdoor RGB-D
datasets as training data for depth estimation, and different architectural
choices for integrating depth into the base object detector network. We propose
an early concatenation strategy for depth, which yields higher mAP than
previous works while using significantly fewer parameters.
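The "early concatenation" the abstract describes can be illustrated with a minimal sketch: the depth map is appended to the RGB image as a fourth input channel before the detector backbone, so only the first convolution layer needs to change. The function name and the min-max normalization are illustrative assumptions, not details from the paper.

```python
import numpy as np

def early_concat(rgb, depth):
    """Sketch of early depth concatenation (hypothetical helper).

    rgb:   (H, W, 3) float array of color channels
    depth: (H, W) float array from a depth sensor or estimator
    Returns a (H, W, 4) RGB-D tensor for a 4-channel backbone input.
    """
    # Normalize depth to [0, 1] so its scale roughly matches color channels
    # (assumed preprocessing; the paper may normalize differently).
    d = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
    return np.concatenate([rgb, d[..., None]], axis=-1)

rgbd = early_concat(np.random.rand(480, 640, 3), np.random.rand(480, 640))
print(rgbd.shape)  # (480, 640, 4)
```

The appeal of this design, as the abstract notes, is parameter efficiency: instead of a second backbone for depth, only the first convolution grows from 3 to 4 input channels.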
Related papers
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- Depth-discriminative Metric Learning for Monocular 3D Object Detection [14.554132525651868]
We introduce a novel metric learning scheme that encourages the model to extract depth-discriminative features regardless of the visual attributes.
Our method consistently improves the performance of various baselines by 23.51% and 5.78% on average.
arXiv Detail & Related papers (2024-01-02T07:34:09Z)
- Depth-Relative Self Attention for Monocular Depth Estimation [23.174459018407003]
Deep neural networks rely on various visual cues such as size, shade, and texture extracted from RGB information.
We propose a novel depth estimation model named RElative Depth Transformer (RED-T) that uses relative depth as guidance in self-attention.
We show that the proposed model achieves competitive results in monocular depth estimation benchmarks and is less biased to RGB information.
arXiv Detail & Related papers (2023-04-25T14:20:31Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD)
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Depth-Guided Camouflaged Object Detection [31.99397550848777]
Research in biology suggests that depth can provide useful object localization cues for camouflaged object discovery.
However, depth information has not previously been exploited for camouflaged object detection.
We present a depth-guided camouflaged object detection network with pre-computed depth maps from existing monocular depth estimation methods.
arXiv Detail & Related papers (2021-06-24T17:51:31Z)
- Depth Completion using Plane-Residual Representation [84.63079529738924]
We introduce a novel way of interpreting depth information as the closest depth plane label $p$ and a residual value $r$, which we call the Plane-Residual (PR) representation.
By interpreting depth information in PR representation and using our corresponding depth completion network, we were able to acquire improved depth completion performance with faster computation.
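The PR encoding described above can be sketched in a few lines: each depth value becomes the index of its nearest discrete depth plane plus a signed residual relative to that plane. The number and spacing of planes here are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical plane set: 11 evenly spaced depth planes from 0 to 10 metres.
planes = np.linspace(0.0, 10.0, 11)

def to_pr(depth):
    """Encode depth values as (plane label p, residual r)."""
    p = np.abs(depth[..., None] - planes).argmin(axis=-1)  # closest plane
    r = depth - planes[p]                                  # signed residual
    return p, r

def from_pr(p, r):
    """Decode: plane position plus residual recovers the depth exactly."""
    return planes[p] + r

d = np.array([0.3, 4.7, 9.96])
p, r = to_pr(d)
print(p)                                   # [ 0  5 10]
print(np.allclose(from_pr(p, r), d))       # True
```

Splitting depth this way turns a hard regression over a large range into a coarse classification (the plane label) plus a small bounded regression (the residual), which is the intuition behind the claimed speed and accuracy gains.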
arXiv Detail & Related papers (2021-04-15T10:17:53Z)
- Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion.
By assuming depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases.
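The low-dimensional-subspace idea can be sketched with PCA on toy data: fit principal bases over flattened training depth maps, then reconstruct any map from a handful of coefficients. The basis count and toy data here are assumptions for illustration; the paper learns its bases rather than using plain PCA.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.random((100, 24 * 32))       # 100 toy "depth maps", 24x32 each
mean = train.mean(axis=0)

# SVD of the centred data yields principal depth bases (rows of Vt).
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
bases = Vt[:16]                          # keep 16 principal bases

def approximate(depth_map):
    """Approximate a depth map as mean + weighted sum of principal bases."""
    w = bases @ (depth_map - mean)       # project onto bases -> weights
    return mean + w @ bases              # weighted sum of bases

x = train[0]
# Orthogonal projection guarantees the low-rank approximation is no worse
# (in L2) than predicting the mean depth map alone.
print(np.linalg.norm(approximate(x) - x) <= np.linalg.norm(x - mean))
```

The payoff is efficiency: a full-resolution dense map is summarized by 16 scalars, so a network only needs to regress those weights instead of every pixel.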
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely on depth-from-focus cues instead of different views.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.