Object Semantics Give Us the Depth We Need: Multi-task Approach to
Aerial Depth Completion
- URL: http://arxiv.org/abs/2304.12542v1
- Date: Tue, 25 Apr 2023 03:21:32 GMT
- Title: Object Semantics Give Us the Depth We Need: Multi-task Approach to
Aerial Depth Completion
- Authors: Sara Hatami Gazani, Fardad Dadboud, Miodrag Bolic, Iraj Mantegh,
Homayoun Najjaran
- Abstract summary: We propose a novel approach to jointly execute the two tasks in a single pass.
The proposed method is based on an encoder-focused multi-task learning model that exposes the two tasks to jointly learned features.
Experimental results show that the proposed multi-task network outperforms its single-task counterpart.
- Score: 1.2239546747355885
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Depth completion and object detection are two crucial tasks often used for
aerial 3D mapping, path planning, and collision avoidance of Uncrewed Aerial
Vehicles (UAVs). Common solutions include using measurements from a LiDAR
sensor; however, the generated point cloud is often sparse and irregular and
limits the system's capabilities in 3D rendering and safety-critical
decision-making. To mitigate this challenge, information from other sensors on
the UAV (viz., a camera used for object detection) is utilized to help the
depth completion process generate denser 3D models. Performing both aerial
depth completion and object detection tasks while fusing the data from the two
sensors poses a challenge to resource efficiency. We address this challenge by
proposing a novel approach to jointly execute the two tasks in a single pass.
The proposed method is based on an encoder-focused multi-task learning model
that exposes the two tasks to jointly learned features. We demonstrate how
semantic expectations of the objects in the scene learned by the object
detection pathway can boost the performance of the depth completion pathway
while placing the missing depth values. Experimental results show that the
proposed multi-task network outperforms its single-task counterpart,
particularly when exposed to defective inputs.
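The abstract's core idea, an encoder-focused multi-task design in which a single shared encoder produces jointly learned features that feed both a depth-completion head and an object-detection head in one forward pass, can be sketched minimally as below. This is an illustrative sketch only: all layer sizes, weight names, and the simple linear/ReLU layers are hypothetical and are not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
IN_DIM, FEAT_DIM, DEPTH_DIM, DET_DIM = 32, 16, 8, 4

# Shared encoder weights: both task pathways read the same learned features.
W_enc = rng.standard_normal((IN_DIM, FEAT_DIM)) * 0.1
# Task-specific head weights (one per pathway).
W_depth = rng.standard_normal((FEAT_DIM, DEPTH_DIM)) * 0.1
W_det = rng.standard_normal((FEAT_DIM, DET_DIM)) * 0.1

def forward(x):
    """Single pass: shared encoder features feed both task heads."""
    shared = np.maximum(x @ W_enc, 0.0)  # shared encoder (ReLU)
    depth = shared @ W_depth             # depth-completion pathway
    det = shared @ W_det                 # object-detection pathway
    return depth, det

x = rng.standard_normal((2, IN_DIM))     # batch of 2 fused sensor inputs
depth_out, det_out = forward(x)
print(depth_out.shape, det_out.shape)    # (2, 8) (2, 4)
```

The point of the sketch is the resource-efficiency argument from the abstract: the encoder is evaluated once per input, and both tasks are served from that single pass instead of running two separate networks.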
Related papers
- OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection [102.0744303467713]
We propose a new multi-view 3D object detector named OPEN.
Our main idea is to effectively inject object-wise depth information into the network through our proposed object-wise position embedding.
OPEN achieves a new state-of-the-art performance with 64.4% NDS and 56.7% mAP on the nuScenes test benchmark.
arXiv Detail & Related papers (2024-07-15T14:29:15Z) - Depth-discriminative Metric Learning for Monocular 3D Object Detection [14.554132525651868]
We introduce a novel metric learning scheme that encourages the model to extract depth-discriminative features regardless of the visual attributes.
Our method consistently improves the performance of various baselines by 23.51% and 5.78% on average.
arXiv Detail & Related papers (2024-01-02T07:34:09Z) - MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in
Adverse Scenes [49.21187418886508]
This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP.
We first introduce an adaptive learning strategy to help the model handle uncontrollable weather conditions, significantly resisting degradation caused by various adverse factors.
Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
arXiv Detail & Related papers (2023-05-18T13:42:02Z) - Joint Learning of Salient Object Detection, Depth Estimation and Contour
Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD)
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - Geometry Uncertainty Projection Network for Monocular 3D Object
Detection [138.24798140338095]
We propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
Specifically, a GUP module is proposed to obtain the geometry-guided uncertainty of the inferred depth.
At the training stage, we propose a Hierarchical Task Learning strategy to reduce the instability caused by error amplification.
arXiv Detail & Related papers (2021-07-29T06:59:07Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two stages: depth estimation is performed first and a pseudo-LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Multi-Task Multi-Sensor Fusion for 3D Object Detection [93.68864606959251]
We present an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion.
Our experiments show that all these tasks are complementary and help the network learn better representations by fusing information at various levels.
arXiv Detail & Related papers (2020-12-22T22:49:15Z) - Monocular 3D Object Detection with Sequential Feature Association and
Depth Hint Augmentation [12.55603878441083]
FADNet is presented to address the task of monocular 3D object detection.
A dedicated depth hint module is designed to generate row-wise features named as depth hints.
The contributions of this work are validated by conducting experiments and ablation study on the KITTI benchmark.
arXiv Detail & Related papers (2020-11-30T07:19:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.