Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras
- URL: http://arxiv.org/abs/2205.00257v1
- Date: Sat, 30 Apr 2022 12:58:35 GMT
- Title: Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras
- Authors: Yubin Guo, Haobo Jiang, Xinlei Qi, Jin Xie, Cheng-Zhong Xu and Hui Kong
- Abstract summary: Cross-spectrum depth estimation aims to provide a depth map in all illumination conditions with a pair of dual-spectrum images.
In this paper, we propose an unsupervised visible-light image guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS in short) depth estimation framework.
Our method outperforms the existing methods it is compared against.
- Score: 33.77748026254935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-spectrum depth estimation aims to provide a depth map in all
illumination conditions with a pair of dual-spectrum images. It is valuable for
autonomous vehicle applications when the vehicle is equipped with two cameras
of different modalities. However, images captured by different-modality cameras
can be photometrically quite different. Therefore, cross-spectrum depth
estimation is a very challenging problem. Moreover, the shortage of large-scale
open-source datasets also retards further research in this field. In this
paper, we propose an unsupervised visible-light image guided cross-spectrum
(i.e., thermal and visible-light, TIR-VIS in short) depth estimation framework
given a pair of RGB and thermal images captured from a visible-light camera and
a thermal one. We first adopt a base depth estimation network using RGB-image
pairs. Then we propose a multi-scale feature transfer network to transfer
features from the TIR-VIS domain to the VIS domain at the feature level to fit
the trained depth estimation network. At last, we propose a cross-spectrum
depth cycle consistency to improve the depth result of dual-spectrum image
pairs. Meanwhile, we release to the community a large dual-spectrum depth
estimation dataset with visible-light and far-infrared stereo images captured
in different scenes. Experimental results show that our method outperforms the
existing methods it is compared against. Our dataset is available at
https://github.com/whitecrow1027/VIS-TIR-Datasets.
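The three-step pipeline the abstract describes (a depth network trained on VIS pairs, a multi-scale TIR-to-VIS feature transfer network, and a cross-spectrum depth cycle consistency) can be pictured with a minimal PyTorch sketch. All module names, channel sizes, and the exact loss form below are illustrative assumptions, not the authors' released code:

```python
import torch
import torch.nn as nn

class MultiScaleFeatureTransfer(nn.Module):
    """Hypothetical adapter: maps TIR encoder features into the VIS
    feature space, one small residual adapter per scale."""
    def __init__(self, channels=(64, 128, 256)):
        super().__init__()
        self.adapters = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(c, c, 3, padding=1),
            ) for c in channels
        ])

    def forward(self, tir_feats):
        # tir_feats: one feature map per scale from the TIR encoder
        return [f + a(f) for a, f in zip(self.adapters, tir_feats)]

def depth_cycle_consistency(depth_from_vis, depth_from_tir):
    """Cross-spectrum depth cycle consistency (assumed L1 form): depth
    predicted through the VIS branch and through the transferred TIR
    branch should agree on the same scene."""
    return (depth_from_vis - depth_from_tir).abs().mean()
```

Presumably the depth network pretrained on VIS pairs stays fixed while the transfer network is fitted so that transferred TIR features look like VIS features to it; the cycle term then couples the two spectral branches.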
Related papers
- Cross-spectral Gated-RGB Stereo Depth Estimation [34.31592077757453]
Gated cameras flood-illuminate a scene and capture the time-gated impulse response of the scene.
We propose a novel stereo-depth estimation method that is capable of exploiting these multi-modal multi-view depth cues.
The proposed method achieves accurate depth at long ranges, outperforming the next best existing method by 39% for ranges of 100 to 220m in MAE on accumulated LiDAR ground-truth.
arXiv Detail & Related papers (2024-05-21T13:10:43Z)
- RBF Weighted Hyper-Involution for RGB-D Object Detection [0.0]
We propose a real-time, two-stream RGB-D object detection model.
The proposed model consists of two new components: a depth-guided hyper-involution that adapts dynamically based on the spatial interaction pattern in the raw depth map, and an up-sampling based trainable fusion layer.
We show that the proposed model outperforms other RGB-D based object detection models on the NYU Depth v2 dataset and achieves comparable (second-best) results on SUN RGB-D.
arXiv Detail & Related papers (2023-09-30T11:25:34Z)
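As a rough illustration of the depth-guided hyper-involution idea above (per-pixel spatial kernels generated from the raw depth map rather than from the RGB features), here is a minimal PyTorch sketch; the paper's RBF weighting and exact generator network are not reproduced, and all names are hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedInvolution(nn.Module):
    """Involution whose K x K spatial kernels are predicted from depth,
    so filtering adapts to the local depth pattern (channels must be
    divisible by groups)."""
    def __init__(self, channels, kernel_size=3, groups=4):
        super().__init__()
        self.k, self.g = kernel_size, groups
        # Hypothetical kernel generator: depth -> one kernel per pixel/group
        self.kernel_gen = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, groups * kernel_size * kernel_size, 1),
        )

    def forward(self, x, depth):
        b, c, h, w = x.shape
        k, g = self.k, self.g
        kernels = self.kernel_gen(depth).view(b, g, k * k, h, w)
        patches = F.unfold(x, k, padding=k // 2).view(b, g, c // g, k * k, h, w)
        # Weight each K*K neighborhood by its depth-derived kernel and sum
        return (kernels.unsqueeze(2) * patches).sum(dim=3).reshape(b, c, h, w)

# e.g.: DepthGuidedInvolution(64)(torch.randn(1, 64, 32, 32),
#                                 torch.randn(1, 1, 32, 32))
```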
- AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion [1.8820731605557168]
We propose a new model for depth image completion based on the Attention Guided Gated-convolutional Network (AGG-Net).
In the encoding stage, an Attention Guided Gated-Convolution (AG-GConv) module is proposed to realize the fusion of depth and color features at different scales.
In the decoding stage, an Attention Guided Skip Connection (AG-SC) module is presented to avoid introducing too many depth-irrelevant features to the reconstruction.
arXiv Detail & Related papers (2023-09-04T14:16:08Z)
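The gated-fusion idea behind AG-GConv (one modality gating which features of the other pass through during fusion) might look like this toy PyTorch layer; sizes and names are illustrative assumptions, not the authors' exact design:

```python
import torch
import torch.nn as nn

class AttentionGuidedGatedConv(nn.Module):
    """Toy gated convolution: a gate conditioned on both depth and color
    features decides which depth features contribute to the fused output."""
    def __init__(self, depth_ch, color_ch, out_ch):
        super().__init__()
        self.feature = nn.Conv2d(depth_ch, out_ch, 3, padding=1)
        # Gate in (0, 1), computed from the concatenated modalities
        self.gate = nn.Sequential(
            nn.Conv2d(depth_ch + color_ch, out_ch, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, depth_feat, color_feat):
        gate = self.gate(torch.cat([depth_feat, color_feat], dim=1))
        return self.feature(depth_feat) * gate
```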
- Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on DDAD and NuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
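A compact sketch of cross-view fusion with attention, in the spirit of SurroundDepth's cross-view transformer: each camera's feature tokens attend to the tokens of all surrounding views. The single attention layer and the token layout are simplifications assumed for illustration:

```python
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats):
        # feats: (batch, views, height*width, dim) tokens per camera
        b, v, n, d = feats.shape
        tokens = feats.reshape(b, v * n, d)   # pool tokens from all views
        fused, _ = self.attn(tokens, tokens, tokens)
        return self.norm(tokens + fused).reshape(b, v, n, d)
```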
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
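The multi-task wiring such a model unifies (one shared feature trunk feeding separate heads for saliency, depth, and contours) can be sketched generically in PyTorch; this shows only the head arrangement, not the paper's filtered transformer:

```python
import torch
import torch.nn as nn

class MultiTaskHeads(nn.Module):
    """Hypothetical heads over a shared feature map for the three tasks."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.saliency = nn.Conv2d(feat_ch, 1, 1)   # salient-object map
        self.depth = nn.Conv2d(feat_ch, 1, 1)      # depth map
        self.contour = nn.Conv2d(feat_ch, 1, 1)    # salient contours

    def forward(self, shared_feat):
        return {
            "saliency": torch.sigmoid(self.saliency(shared_feat)),
            "depth": self.depth(shared_feat),
            "contour": torch.sigmoid(self.contour(shared_feat)),
        }
```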
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
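A simplified take on the unified prediction/completion idea: one depth network serves both RGB-only prediction and completion with optional sparse measurements, via an auxiliary branch that is skipped when no sparse depth is available. Purely illustrative; SAN's sparse convolutions are not reproduced:

```python
import torch
import torch.nn as nn

class UnifiedDepthNet(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        # Auxiliary branch: consumes sparse depth + validity mask when present
        self.sparse_enc = nn.Sequential(
            nn.Conv2d(2, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(feat_ch, 1, 3, padding=1)

    def forward(self, rgb, sparse_depth=None):
        feat = self.rgb_enc(rgb)
        if sparse_depth is not None:
            mask = (sparse_depth > 0).float()   # valid-measurement mask
            feat = feat + self.sparse_enc(
                torch.cat([sparse_depth, mask], dim=1))
        return self.head(feat)
```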
- Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the dual-pixel (DP) pair, which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, instead of different views, we rely on depth-from-defocus cues.
We present results that are on par with supervised methods on the KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
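The training signal in depth-from-defocus methods comes from the thin-lens relation between scene depth and blur: a point at depth d, with the lens focused at distance s, produces a circle of confusion whose diameter grows with |d - s|. A small sketch of that standard relation (SI units; the parameter values are arbitrary examples, not from the paper):

```python
import torch

def circle_of_confusion(depth, focus_dist, focal_len, aperture):
    """Blur-circle diameter (meters) under the thin-lens model."""
    return (aperture * (depth - focus_dist).abs() / depth
            * focal_len / (focus_dist - focal_len))

depth = torch.tensor([0.5, 1.0, 2.0, 8.0])        # scene depths in meters
coc = circle_of_confusion(depth, focus_dist=1.0,  # focused at 1 m
                          focal_len=0.05,         # 50 mm lens
                          aperture=0.025)         # 25 mm aperture (f/2)
print(coc)  # zero blur at the focal plane, growing away from it
```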
This list is automatically generated from the titles and abstracts of the papers on this site.