Unsupervised Depth and Ego-motion Estimation for Monocular Thermal Video
using Multi-spectral Consistency Loss
- URL: http://arxiv.org/abs/2103.00760v2
- Date: Wed, 3 Mar 2021 02:05:01 GMT
- Title: Unsupervised Depth and Ego-motion Estimation for Monocular Thermal Video
using Multi-spectral Consistency Loss
- Authors: Ukcheol Shin, Kyunghyun Lee, Seokju Lee, In So Kweon
- Abstract summary: We propose an unsupervised learning method for all-day depth and ego-motion estimation.
The proposed method exploits a multi-spectral consistency loss to give complementary supervision to the networks.
Networks trained with the proposed method robustly estimate the depth and pose from monocular thermal video under low-light and even zero-light conditions.
- Score: 76.77673212431152
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most deep-learning based depth and ego-motion networks have been
designed for visible cameras. However, visible cameras heavily rely on the
presence of an external light source, which makes them challenging to use
under low-light conditions such as night scenes, tunnels, and other harsh
environments. A thermal camera is one solution to this problem because it
detects Long Wave Infrared Radiation (LWIR) regardless of any external light
source. However, despite this advantage, depth and ego-motion estimation for
thermal cameras has not been actively explored so far. In this paper, we
propose an unsupervised learning method for all-day depth and ego-motion
estimation. The proposed method exploits a multi-spectral consistency loss
that gives complementary supervision to the networks by reconstructing
visible and thermal images with the depth and pose estimated from thermal
images. Networks trained with the proposed method robustly estimate depth
and pose from monocular thermal video under low-light and even zero-light
conditions. To the best of our knowledge, this is the first work to
simultaneously estimate both depth and ego-motion from monocular thermal
video in an unsupervised manner.
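To make the supervision signal concrete, below is a minimal PyTorch-style sketch of the multi-spectral consistency idea. It is an illustration under our own assumptions, not the authors' implementation: depth_net, pose_net, and the intrinsics K / K_inv are hypothetical placeholders, and the visible and thermal images are assumed spatially aligned for simplicity, whereas the paper uses the real thermal-visible camera geometry and richer photometric terms.

```python
# Minimal sketch (not the authors' code) of a multi-spectral
# consistency loss: depth and pose are estimated from thermal frames
# only, then used to reconstruct BOTH the thermal and the visible
# target images by inverse warping.
import torch
import torch.nn.functional as F

def inverse_warp(src_img, tgt_depth, T_t2s, K, K_inv):
    """Warp src_img into the target view given target-view depth and
    the relative pose T_t2s (target -> source, a 4x4 matrix)."""
    B, _, H, W = src_img.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=torch.float32),
        torch.arange(W, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)])        # (3, H, W)
    pix = pix.reshape(1, 3, -1).expand(B, -1, -1)           # (B, 3, H*W)
    # Back-project pixels to 3D with the predicted depth ...
    cam = (K_inv @ pix) * tgt_depth.reshape(B, 1, -1)
    cam = torch.cat([cam, torch.ones_like(cam[:, :1])], 1)  # homogeneous
    # ... transform into the source view and project with K.
    src = K @ (T_t2s @ cam)[:, :3]
    uv = src[:, :2] / src[:, 2:].clamp(min=1e-6)
    # Normalize coordinates to [-1, 1] for grid_sample.
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    return F.grid_sample(src_img, grid, align_corners=True)

def multispectral_consistency_loss(thr_tgt, thr_src, vis_tgt, vis_src,
                                   depth_net, pose_net, K, K_inv):
    depth = depth_net(thr_tgt)          # (B,1,H,W); thermal input only
    T_t2s = pose_net(thr_tgt, thr_src)  # (B,4,4);  thermal input only
    # The same thermal-derived geometry must explain both spectra,
    # which is what provides the complementary supervision.
    thr_recon = inverse_warp(thr_src, depth, T_t2s, K, K_inv)
    vis_recon = inverse_warp(vis_src, depth, T_t2s, K, K_inv)
    return ((thr_recon - thr_tgt).abs().mean()
            + (vis_recon - vis_tgt).abs().mean())
```

The key design choice, as the abstract describes, is that gradients from the visible-light reconstruction flow back into networks that only ever see thermal inputs, so the visible stream supplies the texture and contrast that raw thermal frames often lack.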
Related papers
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss steering a confidence predictor network to yield a confidence map specifying latent potential depth areas.
Guided by the resulting confidence map, a multi-modal fusion network then produces the final depth map in an end-to-end manner.
arXiv Detail & Related papers (2024-02-19T04:39:16Z)
- RIDERS: Radar-Infrared Depth Estimation for Robust Sensing [22.10378524682712]
Adverse weather conditions pose significant challenges to accurate dense depth estimation.
We present a novel approach for robust metric depth estimation by fusing a millimeter-wave Radar and a monocular infrared thermal camera.
Our method achieves exceptional visual quality and accurate metric estimation by addressing the challenges of ambiguity and misalignment.
arXiv Detail & Related papers (2024-02-03T07:14:43Z)
- Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras [33.77748026254935]
Cross-spectrum depth estimation aims to provide a depth map in all illumination conditions with a pair of dual-spectrum images.
In this paper, we propose an unsupervised visible-light image guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS in short) depth estimation framework.
Our method outperforms the compared existing methods.
arXiv Detail & Related papers (2022-04-30T12:58:35Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Maximizing Self-supervision from Thermal Image for Effective Self-supervised Learning of Depth and Ego-motion [78.19156040783061]
Self-supervised learning of depth and ego-motion from thermal images shows strong robustness and reliability under challenging scenarios.
Inherent thermal image properties such as weak contrast, blurry edges, and noise hinder the generation of effective self-supervision from thermal images.
We propose an effective thermal image mapping method that significantly increases image information, such as overall structure, contrast, and details, while preserving temporal consistency.
arXiv Detail & Related papers (2022-01-12T09:49:24Z)
- Full Surround Monodepth from Multiple Cameras [31.145598985137468]
We extend self-supervised monocular depth and ego-motion estimation to large-baseline multi-camera rigs.
We learn a single network generating dense, consistent, and scale-aware point clouds that cover the same full surround 360-degree field of view as a typical LiDAR scanner.
arXiv Detail & Related papers (2021-03-31T22:52:04Z)
- Self-Attention Dense Depth Estimation Network for Unrectified Video Sequences [6.821598757786515]
LiDAR and radar sensors are common hardware solutions for real-time depth estimation.
Deep learning based self-supervised depth estimation methods have shown promising results.
We propose a self-attention based depth and ego-motion network for unrectified images.
arXiv Detail & Related papers (2020-05-28T21:53:53Z)
- DiPE: Deeper into Photometric Errors for Unsupervised Learning of Depth and Ego-motion from Monocular Videos [9.255509741319583]
This paper shows that carefully manipulating photometric errors can better tackle difficulties such as occluded and nonstationary pixels.
The primary improvement is achieved by a statistical technique that can mask out the invisible or nonstationary pixels in the photometric error map.
We also propose an efficient weighted multi-scale scheme to reduce the artifacts in the predicted depth maps.
arXiv Detail & Related papers (2020-03-03T07:05:15Z)
- Video Depth Estimation by Fusing Flow-to-Depth Proposals [65.24533384679657]
We present an approach with a differentiable flow-to-depth layer for video depth estimation.
The model consists of a flow-to-depth layer, a camera pose refinement module, and a depth fusion network.
Our approach outperforms state-of-the-art depth estimation methods, and has reasonable cross dataset generalization capability.
arXiv Detail & Related papers (2019-12-30T10:45:57Z)