DEHRFormer: Real-time Transformer for Depth Estimation and Haze Removal from Varicolored Haze Scenes
- URL: http://arxiv.org/abs/2303.06905v1
- Date: Mon, 13 Mar 2023 07:47:18 GMT
- Title: DEHRFormer: Real-time Transformer for Depth Estimation and Haze Removal from Varicolored Haze Scenes
- Authors: Sixiang Chen, Tian Ye, Jun Shi, Yun Liu, JingXia Jiang, Erkang Chen,
Peng Chen
- Abstract summary: We propose a real-time transformer for simultaneous single image Depth Estimation and Haze Removal.
DEHRFormer consists of a single encoder and two task-specific decoders.
We introduce a novel learning paradigm that utilizes contrastive learning and domain consistency learning to tackle the weak-generalization problem in real-world dehazing.
- Score: 10.174140482558904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Varicolored haze caused by chromatic casts poses challenges for both
haze removal and depth estimation. Recent learning-based depth estimation methods
mainly dehaze first and then estimate depth from the haze-free scene. In this
way, the inner connections between colored haze and scene depth are lost. In
this paper, we propose a real-time transformer for
simultaneous single image Depth Estimation and Haze Removal (DEHRFormer).
DEHRFormer consists of a single encoder and two task-specific decoders. The
transformer decoders with learnable queries are designed to decode coupling
features from the task-agnostic encoder and project them into a clean image and
a depth map, respectively. In addition, we introduce a novel learning paradigm
that utilizes contrastive learning and domain consistency learning to tackle the
weak-generalization problem in real-world dehazing, while predicting the same
depth map from the same scene with varicolored haze. Experiments demonstrate
that DEHRFormer achieves significant performance improvement across diverse
varicolored haze scenes over previous depth estimation networks and dehazing
approaches.
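The coupling between colored haze and scene depth that the abstract alludes to follows from the standard atmospheric scattering model: transmission decays exponentially with depth, so distant pixels are increasingly dominated by a chromatic airlight. As a rough illustration of that physics (not code from the paper; the airlight color and scattering coefficient below are arbitrary example values), varicolored haze can be synthesized from a clean image and a depth map like this:

```python
# Standard atmospheric scattering model with a chromatic (colored) airlight A:
#   I(x) = J(x) * t(x) + A * (1 - t(x)),  with  t(x) = exp(-beta * d(x))
# Large depth d(x) drives t(x) toward 0, so far-away pixels take on the airlight
# color, which is how varicolored haze encodes scene depth. Values are illustrative.
import numpy as np

def add_varicolored_haze(clean, depth, airlight=(0.9, 0.8, 0.5), beta=1.2):
    """clean: HxWx3 image in [0, 1]; depth: HxW in any consistent distance unit."""
    t = np.exp(-beta * depth)[..., None]       # per-pixel transmission, HxWx1
    A = np.asarray(airlight).reshape(1, 1, 3)  # e.g. a yellowish color cast
    hazy = clean * t + A * (1.0 - t)           # scattering model
    return np.clip(hazy, 0.0, 1.0)
```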
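The single-encoder, dual-decoder layout can also be pictured with a short sketch. The PyTorch snippet below is only a minimal illustration of a task-agnostic encoder feeding two query-based, task-specific decoders; the module names, feature width, query count, layer depths, and output resolution are assumptions chosen for brevity, not the authors' actual DEHRFormer implementation:

```python
# Minimal sketch of a shared-encoder / dual-decoder transformer in the spirit of
# the description above. All sizes and design details are illustrative assumptions.
import torch
import torch.nn as nn

class TaskAgnosticEncoder(nn.Module):
    """Hypothetical encoder: patchifies the hazy image into token features."""
    def __init__(self, dim=256):
        super().__init__()
        self.stem = nn.Conv2d(3, dim, kernel_size=4, stride=4)  # 4x4 patch embedding
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, x):
        feat = self.stem(x)                        # B x dim x H/4 x W/4
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)   # B x (H/4 * W/4) x dim
        return self.blocks(tokens), (h, w)

class QueryDecoder(nn.Module):
    """Hypothetical task-specific decoder driven by learnable queries."""
    def __init__(self, dim=256, num_queries=64, out_channels=3):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        layer = nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Conv2d(dim, out_channels, kernel_size=1)

    def forward(self, tokens, hw):
        b = tokens.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        q = self.decoder(q, tokens)                      # queries attend to encoder tokens
        attn = torch.softmax(q @ tokens.transpose(1, 2), dim=1)
        mixed = attn.transpose(1, 2) @ q                 # scatter query content back to pixels
        h, w = hw
        mixed = mixed.transpose(1, 2).reshape(b, -1, h, w)
        return self.head(mixed)                          # B x out_channels x H/4 x W/4

class JointDehazeDepthNet(nn.Module):
    """Shared encoder feeding a dehazing decoder and a depth decoder."""
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = TaskAgnosticEncoder(dim)
        self.dehaze_decoder = QueryDecoder(dim, out_channels=3)
        self.depth_decoder = QueryDecoder(dim, out_channels=1)

    def forward(self, hazy):
        tokens, hw = self.encoder(hazy)
        return self.dehaze_decoder(tokens, hw), self.depth_decoder(tokens, hw)
```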
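The abstract's learning paradigm combines contrastive learning (to address weak generalization on real-world haze) with domain consistency learning (the same scene rendered under differently colored haze should yield the same depth). The sketch below only conveys that idea; the feature extractor `feat_fn`, the specific loss forms, and the weighting are assumptions, not the paper's actual objectives:

```python
# Illustrative training losses for the contrastive / domain-consistency idea.
# Everything here is an assumed sketch, not the paper's formulation.
import torch.nn.functional as F

def contrastive_dehaze_loss(pred_clean, clean_gt, hazy_input, feat_fn):
    """Pull the prediction toward the clean target and push it away from the hazy
    input in a feature space given by `feat_fn` (e.g. a frozen backbone; assumed)."""
    f_pred, f_pos, f_neg = feat_fn(pred_clean), feat_fn(clean_gt), feat_fn(hazy_input)
    return F.l1_loss(f_pred, f_pos) / (F.l1_loss(f_pred, f_neg) + 1e-6)

def domain_consistency_loss(depth_a, depth_b):
    """Same scene under two different colored-haze renderings: depths should agree."""
    return F.l1_loss(depth_a, depth_b)

# Hypothetical use inside one training step:
#   clean_a, depth_a = model(hazy_yellowish)   # same scene, yellowish cast
#   clean_b, depth_b = model(hazy_bluish)      # same scene, bluish cast
#   loss = (F.l1_loss(clean_a, clean_gt)
#           + contrastive_dehaze_loss(clean_a, clean_gt, hazy_yellowish, feat_fn)
#           + domain_consistency_loss(depth_a, depth_b))
```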
Related papers
- HazyDet: Open-source Benchmark for Drone-view Object Detection with Depth-cues in Hazy Scenes [31.411806708632437]
We introduce HazyDet, a dataset tailored for drone-based object detection in hazy scenes.
It encompasses 383,000 real-world instances, collected from both naturally hazy environments and normal scenes.
By observing the significant variations in object scale and clarity under different depth and haze conditions, we designed a Depth Conditioned Detector.
arXiv Detail & Related papers (2024-09-30T00:11:40Z)
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z) - Back to the Color: Learning Depth to Specific Color Transformation for Unsupervised Depth Estimation [45.07558105128673]
Discrepancies between synthetic and real-world colors pose significant challenges for depth estimation in real-world scenes.
We propose Back2Color, a framework that predicts realistic colors from depth using a model trained on real-world data.
We also present VADepth, based on the Vision Attention Network, which offers lower computational complexity and higher accuracy than transformers.
arXiv Detail & Related papers (2024-06-11T21:55:20Z)
- Depth Estimation from a Single Optical Encoded Image using a Learned Colored-Coded Aperture [18.830374973687416]
State-of-the-art approaches improve the discrimination between different depths by introducing a binary-coded aperture (CA) in the lens aperture.
Color-coded apertures (CCA) can also produce color misalignment in the captured image, which can be utilized to estimate disparity.
We propose a CCA with a greater number of color filters and richer spectral information to optically encode relevant depth information in a single snapshot.
arXiv Detail & Related papers (2023-09-14T21:30:55Z)
- SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency [51.92434113232977]
This work presents an effective depth-consistency self-prompt Transformer for image dehazing.
It is motivated by the observation that the estimated depths of an image with haze residuals and of its clear counterpart differ.
By incorporating the prompt, prompt embedding, and prompt attention into an encoder-decoder network based on VQGAN, we can achieve better perception quality.
arXiv Detail & Related papers (2023-03-13T11:47:24Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- Progressive Depth Learning for Single Image Dehazing [56.71963910162241]
Existing dehazing methods often ignore the depth cues and fail in distant areas where heavier haze disturbs the visibility.
We propose a deep end-to-end model that iteratively estimates image depths and transmission maps.
Our approach benefits from explicitly modeling the inner relationship of image depth and transmission map, which is especially effective for distant hazy areas.
arXiv Detail & Related papers (2021-02-21T05:24:18Z)
- Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the DP pair, which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z)
- DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data [110.29043712400912]
We present a method for depth estimation with monocular images, which can predict high-quality depth on diverse scenes up to an affine transformation.
Experiments show that our method outperforms previous methods on 8 datasets by a large margin under the zero-shot test setting.
arXiv Detail & Related papers (2020-02-03T05:38:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.