UDPNet: Unleashing Depth-based Priors for Robust Image Dehazing
- URL: http://arxiv.org/abs/2601.06909v2
- Date: Sun, 18 Jan 2026 07:53:45 GMT
- Title: UDPNet: Unleashing Depth-based Priors for Robust Image Dehazing
- Authors: Zengyuan Zuo, Junjun Jiang, Gang Wu, Xianming Liu
- Abstract summary: UDPNet is a general framework that leverages depth-based priors from a large-scale pretrained depth estimation model, DepthAnything V2. Our proposed solution establishes a new benchmark for depth-aware dehazing across various scenarios.
- Score: 77.10640210751981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image dehazing has witnessed significant advancements with the development of deep learning models. However, most existing methods focus solely on single-modal RGB features, neglecting the inherent correlation between scene depth and haze distribution. Even those that jointly optimize depth estimation and image dehazing often suffer from suboptimal performance due to inadequate utilization of accurate depth information. In this paper, we present UDPNet, a general framework that leverages depth-based priors from a large-scale pretrained depth estimation model, DepthAnything V2, to boost existing image dehazing models. Specifically, our architecture comprises two key components: the Depth-Guided Attention Module (DGAM) adaptively modulates features via lightweight depth-guided channel attention, and the Depth Prior Fusion Module (DPFM) enables hierarchical fusion of multi-scale depth map features via a dual sliding-window multi-head cross-attention mechanism. These modules ensure both computational efficiency and effective integration of depth priors. Moreover, the depth priors empower the network to dynamically adapt to varying haze densities, illumination conditions, and domain gaps across synthetic and real-world data. Extensive experimental results demonstrate the effectiveness of UDPNet, which outperforms state-of-the-art methods on popular dehazing datasets, with PSNR improvements of 0.85 dB on SOTS-indoor, 1.19 dB on Haze4K, and 1.79 dB on NHR. Our proposed solution establishes a new benchmark for depth-aware dehazing across various scenarios. Pretrained models and code are released at https://github.com/Harbinzzy/UDPNet.
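The correlation between scene depth and haze that the abstract relies on can be illustrated with the standard atmospheric scattering model, I = J*t + A*(1 - t) with t = exp(-beta * d). The listing does not state this model, so treat what follows as a minimal sketch under that assumption and not as UDPNet's method. The function names, the airlight value `A = 0.9`, and the scattering coefficient `beta = 1.0` are all hypothetical. The sketch also shows how PSNR, the metric quoted in dB above, is computed.

```python
import math


def transmission(depth, beta=1.0):
    """Transmission t = exp(-beta * depth); beta is an assumed scattering coefficient."""
    return math.exp(-beta * depth)


def add_haze(clear, depth, airlight=0.9, beta=1.0):
    """Apply I = J*t + A*(1 - t) per pixel.

    `clear` holds intensities in [0, 1]; `depth` holds matching scene depths.
    Both are flat lists of equal length (a stand-in for an image).
    """
    hazy = []
    for j, d in zip(clear, depth):
        t = transmission(d, beta)
        hazy.append(j * t + airlight * (1.0 - t))
    return hazy


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: farther pixels (larger depth) are pulled closer to the airlight.
clear = [0.2, 0.4, 0.6, 0.8]
depth = [0.5, 1.0, 2.0, 4.0]
hazy = add_haze(clear, depth)
print("hazy pixels:", hazy)
print("PSNR of hazy vs. clear (dB):", psnr(clear, hazy))
```

Because transmission decays exponentially with depth, distant pixels lose more of the clear signal, which is why a depth map is a plausible prior for where haze is densest.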
Related papers
- StarryGazer: Leveraging Monocular Depth Estimation Models for Domain-Agnostic Single Depth Image Completion [56.28564075246147]
StarryGazer is a framework that predicts dense depth images from a single sparse depth image and an RGB image. We employ a pre-trained MDE model to produce relative depth images. A refinement network is trained with the synthetic pairs, incorporating the relative depth maps and RGB images to improve the model's accuracy and robustness.
arXiv Detail & Related papers (2025-12-15T09:56:09Z) - BoRe-Depth: Self-supervised Monocular Depth Estimation with Boundary Refinement for Embedded Systems [14.113247032011282]
We propose a novel monocular depth estimation model, BoRe-Depth, which contains only 8.7M parameters. It can accurately estimate depth maps on embedded systems and significantly improves boundary quality. BoRe-Depth is deployed on NVIDIA Jetson Orin, and runs efficiently at 50.7 FPS.
arXiv Detail & Related papers (2025-11-06T14:17:33Z) - Propagating Sparse Depth via Depth Foundation Model for Out-of-Distribution Depth Completion [33.854696587141355]
We propose a novel depth completion framework that leverages depth foundation models to attain remarkable robustness without large-scale training. Specifically, we leverage a depth foundation model to extract environmental cues, including structural and semantic context, from RGB images to guide the propagation of sparse depth information into missing regions. Our framework performs remarkably well in the OOD scenarios and outperforms existing state-of-the-art depth completion methods.
arXiv Detail & Related papers (2025-08-07T02:38:24Z) - High-Precision Dichotomous Image Segmentation via Depth Integrity-Prior and Fine-Grained Patch Strategy [23.431898388115044]
High-precision dichotomous image segmentation (DIS) is a task of extracting fine-grained objects from high-resolution images. Existing methods face a dilemma: non-diffusion methods work efficiently but suffer from false or missed detections due to weak semantics. We find pseudo depth information from monocular depth estimation models can provide essential semantic understanding.
arXiv Detail & Related papers (2025-03-08T07:02:28Z) - Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z) - Mask-adaptive Gated Convolution and Bi-directional Progressive Fusion Network for Depth Completion [3.5940515868907164]
We propose a new model for depth completion based on an encoder-decoder structure. Our model introduces two key components: the Mask-adaptive Gated Convolution architecture and the Bi-directional Progressive Fusion module. We achieve remarkable performance in completing depth maps and outperform existing approaches in terms of accuracy and reliability.
arXiv Detail & Related papers (2024-01-15T02:58:06Z) - High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR).
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z) - HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation [14.81943833870932]
We present an improved DepthNet, HR-Depth, with two effective strategies.
Using ResNet-18 as the encoder, HR-Depth surpasses all previous state-of-the-art (SoTA) methods with the fewest parameters at both high and low resolution.
arXiv Detail & Related papers (2020-12-14T09:15:15Z) - Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the dual-pixel (DP) pair, which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)