Depth-agnostic Single Image Dehazing
- URL: http://arxiv.org/abs/2401.07213v1
- Date: Sun, 14 Jan 2024 06:33:11 GMT
- Title: Depth-agnostic Single Image Dehazing
- Authors: Honglei Xu and Yan Shu and Shaohui Liu
- Abstract summary: We propose a simple yet novel synthetic method to decouple the relationship between haze density and scene depth, by which a depth-agnostic dataset (DA-HAZE) is generated.
Experiments indicate that models trained on DA-HAZE achieve significant improvements on real-world benchmarks, with less discrepancy between SOTS and DA-SOTS.
We revisit U-Net-based architectures for dehazing, in which specially designed blocks are incorporated.
- Score: 12.51359372069387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Single image dehazing is a challenging ill-posed problem. Existing datasets
for training deep learning-based methods can be generated by hand-crafted or
synthetic schemes. However, the former often suffers from limited scale, while
the latter forces models to learn scene depth instead of haze distribution,
decreasing their dehazing ability. To overcome the problem, we propose a simple
yet novel synthetic method to decouple the relationship between haze density
and scene depth, by which a depth-agnostic dataset (DA-HAZE) is generated.
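For context, conventional synthetic dehazing data follows the atmospheric scattering model I(x) = J(x)t(x) + A(1 - t(x)), where the transmission t(x) = exp(-beta * d(x)) is tied to scene depth d(x); this coupling is exactly why trained models can learn depth as a shortcut. The sketch below contrasts that scheme with a depth-agnostic variant whose transmission is sampled independently of depth; the per-image uniform sampling here is an illustrative assumption, not the paper's exact DA-HAZE procedure.

```python
import numpy as np

def haze_depth_coupled(clear, depth, beta=1.0, airlight=0.9):
    """Conventional synthesis: transmission decays with scene depth,
    so haze density leaks depth information into the training data."""
    t = np.exp(-beta * depth)[..., None]     # t(x) = exp(-beta * d(x))
    return clear * t + airlight * (1.0 - t)  # I = J*t + A*(1 - t)

def haze_depth_agnostic(clear, rng, airlight=0.9):
    """Illustrative depth-agnostic variant: transmission is drawn
    independently of depth (here, one random value per image), so haze
    density carries no depth cue. The actual DA-HAZE scheme may differ."""
    t = rng.uniform(0.2, 0.9)                # depth-independent transmission
    return clear * t + airlight * (1.0 - t)

# Usage: clear is an HxWx3 image in [0, 1]; depth is an HxW map.
rng = np.random.default_rng(0)
clear = rng.random((480, 640, 3))
depth = rng.random((480, 640)) * 10.0
hazy_coupled = haze_depth_coupled(clear, depth)
hazy_agnostic = haze_depth_agnostic(clear, rng)
```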
Meanwhile, a Global Shuffle Strategy (GSS) is proposed for generating
differently scaled datasets, thereby enhancing the generalization ability of
the model. Extensive experiments indicate that models trained on DA-HAZE
achieve significant improvements on real-world benchmarks, with less
discrepancy between SOTS and DA-SOTS (the test set of DA-HAZE). Additionally,
depth-agnostic dehazing is a more complicated task because of the lack of a depth
prior. Therefore, an efficient architecture with stronger feature modeling
ability and lower computational cost is necessary. We revisit U-Net-based
architectures for dehazing, in which specially designed blocks are
incorporated. However, the performance of these blocks is constrained by limited
feature fusion methods. To this end, we propose a Convolutional Skip Connection
(CSC) module, allowing vanilla feature fusion methods to achieve promising
results with minimal costs. Extensive experimental results demonstrate that
current state-of-the-art methods equipped with CSC can achieve better
performance at reasonable computational expense, whether or not the haze
distribution is relevant to the scene depth.
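The abstract does not detail the CSC design; one plausible minimal reading, sketched below in PyTorch, is a U-Net skip connection that refines the encoder feature with a cheap convolution before a vanilla fusion such as addition. The class name, the depthwise 3x3 choice, and additive fusion are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ConvSkipConnection(nn.Module):
    """Hypothetical Convolutional Skip Connection (CSC): instead of passing
    the encoder feature to the decoder unchanged, refine it with a depthwise
    3x3 convolution, then fuse with plain addition. The depthwise convolution
    keeps the extra cost minimal."""

    def __init__(self, channels):
        super().__init__()
        self.refine = nn.Conv2d(channels, channels, kernel_size=3,
                                padding=1, groups=channels)  # depthwise

    def forward(self, encoder_feat, decoder_feat):
        return decoder_feat + self.refine(encoder_feat)  # vanilla additive fusion

# Usage inside one U-Net decoder stage:
csc = ConvSkipConnection(channels=64)
enc = torch.randn(1, 64, 128, 128)  # feature from the matching encoder level
dec = torch.randn(1, 64, 128, 128)  # upsampled decoder feature
fused = csc(enc, dec)               # same shape, negligible extra FLOPs
```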
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the demands of real-time visual inference in IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z) - Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation [33.140210057065644]
This paper introduces a novel approach named Stealing Stable Diffusion (SSD) prior for robust monocular depth estimation.
The approach addresses the limitations of existing methods by utilizing Stable Diffusion to generate synthetic images that mimic challenging conditions.
The effectiveness of the approach is evaluated on nuScenes and Oxford RobotCar, two challenging public datasets.
arXiv Detail & Related papers (2024-03-08T05:06:31Z) - Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS).
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z) - Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Towards Efficient Deep Hashing Retrieval: Condensing Your Data via Feature-Embedding Matching [7.908244841289913]
The expense of training state-of-the-art deep hashing retrieval models has been increasing.
State-of-the-art dataset distillation methods cannot be extended to all deep hashing retrieval methods.
We propose an efficient condensation framework that addresses these limitations by matching the feature-embedding between synthetic set and real set.
arXiv Detail & Related papers (2023-05-29T13:23:55Z) - DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z) - Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos [48.57686258913474]
Video action recognition has been partially addressed by CNNs that stack fixed-size 3D kernels.
We propose to learn the optimal-scale kernels from the data.
An action perceptron synthesizer is proposed to generate the kernels from a bag of fixed-size kernels.
arXiv Detail & Related papers (2020-07-22T14:22:29Z) - DeeSCo: Deep heterogeneous ensemble with Stochastic Combinatory loss for gaze estimation [7.09232719022402]
We introduce a deep, end-to-end trainable ensemble of heatmap-based weak predictors for 2D/3D gaze estimation.
We show that our ensemble outperforms state-of-the-art approaches for 2D/3D gaze estimation on multiple datasets.
arXiv Detail & Related papers (2020-04-15T14:06:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.