Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method
based on Fast Fourier Convolution and ConvNeXt
- URL: http://arxiv.org/abs/2305.04430v1
- Date: Mon, 8 May 2023 02:59:02 GMT
- Title: Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method
based on Fast Fourier Convolution and ConvNeXt
- Authors: Han Zhou, Wei Dong, Yangyi Liu and Jun Chen
- Abstract summary: Haze usually leads to deteriorated images with low contrast, color shift and structural distortion.
We propose a novel two-branch network that leverages 2D discrete wavelet transform (DWT), fast Fourier convolution (FFC) residual block and a pretrained ConvNeXt model.
Our model is able to effectively explore global contextual information and produce images with better perceptual quality.
- Score: 14.917290578644424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Haze usually leads to deteriorated images with low contrast, color shift and
structural distortion. We observe that many deep learning based models exhibit
exceptional performance on removing homogeneous haze, but they usually fail to
address the challenge of non-homogeneous dehazing. Two main factors account for
this situation. Firstly, due to the intricate and non-uniform distribution of
dense haze, the recovery of structural and chromatic features with high
fidelity is challenging, particularly in regions with heavy haze. Secondly, the
existing small-scale datasets for non-homogeneous dehazing are inadequate to
support reliable learning of feature mappings between hazy images and their
corresponding haze-free counterparts by convolutional neural network
(CNN)-based models. To tackle these two challenges, we propose a novel
two-branch network that leverages 2D discrete wavelet transform (DWT), fast
Fourier convolution (FFC) residual block and a pretrained ConvNeXt model.
Specifically, in the DWT-FFC frequency branch, our model exploits DWT to
capture more high-frequency features. Moreover, by taking advantage of the
large receptive field provided by FFC residual blocks, our model is able to
effectively explore global contextual information and produce images with
better perceptual quality. In the prior knowledge branch, an ImageNet
pretrained ConvNeXt as opposed to Res2Net is adopted. This enables our model to
learn more supplementary information and acquire a stronger generalization
ability. The feasibility and effectiveness of the proposed method are
demonstrated via extensive experiments and ablation studies. The code is
available at https://github.com/zhouh115/DWT-FFC.
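As a rough illustration of why a DWT helps expose high-frequency content, the following is a minimal sketch of a single-level 2D Haar transform in plain Python. It is not taken from the authors' code: the abstract does not name the wavelet, so the Haar basis, the function name `haar_dwt2`, and the LH/HL labelling (which varies between libraries) are all assumptions made for this example.

```python
def haar_dwt2(image):
    """Single-level 2D Haar DWT (illustrative sketch, not the paper's code).

    `image` is a 2D list of numbers with even height and width.
    Returns four half-resolution subbands (LL, LH, HL, HH).
    The Haar basis and the subband naming are assumptions of this sketch.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)  # local average: low frequency
            lh_row.append((a - b + d - e) / 2)  # change across columns
            hl_row.append((a + b - d - e) / 2)  # change across rows
            hh_row.append((a - b - d + e) / 2)  # diagonal change
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


# A flat patch has no detail; a patch with a vertical edge does.
flat = [[8, 8, 8, 8] for _ in range(4)]
edge = [[0, 8, 8, 8] for _ in range(4)]
flat_bands = haar_dwt2(flat)
edge_bands = haar_dwt2(edge)
```

In this sketch the flat patch yields all-zero detail subbands, while the vertical edge appears only in the subband that measures change across columns. Edges and textures, which haze tends to suppress, are separated from the smooth low-frequency content in the same way.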
Related papers
- LinFusion: 1 GPU, 1 Minute, 16K Image [71.44735417472043]
We introduce a low-rank approximation of a wide spectrum of popular linear token mixers.
We find that the distilled model, termed LinFusion, achieves performance on par with or superior to the original SD.
Experiments on SD-v1.5, SD-v2.1, and SD-XL demonstrate that LinFusion enables satisfactory and efficient zero-shot cross-resolution generation.
arXiv Detail & Related papers (2024-09-03T17:54:39Z)
- CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models [52.29804282879437]
CFG++ is a novel approach that tackles the off-manifold challenges inherent to traditional CFG.
It offers better inversion-to-image generation, invertibility, smaller guidance scales, reduced mode collapse, etc.
It can be easily integrated into high-order diffusion solvers and naturally extends to distilled diffusion models.
arXiv Detail & Related papers (2024-06-12T10:40:10Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Frequency Compensated Diffusion Model for Real-scene Dehazing [6.105813272271171]
We consider a dehazing framework based on conditional diffusion models for improved generalization to real haze.
The proposed dehazing diffusion model significantly outperforms state-of-the-art methods on real-world images.
arXiv Detail & Related papers (2023-08-21T06:50:44Z)
- DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability [75.9781362556431]
We propose DiffDis to unify the cross-modal generative and discriminative pretraining into one single framework under the diffusion process.
We show that DiffDis outperforms single-task models on both the image generation and the image-text discriminative tasks.
arXiv Detail & Related papers (2023-08-18T05:03:48Z)
- Learning A Coarse-to-Fine Diffusion Transformer for Image Restoration [39.071637725773314]
We propose a coarse-to-fine diffusion Transformer (C2F-DFT) for image restoration.
C2F-DFT contains diffusion self-attention (DFSA) and diffusion feed-forward network (DFN).
In the coarse training stage, our C2F-DFT estimates noises and then generates the final clean image by a sampling algorithm.
arXiv Detail & Related papers (2023-08-17T01:59:59Z)
- SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
- Self-Regression Learning for Blind Hyperspectral Image Fusion Without Label [11.291055330647977]
We propose a self-regression learning method that reconstructs the hyperspectral image (HSI) and estimates the observation model.
In particular, we adopt an invertible neural network (INN) for restoring the HSI, and two fully-connected networks (FCN) for estimating the observation model.
Our model can outperform the state-of-the-art methods in experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2021-03-31T04:48:21Z)
- A GAN-Based Input-Size Flexibility Model for Single Image Dehazing [16.83211957781034]
This paper concentrates on the challenging task of single image dehazing.
We design a novel model to directly generate the haze-free image.
Considering this reason and various image sizes, we propose a novel input-size flexibility conditional generative adversarial network (cGAN) for single image dehazing.
arXiv Detail & Related papers (2021-02-19T08:27:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.