WaveletFormerNet: A Transformer-based Wavelet Network for Real-world
Non-homogeneous and Dense Fog Removal
- URL: http://arxiv.org/abs/2401.04550v1
- Date: Tue, 9 Jan 2024 13:42:21 GMT
- Title: WaveletFormerNet: A Transformer-based Wavelet Network for Real-world
Non-homogeneous and Dense Fog Removal
- Authors: Shengli Zhang, Zhiyong Tao, and Sen Lin
- Abstract summary: This paper proposes a Transformer-based wavelet network (WaveletFormerNet) for real-world foggy image recovery.
We introduce parallel convolution in the Transformer block, which allows for the capture of multi-frequency information in a lightweight mechanism.
Our experiments demonstrate that our WaveletFormerNet performs better than state-of-the-art methods.
- Score: 11.757602977709517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep convolutional neural networks have achieved remarkable success
in removing synthetic fog, it is essential to be able to process images taken
in complex foggy conditions, such as dense or non-homogeneous fog, in the real
world. However, the haze distribution in the real world is complex, and
downsampling can lead to color distortion or loss of detail in the output as
the resolution of a feature map or image decreases. In addition to the
challenge of obtaining sufficient training data, deep learning techniques for
foggy image processing are also prone to overfitting, which limits the
generalization ability of the model and hinders its practical application in
real-world scenarios. Considering these issues, this
paper proposes a Transformer-based wavelet network (WaveletFormerNet) for
real-world foggy image recovery. We embed the discrete wavelet transform into
the Vision Transformer by proposing the WaveletFormer and IWaveletFormer
blocks, aiming to alleviate texture detail loss and color distortion in the
image due to downsampling. We introduce parallel convolution in the Transformer
block, which allows for the capture of multi-frequency information in a
lightweight mechanism. Additionally, we have implemented a feature aggregation
module (FAM) to maintain image resolution and enhance the feature extraction
capacity of our model, further contributing to its impressive performance in
real-world foggy image recovery tasks. Extensive experiments demonstrate that
our WaveletFormerNet outperforms state-of-the-art methods in both quantitative
and qualitative evaluations while keeping model complexity low.
Additionally, our satisfactory results on real-world dust removal and
application tests showcase the superior generalization ability and improved
performance of WaveletFormerNet in computer vision-related applications.
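The abstract describes the architecture only at a high level. The following is a minimal, illustrative PyTorch sketch of a WaveletFormer-style block: a lossless Haar DWT replaces strided downsampling, self-attention runs on the low-frequency band, and a depthwise convolution stands in for the paper's lightweight parallel-convolution branch. The class and function names, the Haar wavelet choice, and the exact wiring are assumptions made for illustration, not the authors' released code; the feature aggregation module (FAM) is not modeled here.

```python
import torch
import torch.nn as nn


def haar_dwt(x):
    """Orthonormal single-level 2D Haar DWT of (B, C, H, W); H and W must be even.
    Returns the LL, LH, HL, HH sub-bands, each at half resolution."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (-a - b + c + d) / 2
    hl = (-a + b - c + d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh


def haar_idwt(ll, lh, hl, hh):
    """Exact inverse of haar_dwt, so no information is lost by the downsampling."""
    B, C, H, W = ll.shape
    out = ll.new_zeros(B, C, 2 * H, 2 * W)
    out[:, :, 0::2, 0::2] = (ll - lh - hl + hh) / 2
    out[:, :, 0::2, 1::2] = (ll - lh + hl - hh) / 2
    out[:, :, 1::2, 0::2] = (ll + lh - hl - hh) / 2
    out[:, :, 1::2, 1::2] = (ll + lh + hl + hh) / 2
    return out


class WaveletFormerBlock(nn.Module):
    """Illustrative block (not the paper's exact design): DWT downsampling,
    self-attention on the LL band, and a parallel depthwise-convolution branch;
    all four sub-bands are kept so detail can later be restored by haar_idwt."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.conv = nn.Sequential(                      # parallel convolution branch
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.GELU(),
            nn.Conv2d(channels, channels, 1),
        )
        self.fuse = nn.Conv2d(4 * channels, 4 * channels, 1)  # mixes the four sub-bands

    def forward(self, x):
        ll, lh, hl, hh = haar_dwt(x)                    # halve resolution without loss
        B, C, H, W = ll.shape
        tokens = self.norm(ll.flatten(2).transpose(1, 2))
        attn_out, _ = self.attn(tokens, tokens, tokens) # global context on the LL band
        global_feat = attn_out.transpose(1, 2).reshape(B, C, H, W)
        local_feat = self.conv(ll)                      # local, lightweight cues
        ll = ll + global_feat + local_feat              # residual fusion
        return self.fuse(torch.cat([ll, lh, hl, hh], dim=1))


if __name__ == "__main__":
    block = WaveletFormerBlock(channels=16)
    y = block(torch.randn(1, 16, 64, 64))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Because the Haar transform is exactly invertible, an IWaveletFormer-style counterpart can restore full resolution with haar_idwt rather than interpolation, which is the property the abstract credits for reduced texture loss and color distortion.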
Related papers
- Efficient Face Super-Resolution via Wavelet-based Feature Enhancement Network [27.902725520665133]
Face super-resolution aims to reconstruct a high-resolution face image from a low-resolution face image.
Previous methods typically employ an encoder-decoder structure to extract facial structural features.
We propose a wavelet-based feature enhancement network, which mitigates feature distortion by losslessly decomposing the input feature into high- and low-frequency components.
arXiv Detail & Related papers (2024-07-29T08:03:33Z)
- Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution [6.367865391518726]
Transformer-based models have achieved remarkable results in low-level vision tasks, including image super-resolution (SR).
To activate more input pixels globally, hybrid attention models have been proposed.
We employ wavelet losses to train Transformer models to improve quantitative and subjective performance (a minimal sketch of such a loss appears after this list).
arXiv Detail & Related papers (2024-04-17T11:25:19Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Physics-Driven Turbulence Image Restoration with Stochastic Refinement [80.79900297089176]
Image distortion by atmospheric turbulence is a critical problem in long-range optical imaging systems.
Fast and physics-grounded simulation tools have been introduced to help the deep-learning models adapt to real-world turbulence conditions.
This paper proposes the Physics-integrated Restoration Network (PiRN) to help the network disentangle the stochasticity from the degradation and the underlying image.
arXiv Detail & Related papers (2023-07-20T05:49:21Z)
- Efficient Textured Mesh Recovery from Multiple Views with Differentiable Rendering [8.264851594332677]
We propose an efficient coarse-to-fine approach to recover the textured mesh from multi-view images.
We optimize the shape geometry by minimizing the difference between the rendered mesh with the depth predicted by the learning-based multi-view stereo algorithm.
In contrast to the implicit neural representation on shape and color, we introduce a physically based inverse rendering scheme to jointly estimate the lighting and reflectance of the objects.
arXiv Detail & Related papers (2022-05-25T03:33:55Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring [23.86692375792203]
Image deblurring is a computer vision problem that aims to recover a sharp image from a blurred image.
Our model uses dilated convolution to obtain a large receptive field while maintaining high spatial resolution.
We propose a novel module using the wavelet transform, which effectively helps the network to recover clear high-frequency texture details.
arXiv Detail & Related papers (2021-10-12T07:58:10Z)
- Wavelet Channel Attention Module with a Fusion Network for Single Image Deraining [46.62290347397139]
Single image deraining is a crucial problem because rain severely degrades the visibility of images.
We propose a new convolutional neural network (CNN), the wavelet channel attention module with a fusion network.
arXiv Detail & Related papers (2020-07-17T18:06:13Z)
- Invertible Image Rescaling [118.2653765756915]
We develop an Invertible Rescaling Net (IRN) to produce visually-pleasing low-resolution images.
We capture the distribution of the lost information using a latent variable following a specified distribution in the downscaling process.
arXiv Detail & Related papers (2020-05-12T09:55:53Z)
- Gated Fusion Network for Degraded Image Super Resolution [78.67168802945069]
We propose a dual-branch convolutional neural network to extract base features and recovered features separately.
By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process.
arXiv Detail & Related papers (2020-03-02T13:28:32Z)
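Several of the related papers above, most directly the wavelet-loss training entry referenced earlier, supervise models in the wavelet domain rather than only in pixel space. Below is a small, self-contained sketch of such a wavelet-domain L1 loss, assuming PyTorch; the single-level Haar decomposition via a grouped convolution and the hf_weight factor are illustrative choices, not the exact loss used in any of the cited papers.

```python
import torch
import torch.nn.functional as F


def haar_subbands(x):
    """Single-level 2D Haar decomposition of (B, C, H, W) via a grouped, stride-2
    convolution; returns (B, 4*C, H/2, W/2) with sub-bands ordered LL, LH, HL, HH
    for every input channel."""
    C = x.shape[1]
    kernels = torch.tensor(
        [[[0.5, 0.5], [0.5, 0.5]],      # LL (average)
         [[-0.5, -0.5], [0.5, 0.5]],    # LH (vertical detail)
         [[-0.5, 0.5], [-0.5, 0.5]],    # HL (horizontal detail)
         [[0.5, -0.5], [-0.5, 0.5]]],   # HH (diagonal detail)
        dtype=x.dtype, device=x.device)
    weight = kernels.repeat(C, 1, 1).unsqueeze(1)   # (4*C, 1, 2, 2)
    return F.conv2d(x, weight, stride=2, groups=C)


def wavelet_l1_loss(pred, target, hf_weight=1.0):
    """L1 on the LL band plus a separately weighted L1 on the detail bands."""
    p, t = haar_subbands(pred), haar_subbands(target)
    hf_mask = torch.ones(p.shape[1], dtype=torch.bool, device=p.device)
    hf_mask[0::4] = False                           # channels 0, 4, 8, ... are LL
    low = F.l1_loss(p[:, ~hf_mask], t[:, ~hf_mask])
    high = F.l1_loss(p[:, hf_mask], t[:, hf_mask])
    return low + hf_weight * high


if __name__ == "__main__":
    pred, target = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
    print(wavelet_l1_loss(pred, target, hf_weight=0.5).item())
```

Such a term would typically be added to a standard pixel-wise loss; weighting the high-frequency bands separately is one way to push a restoration network toward reproducing fine textures.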