SnowFormer: Scale-aware Transformer via Context Interaction for Single
Image Desnowing
- URL: http://arxiv.org/abs/2208.09703v2
- Date: Tue, 23 Aug 2022 06:16:19 GMT
- Authors: Sixiang Chen, Tian Ye, Yun Liu, Erkang Chen, Jun Shi, Jingchun Zhou
- Abstract summary: We propose a powerful architecture dubbed SnowFormer for single image desnowing.
It performs Scale-aware Feature Aggregation in the encoder to capture rich snow information of various degradations.
It also uses a novel Context Interaction Transformer Block in the decoder, which conducts context interaction of local details and global information.
- Score: 9.747362856056162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single image desnowing is a common yet challenging task. The complex snow
degradations and diverse degradation scales demand strong representation
ability. In order for the desnowing network to see various snow degradations
and model the context interaction of local details and global information, we
propose a powerful architecture dubbed as SnowFormer. First, it performs
Scale-aware Feature Aggregation in the encoder to capture rich snow information
of various degradations. Second, to tackle large-scale
degradation, it uses a novel Context Interaction Transformer Block in the
decoder, which conducts context interaction of local details and global
information from the previous scale-aware feature aggregation in global context
interaction, while the introduction of local context interaction improves
recovery of scene details. Third, we devise a Heterogeneous Feature Projection
Head which progressively fuses features from both the encoder and decoder and
projects the refined feature into the clean image. Extensive experiments
demonstrate that the proposed SnowFormer achieves significant improvements over
other SOTA methods. Compared with the SOTA single image desnowing method HDCW-Net,
it boosts the PSNR metric by 9.2 dB on the CSD test set. Moreover, it also
achieves a 5.13 dB increase in PSNR compared with the general image restoration
architecture NAFNet, which verifies the strong representation ability of our
SnowFormer for the snow removal task. The code is released at
https://github.com/Ephemeral182/SnowFormer.
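For readers unfamiliar with the metric, the PSNR gains quoted above can be put in context with a small sketch. It uses the standard PSNR definition, 10 * log10(MAX^2 / MSE); the function names, the flat pixel-list input, and the 8-bit peak value of 255 are illustrative assumptions, and the paper's own evaluation protocol (colour space, value range) may differ.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Standard PSNR in dB between two equal-length pixel sequences.

    max_value=255.0 assumes 8-bit images; this is an assumption, not
    something stated in the abstract.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    clean = [100, 120, 140, 160]
    noisy = [110, 115, 150, 150]
    print(f"PSNR of toy example: {psnr(clean, noisy):.2f} dB")
    # Gains quoted in the abstract, expressed as MSE reduction factors.
    for gain in (9.2, 5.13):
        print(f"{gain} dB gain -> MSE reduced by a factor of {mse_ratio_for_gain(gain):.2f}")
```

Because PSNR is logarithmic in the mean squared error, a 9.2 dB gain corresponds to roughly an eightfold reduction in MSE, and a 5.13 dB gain to roughly a threefold reduction, assuming both methods are scored against the same peak value.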
Related papers
- Semi-Supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization [21.22179604024444]
We present a new paradigm for video desnowing in a semi-supervised spirit to involve unlabeled real data for the generalizable snow removal.
Specifically, we construct a real-world dataset with 85 snowy videos, and then present a Semi-supervised Video Desnowing Network (SemiVDN) equipped with a novel Distribution-driven Contrastive Regularization.
The elaborated contrastive regularizations mitigate the distribution gap between the synthetic and real data, and consequently maintain the desired snow-invariant background details.
arXiv Detail & Related papers (2024-10-10T13:31:42Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- A Mountain-Shaped Single-Stage Network for Accurate Image Restoration [9.431709365739462]
In image restoration, it is typically necessary to maintain a complex balance between spatial details and contextual information.
We propose a single-stage design based on a simple U-Net architecture, which removes or replaces unnecessary nonlinear activation functions.
Our approach, named M3SNet, outperforms previous state-of-the-art models while using less than half the computational costs.
arXiv Detail & Related papers (2023-05-09T03:18:35Z)
- Star-Net: Improving Single Image Desnowing Model With More Efficient Connection and Diverse Feature Interaction [0.8602553195689513]
We propose a novel single image desnowing network called Star-Net.
First, we design a Star type Skip Connection (SSC) to establish information channels for all different scale features.
Second, we present a Multi-Stage Interactive Transformer (MIT) as the base module of Star-Net.
Third, we propose a Degenerate Filter Module (DFM) to filter the snow particle and snow fog residual in the SSC.
arXiv Detail & Related papers (2023-03-17T14:03:49Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel style named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing [6.22867695581195]
We apply the vision transformer to the task of snow removal from a single image.
We propose a parallel network architecture split along the channel, performing local feature refinement and global information modeling separately.
In the experimental part, we conduct extensive experiments to demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-07-12T15:44:07Z)
- Towards Real-time High-Definition Image Snow Removal: Efficient Pyramid Network with Asymmetrical Encoder-decoder Architecture [6.682410871522934]
We develop a novel Efficient Pyramid Network with asymmetrical encoder-decoder architecture for real-time HD image desnowing.
Our approach achieves a better complexity-performance trade-off and effectively handles the processing difficulties of HD and Ultra-HD images.
arXiv Detail & Related papers (2022-07-12T15:18:41Z)
- Deep Dense Multi-scale Network for Snow Removal Using Semantic and Geometric Priors [78.61844008368587]
We propose a Deep Dense Multi-Scale Network (DDMSNet) for snow removal by exploiting semantic and geometric priors.
We incorporate the semantic and geometric maps as input and learn the semantic-aware and geometry-aware representation to remove snow.
arXiv Detail & Related papers (2021-03-21T03:30:30Z)
- Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals.
Our main proposal is a multi-stage architecture, that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named as MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z)