Magic ELF: Image Deraining Meets Association Learning and Transformer
- URL: http://arxiv.org/abs/2207.10455v1
- Date: Thu, 21 Jul 2022 12:50:54 GMT
- Title: Magic ELF: Image Deraining Meets Association Learning and Transformer
- Authors: Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui,
Chia-Wen Lin
- Abstract summary: This paper aims to unify CNN and Transformer to take advantage of their learning merits for image deraining.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed ELF) outperforms the state-of-the-art approach MPRNet by 0.25 dB on average.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) and Transformers have achieved
great success in multimedia applications. However, little effort has been made
to harmonize these two architectures effectively and efficiently for image
deraining. This paper aims to unify them so that the learning merits of both
can be exploited. In particular, the local
connectivity and translation equivariance of CNN and the global aggregation
ability of self-attention (SA) in Transformer are fully exploited for specific
local context and global structure representations. Based on the observation
that rain distribution reveals the degradation location and degree, we
introduce a degradation prior to help background recovery and accordingly present
the association refinement deraining scheme. A novel multi-input attention
module (MAM) is proposed to associate rain perturbation removal and background
recovery. Moreover, we equip our model with depth-wise separable convolutions
to learn specific feature representations at a reduced computational cost.
Extensive experiments show that our proposed method (dubbed ELF) outperforms
the state-of-the-art MPRNet by 0.25 dB on average while requiring only 11.7%
of its computational cost and 42.1% of its parameters. The source code is
available at
https://github.com/kuijiang94/Magic-ELF.
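To make the two ingredients above concrete, here is a minimal PyTorch sketch of how a MAM-like cross-attention block built on depth-wise separable convolutions could look. It is an illustration assembled from the abstract's description, not the authors' implementation (which lives in the linked repository); the class names, layer sizes, and the residual fusion rule are all assumptions.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """A 3x3 depth-wise convolution followed by a 1x1 point-wise convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))


class MultiInputAttention(nn.Module):
    """Hypothetical MAM-style block: rain features attend over image features."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.q = DepthwiseSeparableConv(channels)  # queries from rain features
        self.k = DepthwiseSeparableConv(channels)  # keys from degraded-image features
        self.v = DepthwiseSeparableConv(channels)  # values from degraded-image features
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, rain_feat: torch.Tensor, img_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = img_feat.shape
        # Flatten spatial dimensions into token sequences of shape (B, H*W, C).
        q = self.q(rain_feat).flatten(2).transpose(1, 2)
        k = self.k(img_feat).flatten(2).transpose(1, 2)
        v = self.v(img_feat).flatten(2).transpose(1, 2)
        out, _ = self.attn(q, k, v)
        # Residual connection: feed the rain-conditioned context back into the
        # image features to guide background recovery.
        return img_feat + out.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    mam = MultiInputAttention(channels=32)
    rain = torch.randn(1, 32, 16, 16)
    image = torch.randn(1, 32, 16, 16)
    print(mam(rain, image).shape)  # torch.Size([1, 32, 16, 16])
```

The efficiency argument for the depth-wise separable projections is standard: a dense 3x3 convolution over C channels costs about 9C^2 multiply-adds per pixel, whereas the separable version costs about 9C + C^2, a large saving at typical channel widths.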
Related papers
- Double-Shot 3D Shape Measurement with a Dual-Branch Network [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities.
Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images.
We show that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset.
arXiv Detail & Related papers (2024-07-19T10:49:26Z)
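The summary describes PDCNet only at branch level, so the following is a speculative toy version of the dual-branch idea rather than the paper's architecture: a Transformer branch for global perception in the fringe image, a small CNN branch for local detail in the speckle image, and a 1x1-convolution fusion head. All layer sizes, the patch size, and the fusion rule are assumptions.

```python
import torch
import torch.nn as nn


class DualBranchNet(nn.Module):
    def __init__(self, dim: int = 64, patch: int = 8):
        super().__init__()
        # Transformer branch: patch embedding plus one encoder layer (global context).
        self.embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.up = nn.Upsample(scale_factor=patch, mode="bilinear", align_corners=False)
        # CNN branch: stacked 3x3 convolutions (local detail).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.fuse = nn.Conv2d(2 * dim, 1, 1)  # 1x1 convolution as the fusion head

    def forward(self, fringe: torch.Tensor, speckle: torch.Tensor) -> torch.Tensor:
        b = fringe.shape[0]
        tokens = self.embed(fringe)                    # (B, dim, H/p, W/p)
        hp, wp = tokens.shape[-2:]
        t = self.encoder(tokens.flatten(2).transpose(1, 2))
        t = self.up(t.transpose(1, 2).reshape(b, -1, hp, wp))
        c = self.cnn(speckle)
        return self.fuse(torch.cat([t, c], dim=1))


if __name__ == "__main__":
    net = DualBranchNet()
    fringe, speckle = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)
    print(net(fringe, speckle).shape)  # torch.Size([1, 1, 64, 64])
```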
arXiv Detail & Related papers (2024-07-19T10:49:26Z) - Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration [46.96362010335177]
In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration.
Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images.
In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the BIM.
arXiv Detail & Related papers (2024-03-30T08:05:00Z)
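The summary describes the WIM only as injecting high-frequency detail into feature maps, so here is a generic, hedged illustration of that idea rather than the paper's module: high frequencies are estimated as the input minus a low-pass (blurred) copy, projected to the feature width, and added as a reference signal. The class name and the pooling-based low-pass filter are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HighFreqInject(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(3, channels, 1)  # lift the RGB residual to feature width

    def forward(self, feat: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # Low-pass via average pooling; the residual keeps edges and fine texture.
        blurred = F.avg_pool2d(image, kernel_size=3, stride=1, padding=1)
        high_freq = image - blurred
        # Match the spatial size of the feature map before injection.
        high_freq = F.interpolate(high_freq, size=feat.shape[-2:],
                                  mode="bilinear", align_corners=False)
        return feat + self.proj(high_freq)


if __name__ == "__main__":
    inject = HighFreqInject(channels=64)
    feat = torch.randn(1, 64, 32, 32)
    image = torch.randn(1, 3, 128, 128)
    print(inject(feat, image).shape)  # torch.Size([1, 64, 32, 32])
```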
arXiv Detail & Related papers (2024-03-30T08:05:00Z) - FuseFormer: A Transformer for Visual and Thermal Image Fusion [3.6064695344878093]
We propose a novel methodology for the image fusion problem that mitigates the limitations associated with using classical evaluation metrics as loss functions.
Our approach integrates a transformer-based multi-scale fusion strategy that effectively captures both local and global context information.
Our proposed method, along with the novel loss function definition, demonstrates superior performance compared to other competitive fusion algorithms.
arXiv Detail & Related papers (2024-02-01T19:40:39Z) - Dynamic Association Learning of Self-Attention and Convolution in Image
Restoration [56.49098856632478]
CNNs and self-attention have achieved great success in multimedia applications.
This paper proposes an association learning method that exploits the advantages of both and suppresses their shortcomings, so as to achieve high-quality and efficient inpainting.
arXiv Detail & Related papers (2023-11-09T05:11:24Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
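The summary does not spell out how distance enters the attention computation, so the following is one plausible reading rather than the DWT paper's exact formulation: standard scaled dot-product logits are biased by the negative spatial distance between token positions, so nearby image components receive more weight. The function name and the scale factor alpha are assumptions.

```python
import torch


def distance_weighted_attention(q, k, v, coords, alpha: float = 0.1):
    """q, k, v: (N, d) token features; coords: (N, 2) pixel positions."""
    d = q.shape[-1]
    logits = q @ k.T / d ** 0.5          # standard scaled dot-product scores
    dist = torch.cdist(coords, coords)   # pairwise Euclidean distances
    # Penalize far-apart token pairs before normalizing.
    weights = torch.softmax(logits - alpha * dist, dim=-1)
    return weights @ v


if __name__ == "__main__":
    n, d = 16, 8
    q = k = v = torch.randn(n, d)
    ys, xs = torch.meshgrid(torch.arange(4.0), torch.arange(4.0), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1)
    print(distance_weighted_attention(q, k, v, coords).shape)  # torch.Size([16, 8])
```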
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Semi-Federated Learning: Convergence Analysis and Optimization of A
Hybrid Learning Framework [70.83511997272457]
We propose a semi-federated learning (SemiFL) paradigm that leverages both the base station (BS) and devices for a hybrid implementation of centralized learning (CL) and federated learning (FL).
We propose a two-stage algorithm to solve the resulting intractable optimization problem, providing closed-form solutions for the beamformers.
arXiv Detail & Related papers (2023-10-04T03:32:39Z) - Transformer-based Context Condensation for Boosting Feature Pyramids in
Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z) - CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that combines the detailed spatial information captured by CNNs with the global context provided by Transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)