Towards an Effective and Efficient Transformer for Rain-by-snow Weather
Removal
- URL: http://arxiv.org/abs/2304.02860v2
- Date: Fri, 27 Oct 2023 09:45:18 GMT
- Title: Towards an Effective and Efficient Transformer for Rain-by-snow Weather
Removal
- Authors: Tao Gao, Yuanbo Wen, Kaihao Zhang, Peng Cheng, and Ting Chen
- Abstract summary: Rain-by-snow weather removal is a specialized task in weather-degraded image restoration aiming to eliminate coexisting rain streaks and snow particles.
We propose RSFormer, an efficient and effective Transformer that addresses this challenge.
RSFormer achieves the best trade-off between performance and time consumption compared to other restoration methods.
- Score: 23.224536745724077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rain-by-snow weather removal is a specialized task in weather-degraded image
restoration aiming to eliminate coexisting rain streaks and snow particles. In
this paper, we propose RSFormer, an efficient and effective Transformer that
addresses this challenge. Initially, we explore the proximity of convolution
networks (ConvNets) and vision Transformers (ViTs) in hierarchical
architectures and experimentally find that they perform comparably in intra-stage
feature learning. On this basis, we utilize a Transformer-like convolution
block (TCB) that replaces the computationally expensive self-attention while
preserving attention characteristics for adapting to input content. We also
demonstrate that cross-stage progression is critical for performance
improvement, and propose a global-local self-attention sampling mechanism
(GLASM) that down-/up-samples features while capturing both global and local
dependencies. Finally, we synthesize two novel rain-by-snow datasets,
RSCityScape and RS100K, to evaluate our proposed RSFormer. Extensive
experiments verify that RSFormer achieves the best trade-off between
performance and time consumption compared to other restoration methods. For
instance, it outperforms Restormer with a 1.53% reduction in the number of
parameters and a 15.6% reduction in inference time. Datasets, source code and
pre-trained models are available at https://github.com/chdwyb/RSFormer.
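The abstract's central idea, a convolution block that keeps attention-like adaptivity to input content while avoiding self-attention's cost, can be illustrated with a deliberately simplified sketch. Everything below (`tcb_sketch`, the depthwise kernel, the tiny gating MLP) is a hypothetical illustration of the general concept, not the authors' actual TCB:

```python
import numpy as np

def tcb_sketch(x, dw_kernel, w1, w2):
    """Simplified Transformer-like convolution block (hypothetical sketch).

    x         : (C, H, W) feature map
    dw_kernel : (C, 3, 3) depthwise convolution kernel
    w1, w2    : weights of a small gating MLP (C -> C) whose output
                modulates the features, mimicking attention's
                content-adaptive behavior.
    """
    C, H, W = x.shape
    # depthwise 3x3 convolution with zero padding
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(pad[c, i:i + 3, j:j + 3] * dw_kernel[c])
    # content-adaptive channel gate: squeeze -> MLP -> sigmoid
    squeeze = x.mean(axis=(1, 2))                                   # (C,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ squeeze, 0.0))))
    return out * gate[:, None, None]
```

The gate is a function of the input itself, which is the property the abstract refers to as "adapting to input content"; the convolution supplies the spatial mixing that self-attention would otherwise perform.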
Related papers
- Adaptive Random Fourier Features Training Stabilized By Resampling With Applications in Image Regression [0.8947831206263182]
We present an enhanced adaptive random Fourier features (ARFF) training algorithm for shallow neural networks.
This method uses a particle filter type resampling technique to stabilize the training process and reduce sensitivity to parameter choices.
arXiv Detail & Related papers (2024-10-08T22:08:03Z)
- Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching [56.286064975443026]
We make an interesting and somehow surprising observation: the computation of a large proportion of layers in the diffusion transformer, through a caching mechanism, can be readily removed even without updating the model parameters.
We introduce a novel scheme, named Learning-to-Cache (L2C), that learns to conduct caching in a dynamic manner for diffusion transformers.
Experimental results show that L2C largely outperforms samplers such as DDIM and DPM-Solver, alongside prior cache-based methods at the same inference speed.
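The caching idea behind L2C can be sketched generically: compute a layer's output once, then reuse it at later denoising steps instead of recomputing. The function below (`run_with_cache`) and the fixed `cacheable` set are illustrative assumptions; the paper learns which layers to cache, which this sketch does not attempt:

```python
def run_with_cache(layers, x_steps, cacheable):
    """Hypothetical sketch of layer caching across diffusion timesteps.

    layers    : list of callables, each mapping a value to a value
    x_steps   : list of inputs, one per denoising step
    cacheable : set of layer indices whose output is computed at the
                first step and reused at every later step.
    """
    cache = {}
    outputs = []
    for step, x in enumerate(x_steps):
        h = x
        for i, layer in enumerate(layers):
            if i in cacheable and step > 0:
                h = cache[i]            # skip computation, reuse cached output
            else:
                h = layer(h)
                if i in cacheable:
                    cache[i] = h
        outputs.append(h)
    return outputs
```

In a real diffusion transformer the cached quantity would be a tensor and the savings come from skipping the attention/MLP blocks at most timesteps.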
arXiv Detail & Related papers (2024-06-03T18:49:57Z)
- Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining [47.15857899099733]
We develop an end-to-end multi-scale Transformer to facilitate high-quality image reconstruction.
We incorporate intra-scale implicit neural representations based on pixel coordinates with the degraded inputs in a closed-loop design.
Our approach, named NeRD-Rain, performs favorably against state-of-the-art methods on both synthetic and real-world benchmark datasets.
arXiv Detail & Related papers (2024-04-02T01:18:16Z)
- Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration [46.96362010335177]
In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration.
Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images.
In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the BIM.
arXiv Detail & Related papers (2024-03-30T08:05:00Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
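Random-hyperplane LSH, the generic primitive behind HASTE, maps nearby vectors to the same bucket with high probability, which is what allows redundant feature channels to be grouped and collapsed. The helper below (`lsh_bucket`) is a textbook sketch of that primitive, not the HASTE module itself:

```python
import numpy as np

def lsh_bucket(v, planes):
    """Hash a vector to a bucket id via random-hyperplane LSH.

    v      : (d,) feature vector
    planes : (b, d) matrix of hyperplane normals; each row contributes
             one bit, the sign of the projection onto that normal.
    """
    bits = (planes @ v) > 0                                  # (b,) booleans
    return int(np.dot(bits, 1 << np.arange(len(bits))))      # pack bits to int
```

Vectors on the same side of every hyperplane land in the same bucket, so similar feature slices collide and can be merged without recomputing similarities exactly.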
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
- Dynamic PlenOctree for Adaptive Sampling Refinement in Explicit NeRF [6.135925201075925]
We propose the dynamic PlenOctree DOT, which adaptively refines the sample distribution to adjust to changing scene complexity.
Compared with POT, our DOT enhances visual quality, reduces parameters by over 55.15%/68.84%, and provides 1.7x/1.9x FPS on NeRF-synthetic and Tanks & Temples, respectively.
arXiv Detail & Related papers (2023-07-28T06:21:42Z)
- Magic ELF: Image Deraining Meets Association Learning and Transformer [63.761812092934576]
This paper aims to unify CNN and Transformer to take advantage of their learning merits for image deraining.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average.
arXiv Detail & Related papers (2022-07-21T12:50:54Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
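The identity that structural re-parameterization exploits is the linearity of convolution: parallel branches whose outputs are summed collapse into a single kernel at inference time. A minimal 1x1-convolution sketch of that identity (the function names are illustrative, not OREPA's API):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def fuse_parallel_branches(w1, w2):
    """Two parallel 1x1 branches whose outputs are summed collapse into
    one kernel, because convolution is linear in its weights. This is the
    core identity behind re-parameterization, not OREPA's full block."""
    return w1 + w2
```

OREPA applies the same principle to richer training-time blocks (scaling layers, multi-branch structures), squeezing them into one convolution so the deployed network pays no extra cost.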
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
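The SEER idea, freezing the lower encoder layers and storing their output embeddings so observations are never re-encoded, can be sketched as a memoizing wrapper. The class below is a hypothetical illustration of that mechanism, not the paper's implementation:

```python
class FrozenEncoderCache:
    """Sketch of the SEER mechanism: once the encoder is frozen, cache
    its embeddings keyed by observation id so repeated replay-buffer
    samples skip the (expensive) encoder forward pass."""

    def __init__(self, encoder):
        self.encoder = encoder   # frozen feature extractor, no longer trained
        self.cache = {}          # obs_id -> embedding

    def embed(self, obs_id, obs):
        if obs_id not in self.cache:
            self.cache[obs_id] = self.encoder(obs)   # pay the cost once
        return self.cache[obs_id]
```

Storing the (smaller) embeddings instead of raw observations is also where the memory savings come from in the paper's setting.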
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- Self Sparse Generative Adversarial Networks [73.590634413751]
Generative Adversarial Networks (GANs) are unsupervised generative models that learn a data distribution through adversarial training.
We propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem.
arXiv Detail & Related papers (2021-01-26T04:49:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.