LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight
Snow Removal
- URL: http://arxiv.org/abs/2210.04787v4
- Date: Thu, 6 Apr 2023 03:39:27 GMT
- Title: LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight
Snow Removal
- Authors: Junhong Lin, Nanfeng Jiang, Zhentao Zhang, Weiling Chen and Tiesong
Zhao
- Abstract summary: We propose a lightweight yet highly efficient snow removal network called Laplace Mask Query Transformer (LMQFormer).
Firstly, we present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow. Instead of using the mask provided in the dataset, we aim to reduce both the information entropy of snow and the computational cost of recovery.
Thirdly, we develop a Duplicated Mask Query Attention (DMQA) that converts the coarse mask into a specific number of queries, which constrain the attention areas of MQFormer with reduced parameters.
- Score: 22.047433543495867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Snow removal aims to locate snow areas and recover clean images without
repairing traces. Unlike the regularity and semitransparency of rain, snow with
its varied patterns and degradations seriously occludes the background. As a
result, state-of-the-art snow removal methods usually retain a large
parameter size. In this paper, we propose a lightweight yet highly efficient snow
removal network called Laplace Mask Query Transformer (LMQFormer). Firstly, we
present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow.
Instead of using the mask provided in the dataset, we aim to reduce both the
information entropy of snow and the computational cost of recovery. Secondly, we
design a Mask Query Transformer (MQFormer) to remove snow with the coarse mask,
where we use two parallel encoders and a hybrid decoder to learn extensive snow
features under lightweight requirements. Thirdly, we develop a Duplicated Mask
Query Attention (DMQA) that converts the coarse mask into a specific number of
queries, which constrain the attention areas of MQFormer with reduced
parameters. Experimental results on popular datasets demonstrate the
efficiency of our proposed model, which achieves state-of-the-art snow
removal quality with significantly fewer parameters and the lowest running
time.
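The Laplace prior exploits the fact that snow particles are high-frequency structures relative to the background. As a minimal illustrative sketch (my own toy example, not the paper's Laplace-VQVAE), a coarse snow mask can be obtained by thresholding the normalized response of a discrete Laplace operator; the `thresh` value here is an assumed hyperparameter:

```python
import numpy as np

def coarse_snow_mask(img, thresh=0.1):
    """Threshold the absolute Laplacian response to get a coarse mask.

    `thresh` is an illustrative hyperparameter, not a value from the paper.
    """
    gray = img.mean(axis=-1) if img.ndim == 3 else img
    g = np.pad(gray.astype(np.float64), 1)  # zero-pad borders
    # 4-neighbour discrete Laplacian computed via shifted array views.
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    resp = np.abs(lap)
    resp /= resp.max() + 1e-8  # normalize response to [0, 1]
    return (resp > thresh).astype(np.float32)

# A lone bright pixel (a toy "snowflake") triggers the mask; flat regions do not.
img = np.zeros((16, 16))
img[8, 8] = 1.0
mask = coarse_snow_mask(img)
```

A mask like this carries far less information than the full image, which is the intuition behind using it to constrain where the transformer attends.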
Related papers
- Toward a Deeper Understanding: RetNet Viewed through Convolution [25.8904146140577]
A Vision Transformer (ViT) can learn global dependencies better than a CNN, yet a CNN's inherent locality can substitute for ViT's expensive training requirements.
This paper investigates the effectiveness of RetNet from a CNN perspective and presents a variant of RetNet tailored to the visual domain.
We propose a novel Gaussian mixture mask (GMM) in which one mask only has two learnable parameters and it can be conveniently used in any ViT variants whose attention mechanism allows the use of masks.
arXiv Detail & Related papers (2023-09-11T10:54:22Z) - Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained
Vision-Language Models [89.07925369856139]
We design a new type of tuning method, termed regularized mask tuning, which masks the network parameters through a learnable selection.
Inspired by neural pathways, we argue that the knowledge required by a downstream task already exists in the pre-trained weights but just gets concealed in the upstream pre-training stage.
It is noteworthy that we manage to deliver 18.73% performance improvement compared to the zero-shot CLIP via masking an average of only 2.56% parameters.
arXiv Detail & Related papers (2023-07-27T17:56:05Z) - MP-Former: Mask-Piloted Transformer for Image Segmentation [16.620469868310288]
Mask2Former suffers from inconsistent mask predictions between decoder layers.
We propose a mask-piloted training approach, which feeds noised ground-truth masks in masked-attention and trains the model to reconstruct the original ones.
arXiv Detail & Related papers (2023-03-13T17:57:59Z) - Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z) - MSP-Former: Multi-Scale Projection Transformer for Single Image
Desnowing [6.22867695581195]
We apply the vision transformer to the task of snow removal from a single image.
We propose a parallel network architecture split along the channel, performing local feature refinement and global information modeling separately.
In the experimental part, we conduct extensive experiments to demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-07-12T15:44:07Z) - Snow Mask Guided Adaptive Residual Network for Image Snow Removal [21.228758052455273]
Snow is an extremely common atmospheric phenomenon that seriously affects the performance of high-level computer vision tasks.
We propose a Snow Mask Guided Adaptive Residual Network (SMGARN)
It consists of three parts, Mask-Net, Guidance-Fusion Network (GF-Net), and Reconstruct-Net.
Our SMGARN numerically outperforms all existing snow removal methods, and its reconstructed images are visually clearer.
arXiv Detail & Related papers (2022-07-11T10:30:46Z) - Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z) - RePaint: Inpainting using Denoising Diffusion Probabilistic Models [161.74792336127345]
Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask.
We propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.
We validate our method for both faces and general-purpose image inpainting using standard and extreme masks.
arXiv Detail & Related papers (2022-01-24T18:40:15Z) - Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z) - DCT-Mask: Discrete Cosine Transform Mask Representation for Instance
Segmentation [50.70679435176346]
We propose a new mask representation by applying the discrete cosine transform (DCT) to encode the high-resolution binary grid mask into a compact vector.
Our method, termed DCT-Mask, could be easily integrated into most pixel-based instance segmentation methods.
arXiv Detail & Related papers (2020-11-19T15:00:21Z)
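As a rough illustration of the DCT-Mask idea (my own sketch, not the authors' code), a binary mask can be transformed with an orthonormal 2-D DCT and truncated to its low-frequency block; keeping all coefficients makes the round trip exactly invertible, while truncation yields a compact vector:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis (rows are basis vectors), so M @ M.T == I.
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    M = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    M[0] *= 1.0 / np.sqrt(n)
    M[1:] *= np.sqrt(2.0 / n)
    return M

def encode_mask(mask, keep):
    # 2-D DCT, keeping only the top-left `keep` x `keep` low-frequency block.
    # (DCT-Mask itself selects coefficients in zig-zag order; a square block
    # is a simplification for this sketch.)
    M = dct_matrix(mask.shape[0])
    coef = M @ mask @ M.T
    return coef[:keep, :keep]

def decode_mask(block, n):
    # Zero-fill the discarded coefficients, inverse-transform, re-binarize.
    coef = np.zeros((n, n))
    coef[:block.shape[0], :block.shape[0]] = block
    M = dct_matrix(n)
    rec = M.T @ coef @ M
    return (rec > 0.5).astype(np.float32)

# A 32x32 mask with a centered square region.
mask = np.zeros((32, 32), dtype=np.float32)
mask[8:24, 8:24] = 1.0
exact = decode_mask(encode_mask(mask, keep=32), 32)  # lossless round trip
vec = encode_mask(mask, keep=8)                      # 64-number compact code
```

The compact code trades a small reconstruction error at mask boundaries for a large reduction in representation size, which is what lets it slot into pixel-based instance segmentation heads.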
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.