LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight
Snow Removal
- URL: http://arxiv.org/abs/2210.04787v4
- Date: Thu, 6 Apr 2023 03:39:27 GMT
- Title: LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight
Snow Removal
- Authors: Junhong Lin, Nanfeng Jiang, Zhentao Zhang, Weiling Chen and Tiesong
Zhao
- Abstract summary: We propose a lightweight yet highly efficient snow removal network called Laplace Mask Query Transformer (LMQFormer).
First, we present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow. Rather than using the mask provided in the dataset, this prior reduces both the information entropy of snow and the computational cost of recovery.
We then develop a Duplicated Mask Query Attention (DMQA) that converts the coarse mask into a fixed number of queries, which constrains the attention areas of MQFormer with reduced parameters.
- Score: 22.047433543495867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Snow removal aims to locate snow areas and recover clean images without
repairing traces. Unlike the regularity and semitransparency of rain, snow with
various patterns and degradations seriously occludes the background. As a
result, state-of-the-art snow removal methods usually retain a large
parameter size. In this paper, we propose a lightweight yet highly efficient snow
removal network called Laplace Mask Query Transformer (LMQFormer). Firstly, we
present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow.
Instead of using the mask provided in the dataset, we aim to reduce both the
information entropy of snow and the computational cost of recovery. Secondly, we design a
Mask Query Transformer (MQFormer) to remove snow with the coarse mask, where we
use two parallel encoders and a hybrid decoder to learn extensive snow features
under lightweight requirements. Thirdly, we develop a Duplicated Mask Query
Attention (DMQA) that converts the coarse mask into a specific number of
queries, which constrains the attention areas of MQFormer with reduced
parameters. Experimental results on popular datasets have demonstrated the
efficiency of our proposed model, which achieves the state-of-the-art snow
removal quality with significantly reduced parameters and the lowest running
time.
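As a rough illustration of the DMQA idea described in the abstract, the sketch below duplicates a coarse mask into a fixed number of queries and restricts their attention to masked pixels. This is a minimal NumPy approximation under assumed shapes, not the paper's actual learned attention layer; the function name and parameterization are hypothetical.

```python
import numpy as np

def mask_to_queries(coarse_mask, features, num_queries=8):
    """Illustrative DMQA-style step: duplicate a coarse snow mask into a
    fixed number of queries and attend only within the masked region."""
    h, w, c = features.shape
    flat_feat = features.reshape(h * w, c)        # (HW, C)
    flat_mask = coarse_mask.reshape(h * w, 1)     # (HW, 1)
    # Mask-weighted global feature, duplicated into num_queries queries.
    masked_mean = (flat_feat * flat_mask).sum(0) / (flat_mask.sum() + 1e-6)
    queries = np.tile(masked_mean, (num_queries, 1))      # (Q, C)
    # Attention scores between queries and pixels, suppressed outside the mask.
    scores = queries @ flat_feat.T / np.sqrt(c)           # (Q, HW)
    scores = np.where(flat_mask.T > 0.5, scores, -1e9)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ flat_feat                               # (Q, C)
```

In the real model, each duplicated query would additionally carry learned per-query parameters so the queries specialize; here all queries are identical copies, which is enough to show how the mask bounds the attention area.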
Related papers
- MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models [91.4190318047519]
This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M") Sparsity in Large Language Models.
MaskLLM explicitly models N:M patterns as a learnable distribution through Gumbel Softmax sampling.
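For intuition on N:M sparsity, 2:4 sparsity keeps the two largest-magnitude weights in every group of four consecutive weights. The deterministic sketch below is only a stand-in for MaskLLM's learned Gumbel-Softmax mask sampling; the function name is hypothetical.

```python
import numpy as np

def nm_sparsity_mask(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m consecutive
    weights (magnitude-based stand-in for a learned N:M mask)."""
    w = weights.reshape(-1, m)                      # group consecutive weights
    keep = np.argsort(-np.abs(w), axis=1)[:, :n]    # top-n magnitudes per group
    mask = np.zeros_like(w)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)
```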
arXiv Detail & Related papers (2024-09-26T02:37:41Z)
- MaskInversion: Localized Embeddings via Optimization of Explainability Maps [49.50785637749757]
MaskInversion generates a context-aware embedding for a query image region specified by a mask at test time.
It can be used for a broad range of tasks, including open-vocabulary class retrieval, referring expression comprehension, as well as for localized captioning and image generation.
arXiv Detail & Related papers (2024-07-29T14:21:07Z)
- Toward a Deeper Understanding: RetNet Viewed through Convolution [25.8904146140577]
Vision Transformer (ViT) can learn global dependencies superior to CNN, yet CNN's inherent locality can substitute for expensive training resources.
This paper investigates the effectiveness of RetNet from a CNN perspective and presents a variant of RetNet tailored to the visual domain.
We propose a novel Gaussian mixture mask (GMM) in which each mask has only two learnable parameters, and it can be conveniently used in any ViT variant whose attention mechanism allows the use of masks.
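A minimal sketch of a two-parameter, distance-based attention mask in the spirit of the GMM described above: one amplitude and one width scalar generate a full bias over all pixel pairs. The exact parameterization (alpha, sigma) is an assumption for illustration, not the paper's formulation.

```python
import numpy as np

def gaussian_attention_mask(h, w, alpha=1.0, sigma=2.0):
    """Pairwise attention bias from just two scalars: alpha (amplitude)
    and sigma (spatial width). Returned as an (HW, HW) matrix that could
    be added to attention logits."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)      # (HW, 2)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return alpha * np.exp(-d2 / (2.0 * sigma ** 2))
```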
arXiv Detail & Related papers (2023-09-11T10:54:22Z)
- MP-Former: Mask-Piloted Transformer for Image Segmentation [16.620469868310288]
Mask2Former suffers from inconsistent mask predictions between decoder layers.
We propose a mask-piloted training approach, which feeds noised ground-truth masks in masked-attention and trains the model to reconstruct the original ones.
arXiv Detail & Related papers (2023-03-13T17:57:59Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing [6.22867695581195]
We apply the vision transformer to the task of snow removal from a single image.
We propose a parallel network architecture split along the channel, performing local feature refinement and global information modeling separately.
In the experimental part, we conduct extensive experiments to demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-07-12T15:44:07Z)
- Snow Mask Guided Adaptive Residual Network for Image Snow Removal [21.228758052455273]
Snow is an extremely common atmospheric phenomenon that will seriously affect the performance of high-level computer vision tasks.
We propose a Snow Mask Guided Adaptive Residual Network (SMGARN).
It consists of three parts: Mask-Net, Guidance-Fusion Network (GF-Net), and Reconstruct-Net.
Our SMGARN numerically outperforms all existing snow removal methods, and the reconstructed images are clearer in visual contrast.
arXiv Detail & Related papers (2022-07-11T10:30:46Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
- RePaint: Inpainting using Denoising Diffusion Probabilistic Models [161.74792336127345]
Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask.
We propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.
We validate our method for both faces and general-purpose image inpainting using standard and extreme masks.
arXiv Detail & Related papers (2022-01-24T18:40:15Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.