Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
- URL: http://arxiv.org/abs/2409.19403v1
- Date: Sat, 28 Sep 2024 16:33:43 GMT
- Title: Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
- Authors: Chu-Jie Qin, Rui-Qi Wu, Zikun Liu, Xin Lin, Chun-Le Guo, Hyun Hee Park, Chongyi Li,
- Abstract summary: All-in-one image restoration aims to handle multiple degradation types using one model.
This paper proposes a simple pipeline for all-in-one blind image restoration to Restore Anything with Masks (RAM)
- Score: 35.3663995646582
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: All-in-one image restoration aims to handle multiple degradation types using one model. This paper proposes a simple pipeline for all-in-one blind image restoration to Restore Anything with Masks (RAM). We focus on the image content by utilizing Mask Image Modeling to extract intrinsic image information rather than distinguishing degradation types like other methods. Our pipeline consists of two stages: masked image pre-training and fine-tuning with mask attribute conductance. We design a straightforward masking pre-training approach specifically tailored for all-in-one image restoration. This approach enhances networks to prioritize the extraction of image content priors from various degradations, resulting in a more balanced performance across different restoration tasks and achieving stronger overall results. To bridge the gap of input integrity while preserving learned image priors as much as possible, we selectively fine-tuned a small portion of the layers. Specifically, the importance of each layer is ranked by the proposed Mask Attribute Conductance (MAC), and the layers with higher contributions are selected for finetuning. Extensive experiments demonstrate that our method achieves state-of-the-art performance. Our code and model will be released at \href{https://github.com/Dragonisss/RAM}{https://github.com/Dragonisss/RAM}.
Related papers
- Downstream Task Guided Masking Learning in Masked Autoencoders Using
Multi-Level Optimization [42.82742477950748]
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning.
We introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that learns an optimal masking strategy during pretraining.
Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning.
arXiv Detail & Related papers (2024-02-28T07:37:26Z) - Masked Autoencoders are Efficient Class Incremental Learners [64.90846899051164]
Class Incremental Learning (CIL) aims to sequentially learn new classes while avoiding catastrophic forgetting of previous knowledge.
We propose to use Masked Autoencoders (MAEs) as efficient learners for CIL.
arXiv Detail & Related papers (2023-08-24T02:49:30Z) - Not All Image Regions Matter: Masked Vector Quantization for
Autoregressive Image Generation [78.13793505707952]
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
We propose a novel two-stage framework, which consists of Masked Quantization VAE (MQ-VAE) Stack model from modeling redundancy.
arXiv Detail & Related papers (2023-05-23T02:15:53Z) - Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z) - Stare at What You See: Masked Image Modeling without Reconstruction [154.74533119863864]
Masked Autoencoders (MAE) have been prevailing paradigms for large-scale vision representation pre-training.
Recent approaches apply semantic-rich teacher models to extract image features as the reconstruction target, leading to better performance.
We argue the features extracted by powerful teacher models already encode rich semantic correlation across regions in an intact image.
arXiv Detail & Related papers (2022-11-16T12:48:52Z) - Semantic-guided Multi-Mask Image Harmonization [10.27974860479791]
We propose a new semantic-guided multi-mask image harmonization task.
In this work, we propose a novel way to edit the inharmonious images by predicting a series of operator masks.
arXiv Detail & Related papers (2022-07-24T11:48:49Z) - GLaMa: Joint Spatial and Frequency Loss for General Image Inpainting [44.04779984090629]
The purpose of image inpainting is to recover scratches and damaged areas using context information from remaining parts.
We propose a simple yet general method to solve this problem based on the LaMa image inpainting framework, dubbed GLaMa.
Our proposed GLaMa can better capture different types of missing information by using more types of masks.
arXiv Detail & Related papers (2022-05-15T02:18:59Z) - RePaint: Inpainting using Denoising Diffusion Probabilistic Models [161.74792336127345]
Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask.
We propose RePaint: A Denoising Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.
We validate our method for both faces and general-purpose image inpainting using standard and extreme masks.
arXiv Detail & Related papers (2022-01-24T18:40:15Z) - Exploiting Deep Generative Prior for Versatile Image Restoration and
Manipulation [181.08127307338654]
This work presents an effective way to exploit the image prior captured by a generative adversarial network (GAN) trained on large-scale natural images.
The deep generative prior (DGP) provides compelling results to restore missing semantics, e.g., color, patch, resolution, of various degraded images.
arXiv Detail & Related papers (2020-03-30T17:45:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.