Resolution-robust Large Mask Inpainting with Fourier Convolutions
- URL: http://arxiv.org/abs/2109.07161v1
- Date: Wed, 15 Sep 2021 08:54:29 GMT
- Title: Resolution-robust Large Mask Inpainting with Fourier Convolutions
- Authors: Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia
Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka,
Kiwoong Park, Victor Lempitsky
- Abstract summary: Inpainting systems often struggle with large missing areas, complex geometric structures, and high-resolution images.
We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function.
We propose a new method called large mask inpainting (LaMa) to alleviate this issue.
- Score: 10.152370311844445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern image inpainting systems, despite the significant progress, often
struggle with large missing areas, complex geometric structures, and
high-resolution images. We find that one of the main reasons for that is the
lack of an effective receptive field in both the inpainting network and the
loss function. To alleviate this issue, we propose a new method called large
mask inpainting (LaMa). LaMa is based on i) a new inpainting network
architecture that uses fast Fourier convolutions, which have the image-wide
receptive field; ii) a high receptive field perceptual loss; and iii) large
training masks, which unlocks the potential of the first two components. Our
inpainting network improves the state-of-the-art across a range of datasets and
achieves excellent performance even in challenging scenarios, e.g. completion
of periodic structures. Our model generalizes surprisingly well to resolutions
that are higher than those seen at train time, and achieves this at lower
parameter and compute costs than the competitive baselines. The code is
available at https://github.com/saic-mdal/lama.
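The abstract's key architectural idea is a "Fourier unit": features are transformed to the frequency domain, mixed with a point-wise convolution, and transformed back, so every output pixel can depend on every input pixel. As an illustration only (not the authors' implementation, which also includes batch normalization, a bias, and a local branch), here is a minimal NumPy sketch under those simplifying assumptions:

```python
import numpy as np

def fourier_unit(x, w):
    """Minimal sketch of an FFC Fourier unit: FFT -> 1x1 conv + ReLU on
    stacked real/imaginary channels -> inverse FFT.

    x : (c_in, h, w) real-valued feature map
    w : (2*c_out, 2*c_in) point-wise conv weights, shared across frequencies
    """
    c_in, h, width = x.shape
    spec = np.fft.rfft2(x, axes=(-2, -1))                  # (c_in, h, width//2 + 1)
    # Treat real and imaginary parts as separate channels, as the paper describes.
    feat = np.concatenate([spec.real, spec.imag], axis=0)  # (2*c_in, h, width//2 + 1)
    # Point-wise (1x1) convolution is channel mixing at every frequency; the
    # ReLU nonlinearity in the spectral domain is what makes the unit more
    # than a spatial 1x1 convolution.
    mixed = np.maximum(np.einsum('oc,cij->oij', w, feat), 0.0)
    c_out = w.shape[0] // 2
    spec_out = mixed[:c_out] + 1j * mixed[c_out:]
    return np.fft.irfft2(spec_out, s=(h, width), axes=(-2, -1))
```

Because each frequency bin aggregates information from the whole image, editing a single input pixel can influence distant output pixels, which is the "image-wide receptive field" the abstract refers to.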
Related papers
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance on natural language processing tasks.
In this paper, we design a novel attention mechanism, derived via Taylor expansion, whose cost is linearly related to the resolution; based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z) - Learning Prior Feature and Attention Enhanced Image Inpainting [63.21231753407192]
This paper incorporates the pre-training based Masked AutoEncoder (MAE) into the inpainting model.
We propose to use attention priors from MAE to make the inpainting model learn more long-distance dependencies between masked and unmasked regions.
arXiv Detail & Related papers (2022-08-03T04:32:53Z) - Feature Refinement to Improve High Resolution Image Inpainting [1.4824891788575418]
Inpainting networks are often unable to generate globally coherent structures at resolutions higher than their training set.
We optimize the intermediate feature maps of a network by minimizing a multiscale consistency loss at inference time.
This runtime optimization improves the inpainting results and establishes a new state-of-the-art for high resolution inpainting.
arXiv Detail & Related papers (2022-06-27T21:59:12Z) - GLaMa: Joint Spatial and Frequency Loss for General Image Inpainting [44.04779984090629]
The purpose of image inpainting is to recover scratches and damaged areas using context information from remaining parts.
We propose a simple yet general method to solve this problem based on the LaMa image inpainting framework, dubbed GLaMa.
Our proposed GLaMa can better capture different types of missing information by using more types of masks.
arXiv Detail & Related papers (2022-05-15T02:18:59Z) - MAT: Mask-Aware Transformer for Large Hole Image Inpainting [79.67039090195527]
We present a novel model for large hole inpainting, which unifies the merits of transformers and convolutions.
Experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets.
arXiv Detail & Related papers (2022-03-29T06:36:17Z) - Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding [38.014569953980754]
The proposed model restores holistic image structures with a powerful attention-based transformer model in a fixed low-resolution sketch space.
Our model can be integrated with other pretrained inpainting models efficiently with the zero-initialized residual addition.
arXiv Detail & Related papers (2022-03-02T04:27:27Z) - Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z) - Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to learn sufficiently powerful representations in such complex situations.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
arXiv Detail & Related papers (2020-10-29T14:46:05Z) - Very Long Natural Scenery Image Prediction by Outpainting [96.8509015981031]
Outpainting receives less attention due to two challenges.
The first is how to keep spatial and content consistency between the generated images and the original input.
The second is how to maintain high quality in the generated results.
arXiv Detail & Related papers (2019-12-29T16:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.