T-former: An Efficient Transformer for Image Inpainting
- URL: http://arxiv.org/abs/2305.07239v2
- Date: Fri, 19 May 2023 02:11:54 GMT
- Title: T-former: An Efficient Transformer for Image Inpainting
- Authors: Ye Deng, Siqi Hui, Sanping Zhou, Deyu Meng, Jinjun Wang
- Abstract summary: A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention whose complexity is linear in the resolution, derived from a Taylor expansion, and based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
- Score: 50.43302925662507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Benefiting from powerful convolutional neural networks (CNNs), learning-based
image inpainting methods have made significant breakthroughs over the years.
However, some properties of CNNs (e.g. local prior, spatially shared parameters)
limit their performance on corrupted images with diverse and complex
forms. Recently, a class of attention-based network architectures, called
transformers, has shown strong performance in natural language processing
and high-level vision tasks. Compared with CNNs, attention operators are
better at long-range modeling and have dynamic weights, but their computational
complexity is quadratic in spatial resolution, making them less suitable for
applications involving higher-resolution images, such as image inpainting. In
this paper, we design a novel attention whose complexity is linear in the
resolution, derived from a Taylor expansion. Based on this attention, a network called
$T$-former is designed for image inpainting. Experiments on several benchmark
datasets demonstrate that our proposed method achieves state-of-the-art
accuracy while maintaining a relatively low number of parameters and
computational complexity. The code can be found at
https://github.com/dengyecode/T-former_image_inpainting
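The abstract does not say which order of Taylor expansion is used, so the following is only a minimal sketch of the general idea, assuming a first-order expansion exp(x) ≈ 1 + x applied to dot products of L2-normalized queries and keys. It is not the authors' implementation, and all function names are hypothetical. With weights of the form 1 + q·k, the sums over keys and values can be computed once and reused for every query, so the cost grows linearly with the number of positions N rather than quadratically.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit L2 norm so every dot product lies in [-1, 1]."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """Reference version: visits every (query, key) pair, O(N^2 * d)."""
    out = []
    for q in Q:
        # Assumed first-order Taylor weight in place of exp(q . k).
        weights = [1.0 + dot(q, k) for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Same output, but keys and values are summarized once, O(N * d^2)."""
    n, d, dv = len(K), len(K[0]), len(V[0])
    k_sum = [sum(k[a] for k in K) for a in range(d)]      # sum_j k_j
    v_sum = [sum(v[c] for v in V) for c in range(dv)]     # sum_j v_j
    kv = [[sum(k[a] * v[c] for k, v in zip(K, V)) for c in range(dv)]
          for a in range(d)]                              # sum_j k_j v_j^T
    out = []
    for q in Q:
        denom = n + dot(q, k_sum)                         # sum_j (1 + q . k_j)
        out.append([(v_sum[c] + sum(q[a] * kv[a][c] for a in range(d))) / denom
                    for c in range(dv)])
    return out


# Small random example: both versions should agree up to rounding error.
random.seed(0)
N, D = 32, 4
Q = [normalize([random.gauss(0, 1) for _ in range(D)]) for _ in range(N)]
K = [normalize([random.gauss(0, 1) for _ in range(D)]) for _ in range(N)]
V = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]

ref = quadratic_attention(Q, K, V)
fast = linear_attention(Q, K, V)
max_diff = max(abs(a - b) for r, f in zip(ref, fast) for a, b in zip(r, f))
```

Normalizing the queries and keys keeps every weight 1 + q·k non-negative, which is what lets the approximation behave like a softmax-style weighted average in this sketch.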
Related papers
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- CNNs for JPEGs: A Study in Computational Cost [45.74830585715129]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2023-09-20T15:49:38Z)
- WavePaint: Resource-efficient Token-mixer for Self-supervised Inpainting [2.3014300466616078]
This paper diverges from vision transformers by using a computationally-efficient WaveMix-based fully convolutional architecture -- WavePaint.
It uses a 2D-discrete wavelet transform (DWT) for spatial and multi-resolution token-mixing along with convolutional layers.
Our model even outperforms current GAN-based architectures on the CelebA-HQ dataset without using an adversarially trainable discriminator.
arXiv Detail & Related papers (2023-07-01T18:41:34Z)
- Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z)
- MAT: Mask-Aware Transformer for Large Hole Image Inpainting [79.67039090195527]
We present a novel model for large hole inpainting, which unifies the merits of transformers and convolutions.
Experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets.
arXiv Detail & Related papers (2022-03-29T06:36:17Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.