Patch-Based Stochastic Attention for Image Editing
- URL: http://arxiv.org/abs/2202.03163v4
- Date: Wed, 1 Nov 2023 09:35:34 GMT
- Title: Patch-Based Stochastic Attention for Image Editing
- Authors: Nicolas Cherel, Andrés Almansa, Yann Gousseau, Alasdair Newson
- Abstract summary: We propose an efficient attention layer based on the algorithm PatchMatch, which is used for determining approximate nearest neighbors.
We demonstrate the usefulness of PSAL on several image editing tasks, such as image inpainting, guided image colorization, and single-image super-resolution.
- Score: 4.8201607588546
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Attention mechanisms have become of crucial importance in deep learning in
recent years. These non-local operations, which are similar to traditional
patch-based methods in image processing, complement local convolutions.
However, computing the full attention matrix is an expensive step with heavy
memory and computational loads. These limitations constrain network
architectures and performance, in particular for high resolution images. We
propose an efficient attention layer based on the stochastic algorithm
PatchMatch, which is used for determining approximate nearest neighbors. We
refer to our proposed layer as a "Patch-based Stochastic Attention Layer"
(PSAL). Furthermore, we propose different approaches, based on patch
aggregation, to ensure the differentiability of PSAL, thus allowing end-to-end
training of any network containing our layer. PSAL has a small memory footprint
and can therefore scale to high resolution images. It maintains this footprint
without sacrificing the spatial precision or globality of the nearest neighbors,
which means that it can easily be inserted at any level of a deep architecture,
even in shallower levels. We demonstrate the usefulness of PSAL on several
image editing tasks, such as image inpainting, guided image colorization, and
single-image super-resolution. Our code is available at:
https://github.com/ncherel/psal
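The abstract builds on PatchMatch, a stochastic algorithm for finding approximate nearest-neighbor patches between two images. As a rough illustration of that underlying idea only (not the paper's PSAL layer itself; all names and parameters below are illustrative), a minimal NumPy sketch of the random-initialization, propagation, and random-search loop might look like:

```python
import numpy as np

def patch_dist(A, B, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patches at (ay, ax) in A and (by, bx) in B."""
    d = A[ay:ay + p, ax:ax + p] - B[by:by + p, bx:bx + p]
    return float(np.sum(d * d))

def patchmatch(A, B, p=3, iters=4, seed=0):
    """Approximate nearest-neighbor field from patches of A to patches of B.

    Returns nnf, dist where nnf[y, x] holds the coordinates in B of the
    (approximately) closest patch to the patch at (y, x) in A.
    """
    rng = np.random.default_rng(seed)
    Ha, Wa = A.shape[0] - p + 1, A.shape[1] - p + 1
    Hb, Wb = B.shape[0] - p + 1, B.shape[1] - p + 1
    # Random initialization of the nearest-neighbor field.
    nnf = np.stack([rng.integers(0, Hb, (Ha, Wa)),
                    rng.integers(0, Wb, (Ha, Wa))], axis=-1)
    dist = np.array([[patch_dist(A, B, y, x, *nnf[y, x], p)
                      for x in range(Wa)] for y in range(Ha)])

    for it in range(iters):
        # Alternate the scan direction between iterations.
        step = 1 if it % 2 == 0 else -1
        ys = range(Ha) if step == 1 else range(Ha - 1, -1, -1)
        xs = range(Wa) if step == 1 else range(Wa - 1, -1, -1)
        for y in ys:
            for x in xs:
                # Propagation: try the shifted matches of already-visited neighbors.
                for dy, dx in ((step, 0), (0, step)):
                    py, px = y - dy, x - dx
                    if 0 <= py < Ha and 0 <= px < Wa:
                        by = int(np.clip(nnf[py, px, 0] + dy, 0, Hb - 1))
                        bx = int(np.clip(nnf[py, px, 1] + dx, 0, Wb - 1))
                        d = patch_dist(A, B, y, x, by, bx, p)
                        if d < dist[y, x]:
                            nnf[y, x] = (by, bx)
                            dist[y, x] = d
                # Random search: sample candidates in a window of shrinking radius.
                radius = max(Hb, Wb)
                while radius >= 1:
                    by = int(np.clip(nnf[y, x, 0] + rng.integers(-radius, radius + 1), 0, Hb - 1))
                    bx = int(np.clip(nnf[y, x, 1] + rng.integers(-radius, radius + 1), 0, Wb - 1))
                    d = patch_dist(A, B, y, x, by, bx, p)
                    if d < dist[y, x]:
                        nnf[y, x] = (by, bx)
                        dist[y, x] = d
                    radius //= 2
    return nnf, dist
```

The key property the paper exploits is that this search runs in time and memory linear in the number of patches, rather than quadratic as for a full attention matrix; PSAL additionally makes the operation differentiable via patch aggregation, which this sketch does not attempt.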
Related papers
- Learning to Rank Patches for Unbiased Image Redundancy Reduction [80.93989115541966]
Images suffer from heavy spatial redundancy because pixels in neighboring regions are spatially correlated.
Existing approaches strive to overcome this limitation by reducing less meaningful image regions.
We propose a self-supervised framework for image redundancy reduction called Learning to Rank Patches.
arXiv Detail & Related papers (2024-03-31T13:12:41Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown significant performance in natural language processing.
In this paper, we design a novel attention linearly related to the resolution according to Taylor expansion, and based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- DBAT: Dynamic Backward Attention Transformer for Material Segmentation with Cross-Resolution Patches [8.812837829361923]
We propose the Dynamic Backward Attention Transformer (DBAT) to aggregate cross-resolution features.
Experiments show that our DBAT achieves an accuracy of 86.85%, which is the best performance among state-of-the-art real-time models.
We further align features to semantic labels via network dissection, showing that the proposed model extracts material-related features better than other methods.
arXiv Detail & Related papers (2023-05-06T03:47:20Z)
- PATS: Patch Area Transportation with Subdivision for Local Feature Matching [78.67559513308787]
Local feature matching aims at establishing sparse correspondences between a pair of images.
We propose Patch Area Transportation with Subdivision (PATS) to tackle this issue.
PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks.
arXiv Detail & Related papers (2023-03-14T08:28:36Z)
- From Coarse to Fine: Hierarchical Pixel Integration for Lightweight Image Super-Resolution [41.0555613285837]
Transformer-based models have achieved competitive performance in image super-resolution (SR).
We propose a new attention block whose insights come from the interpretation of Local Attribution Maps (LAM) for SR networks.
In the fine area, we use an Intra-Patch Self-Attention Attribution (IPSA) module to model long-range pixel dependencies in a local patch.
arXiv Detail & Related papers (2022-11-30T06:32:34Z)
- Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z)
- HIPA: Hierarchical Patch Transformer for Single Image Super Resolution [62.7081074931892]
This paper presents HIPA, a novel Transformer architecture that progressively recovers the high resolution image using a hierarchical patch partition.
We build a cascaded model that processes an input image in multiple stages, where we start with tokens with small patch sizes and gradually merge to the full resolution.
Such a hierarchical patch mechanism not only explicitly enables feature aggregation at multiple resolutions but also adaptively learns patch-aware features for different image regions.
arXiv Detail & Related papers (2022-03-19T05:09:34Z)
- Texture Memory-Augmented Deep Patch-Based Image Inpainting [121.41395272974611]
We propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.
The framework has a novel design that allows texture memory retrieval to be trained end-to-end with the deep inpainting network.
The proposed method shows superior performance both qualitatively and quantitatively on three challenging image benchmarks.
arXiv Detail & Related papers (2020-09-28T12:09:08Z)
- High-Resolution Deep Image Matting [39.72708676319803]
HDMatt is the first deep-learning-based image matting approach for high-resolution inputs.
Our proposed method sets new state-of-the-art performance on Adobe Image Matting and AlphaMatting benchmarks.
arXiv Detail & Related papers (2020-09-14T17:53:15Z)
- PNEN: Pyramid Non-Local Enhanced Networks [23.17149002568982]
We propose a novel non-local module, the Pyramid Non-local Block, to build connections between every pixel and all remaining pixels.
Based on the proposed module, we devise a Pyramid Non-local Enhanced Networks for edge-preserving image smoothing.
We integrate it into two existing methods for image denoising and single image super-resolution, achieving consistently improved performance.
arXiv Detail & Related papers (2020-08-22T03:10:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.