GVTNet: Graph Vision Transformer For Face Super-Resolution
- URL: http://arxiv.org/abs/2502.12570v1
- Date: Tue, 18 Feb 2025 06:15:02 GMT
- Title: GVTNet: Graph Vision Transformer For Face Super-Resolution
- Authors: Chao Yang, Yong Fan, Cheng Lu, Minghao Yuan, Zhijing Yang
- Abstract summary: We propose a transformer architecture based on graph neural networks called graph vision transformer network.
We treat each patch as a graph node and establish an adjacency matrix based on the information between patches.
In this way, each patch interacts only with its neighboring patches, which better captures the relationships between facial components.
- Score: 9.27272284458893
- License:
- Abstract: Recent advances in face super-resolution research have utilized the Transformer architecture. This method processes the input image into a series of small patches. However, because of the strong correlation between different facial components in facial images, existing algorithms cannot handle the relationships between patches well when super-resolving low-resolution images, resulting in distorted facial components in the super-resolution results. To solve this problem, we propose a transformer architecture based on graph neural networks called graph vision transformer network. We treat each patch as a graph node and establish an adjacency matrix based on the information between patches. In this way, each patch interacts only with its neighboring patches, which better captures the relationships between facial components. Quantitative and visualization experiments have underscored the superiority of our algorithm over state-of-the-art techniques. Through detailed comparisons, we have demonstrated that our algorithm possesses more advanced super-resolution capabilities, particularly in enhancing facial components. The PyTorch code is available at https://github.com/continueyang/GVTNet
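The patch-as-node idea in the abstract can be illustrated with a minimal sketch. This is not the authors' PyTorch implementation: the abstract does not say how the adjacency matrix is built, so the k-nearest-neighbor rule by cosine similarity used here is an assumption, the names `knn_adjacency` and `graph_attention` are hypothetical, and the learned query/key/value projections of a real transformer are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_adjacency(feats, k):
    """Assumed rule: adj[i][j] = 1 if j is i itself or one of i's k most similar patches."""
    n = len(feats)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(feats[i], feats[j]),
            reverse=True,
        )
        adj[i][i] = 1  # self-loop so every node has at least one neighbor
        for j in others[:k]:
            adj[i][j] = 1
    return adj


def graph_attention(feats, adj):
    """Scaled dot-product attention restricted to each node's graph neighbors."""
    d = len(feats[0])
    out = []
    for i, query in enumerate(feats):
        nbrs = [j for j, connected in enumerate(adj[i]) if connected]
        scores = [
            sum(a * b for a, b in zip(query, feats[j])) / math.sqrt(d) for j in nbrs
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * feats[j][c] for w, j in zip(weights, nbrs)) for c in range(d)]
        )
    return out


# Four toy "patch" embeddings: two similar pairs.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
adj = knn_adjacency(patches, k=1)
updated = graph_attention(patches, adj)
```

With `k=1`, each toy patch attends only to itself and its single most similar patch, so the two pairs never exchange information. Full self-attention would instead mix all four patches.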
Related papers
- Transformer based Pluralistic Image Completion with Reduced Information Loss [72.92754600354199]
Transformer based methods have achieved great success in image inpainting recently.
They regard each pixel as a token, thus suffering from an information loss issue.
We propose a new transformer based framework called "PUT".
arXiv Detail & Related papers (2024-03-31T01:20:16Z) - Learning from small data sets: Patch-based regularizers in inverse problems for image reconstruction [1.1650821883155187]
Recent advances in machine learning require a huge amount of data and computer capacity to train the networks.
Our paper addresses the issue of learning from small data sets by taking patches of very few images into account.
We show how we can achieve uncertainty quantification by approximating the posterior using Langevin Monte Carlo methods.
arXiv Detail & Related papers (2023-12-27T15:30:05Z) - T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention linearly related to the resolution according to Taylor expansion, and based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z) - Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z) - Graph Reasoning Transformer for Image Parsing [67.76633142645284]
We propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.
Compared to the conventional transformer, GReaT has higher interaction efficiency and a more purposeful interaction pattern.
Results show that GReaT achieves consistent performance gains with slight computational overheads on the state-of-the-art transformer baselines.
arXiv Detail & Related papers (2022-09-20T08:21:37Z) - HIPA: Hierarchical Patch Transformer for Single Image Super Resolution [62.7081074931892]
This paper presents HIPA, a novel Transformer architecture that progressively recovers the high resolution image using a hierarchical patch partition.
We build a cascaded model that processes an input image in multiple stages, where we start with tokens with small patch sizes and gradually merge to the full resolution.
Such a hierarchical patch mechanism not only explicitly enables feature aggregation at multiple resolutions but also adaptively learns patch-aware features for different image regions.
arXiv Detail & Related papers (2022-03-19T05:09:34Z) - A new face swap method for image and video domains: a technical report [60.47144478048589]
We introduce a new face swap pipeline that is based on the FaceShifter architecture.
A new eye loss function, a super-resolution block, and Gaussian-based face mask generation lead to improvements in quality.
arXiv Detail & Related papers (2022-02-07T10:15:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.