TANet: A new Paradigm for Global Face Super-resolution via
Transformer-CNN Aggregation Network
- URL: http://arxiv.org/abs/2109.08174v1
- Date: Thu, 16 Sep 2021 18:15:07 GMT
- Title: TANet: A new Paradigm for Global Face Super-resolution via
Transformer-CNN Aggregation Network
- Authors: Yuanzhi Wang, Tao Lu, Yanduo Zhang, Junjun Jiang, Jiaming Wang,
Zhongyuan Wang, Jiayi Ma
- Abstract summary: We propose a novel paradigm based on the self-attention mechanism (i.e., the core of Transformer) to fully explore the representation capacity of the facial structure feature.
Specifically, we design a Transformer-CNN aggregation network (TANet) consisting of two paths, in which one path uses CNNs to restore fine-grained facial details while the other uses a resource-friendly Transformer to capture global information.
By aggregating the features from the above two paths, the consistency of global facial structure and fidelity of local facial detail restoration are strengthened simultaneously.
- Score: 72.41798177302175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, face super-resolution (FSR) methods either feed the whole
face image into convolutional neural networks (CNNs) or utilize extra facial priors (e.g.,
facial parsing maps, facial landmarks) to focus on facial structure, thereby
maintaining the consistency of the facial structure while restoring facial
details. However, the limited receptive fields of CNNs and inaccurate facial
priors will reduce the naturalness and fidelity of the reconstructed face. In
this paper, we propose a novel paradigm based on the self-attention mechanism
(i.e., the core of Transformer) to fully explore the representation capacity of
the facial structure feature. Specifically, we design a Transformer-CNN
aggregation network (TANet) consisting of two paths, in which one path uses
CNNs responsible for restoring fine-grained facial details while the other
utilizes a resource-friendly Transformer to capture global information by
exploiting long-distance visual relation modeling. By aggregating the
features from the above two paths, the consistency of global facial structure
and fidelity of local facial detail restoration are strengthened
simultaneously. Experimental results of face reconstruction and recognition
verify that the proposed method can significantly outperform the
state-of-the-art methods.
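The abstract describes two complementary paths: a CNN path whose small kernels restore local detail but have a limited receptive field, and a self-attention path in which every position attends to every other, giving a global receptive field over the facial structure. A minimal NumPy sketch of that idea follows; the shapes, single-head attention, single 3x3 kernel, and fusion-by-addition are all illustrative assumptions, not the paper's actual TANet layers or aggregation operator.

```python
import numpy as np

def self_attention(x):
    """Global path: plain single-head self-attention.

    x: (n_tokens, dim) feature matrix. Every token attends to every
    other token, so the receptive field spans the whole input.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # (n, n) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn @ x                               # attention-weighted mix

def conv3x3(img, kernel):
    """Local path: one 3x3 convolution (zero padding, stride 1).

    img: (H, W) single-channel map; the receptive field is only 3x3,
    which is the CNN limitation the abstract points at.
    """
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i+3, j:j+3] * kernel)
    return out

# Toy 8x8 "face" feature map (stand-in for real encoder features).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8))

local = conv3x3(feat, rng.standard_normal((3, 3)))  # fine-grained details
global_ = self_attention(feat)                      # rows treated as tokens
fused = local + global_                             # aggregate the two paths
```

The contrast to note is in the weights: each output pixel of `conv3x3` depends on only 9 neighbors, while each row of `self_attention`'s output is a weighted mixture of all 8 token rows at once.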
Related papers
- W-Net: A Facial Feature-Guided Face Super-Resolution Network [8.037821981254389]
Face Super-Resolution aims to recover high-resolution (HR) face images from low-resolution (LR) ones.
Existing approaches are not ideal due to their low reconstruction efficiency and insufficient utilization of prior information.
This paper proposes a novel network architecture called W-Net to address this challenge.
arXiv Detail & Related papers (2024-06-02T09:05:40Z)
- Face Super-Resolution with Progressive Embedding of Multi-scale Face Priors [4.649637261351803]
We propose a novel recurrent convolutional network based framework for face super-resolution.
We take full advantage of the intermediate outputs of the recurrent network, from which facial landmark and facial action unit (AU) information are extracted.
Our proposed method significantly outperforms state-of-the-art FSR methods in terms of image quality and facial details restoration.
arXiv Detail & Related papers (2022-10-12T08:16:52Z)
- Multi-Prior Learning via Neural Architecture Search for Blind Face Restoration [61.27907052910136]
Blind Face Restoration (BFR) aims to recover high-quality face images from low-quality ones.
Current methods still suffer from two major difficulties: 1) how to derive a powerful network architecture without extensive hand tuning; 2) how to capture complementary information from multiple facial priors in one network to improve restoration performance.
We propose a Face Restoration Searching Network (FRSNet) to adaptively search the suitable feature extraction architecture within our specified search space.
arXiv Detail & Related papers (2022-06-28T12:29:53Z)
- Enhancing Quality of Pose-varied Face Restoration with Local Weak Feature Sensing and GAN Prior [29.17397958948725]
We propose a well-designed blind face restoration network with generative facial prior.
Our model performs superior to the prior art for face restoration and face super-resolution tasks.
arXiv Detail & Related papers (2022-05-28T09:23:48Z)
- CTCNet: A CNN-Transformer Cooperation Network for Face Image Super-Resolution [64.06360660979138]
We propose an efficient CNN-Transformer Cooperation Network (CTCNet) for face super-resolution tasks.
We first devise a novel Local-Global Feature Cooperation Module (LGCM), which is composed of a Facial Structure Attention Unit (FSAU) and a Transformer block.
We then design an efficient Feature Refinement Module (FRM) to enhance the encoded features.
arXiv Detail & Related papers (2022-04-19T06:38:29Z)
- Face Deblurring Based on Separable Normalization and Adaptive Denormalization [25.506065804812522]
Face deblurring aims to restore a clear face image from a blurred input image with more explicit structure and facial details.
We design an effective face deblurring network based on separable normalization and adaptive denormalization.
Experimental results on both CelebA and CelebA-HQ datasets demonstrate that the proposed face deblurring network restores face structure with more facial details.
arXiv Detail & Related papers (2021-12-18T03:42:23Z)
- Face Hallucination via Split-Attention in Split-Attention Network [58.30436379218425]
Convolutional neural networks (CNNs) have been widely employed for face hallucination.
We propose a novel external-internal split attention group (ESAG) to take into account the overall facial profile and fine texture details simultaneously.
By fusing the features from these two paths, the consistency of facial structure and the fidelity of facial details are strengthened.
arXiv Detail & Related papers (2020-10-22T10:09:31Z)
- DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition [94.96686189033869]
We propose a 3D model-assisted domain-transferred face augmentation network (DotFAN)
DotFAN can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains.
Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity.
arXiv Detail & Related papers (2020-02-23T08:16:34Z)
- Exploiting Semantics for Face Image Deblurring [121.44928934662063]
We propose an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks.
We incorporate face semantic labels as input priors and propose an adaptive structural loss to regularize facial local structures.
The proposed method restores sharp images with more accurate facial features and details.
arXiv Detail & Related papers (2020-01-19T13:06:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.