CTCNet: A CNN-Transformer Cooperation Network for Face Image
Super-Resolution
- URL: http://arxiv.org/abs/2204.08696v3
- Date: Thu, 23 Mar 2023 09:44:22 GMT
- Title: CTCNet: A CNN-Transformer Cooperation Network for Face Image
Super-Resolution
- Authors: Guangwei Gao, Zixiang Xu, Juncheng Li, Jian Yang, Tieyong Zeng and
Guo-Jun Qi
- Abstract summary: We propose an efficient CNN-Transformer Cooperation Network (CTCNet) for face super-resolution tasks.
We first devise a novel Local-Global Feature Cooperation Module (LGCM), which is composed of a Facial Structure Attention Unit (FSAU) and a Transformer block.
We then design an efficient Feature Refinement Module (FRM) to enhance the encoded features.
- Score: 64.06360660979138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, face super-resolution methods driven by deep
convolutional neural networks (CNNs) have made great progress in restoring
degraded facial details by jointly training with facial priors. However, these
methods have some obvious limitations. On the one hand, multi-task joint
learning requires additional annotation of the dataset, and the introduced
prior network significantly increases the computational cost of the model. On
the other hand, the limited receptive field of CNNs reduces the fidelity and
naturalness of the reconstructed facial images, resulting in suboptimal
reconstructions. In this work, we propose an efficient CNN-Transformer
Cooperation Network (CTCNet) for face super-resolution tasks, which uses the
multi-scale connected encoder-decoder architecture as the backbone.
Specifically, we first devise a novel Local-Global Feature Cooperation Module
(LGCM), which is composed of a Facial Structure Attention Unit (FSAU) and a
Transformer block, to promote the consistency of local facial detail and global
facial structure restoration simultaneously. Then, we design an efficient
Feature Refinement Module (FRM) to enhance the encoded features. Finally, to
further improve the restoration of fine facial details, we present a
Multi-scale Feature Fusion Unit (MFFU) to adaptively fuse the features from
different stages of the encoder. Extensive evaluations on various
datasets show that the proposed CTCNet significantly outperforms other
state-of-the-art methods. Source code will be available at
https://github.com/IVIPLab/CTCNet.
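The abstract names the building blocks but gives no layer-level detail, so the following is only a minimal NumPy sketch of the two ideas it describes: combining a local (CNN-like) path with a global (self-attention) path, and adaptively fusing features from several encoder stages. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`local_branch`, `global_branch`, `lgcm_sketch`, `fuse_multiscale`), the 3x3 mean filter standing in for convolutions, the single-head attention, the shared channel count across stages, and the softmax-weighted fusion are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_branch(feat):
    """Hypothetical stand-in for the CNN path: 3x3 mean filter plus a channel gate."""
    c, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    # Each output pixel only sees its 3x3 neighbourhood (limited receptive field).
    local = sum(padded[:, i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    # Channel gate from global average pooling, squashed to (0, 1) with a sigmoid.
    gate = 1.0 / (1.0 + np.exp(-local.mean(axis=(1, 2))))
    return local * gate[:, None, None]

def global_branch(feat, wq, wk, wv):
    """Hypothetical stand-in for the Transformer path: single-head self-attention."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T            # (N, C), one token per pixel
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(c))         # (N, N), every pixel attends to all others
    return (attn @ v).T.reshape(c, h, w)

def lgcm_sketch(feat, wq, wk, wv):
    """Residual sum of the local and global paths (an assumed way to combine them)."""
    return feat + local_branch(feat) + global_branch(feat, wq, wk, wv)

def fuse_multiscale(feats, logits):
    """Nearest-neighbour upsample square maps to the largest size, then blend them
    with softmax weights (the logits would be learned in a real model)."""
    target = max(f.shape[1] for f in feats)
    weights = softmax(np.asarray(logits, dtype=float))
    out = 0.0
    for f, wgt in zip(feats, weights):
        scale = target // f.shape[1]
        out = out + wgt * f.repeat(scale, axis=1).repeat(scale, axis=2)
    return out

# Toy usage: three encoder stages at 16x16, 8x8 and 4x4 with 4 channels each.
c = 4
wq, wk, wv = (rng.standard_normal((c, c)) * 0.1 for _ in range(3))
stages = [rng.standard_normal((c, s, s)) for s in (16, 8, 4)]
encoded = [lgcm_sketch(f, wq, wk, wv) for f in stages]
fused = fuse_multiscale(encoded, logits=[0.0, 0.0, 0.0])
print(fused.shape)  # (4, 16, 16)
```

The sketch only shows the data flow: the local path sees a small window per pixel, the attention path mixes information across the whole feature map, and the fusion step brings all stages to a common resolution before weighting them. The released code at the URL above is the reference for the actual design.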
Related papers
- W-Net: A Facial Feature-Guided Face Super-Resolution Network [8.037821981254389]
Face Super-Resolution aims to recover high-resolution (HR) face images from low-resolution (LR) ones.
Existing approaches are not ideal due to their low reconstruction efficiency and insufficient utilization of prior information.
This paper proposes a novel network architecture called W-Net to address this challenge.
(2024-06-02T09:05:40Z)
- Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks [13.815116154370834]
We introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network.
The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks.
Our work builds upon the existing CNN foundations and paves the way for future advancements in computer vision.
(2024-03-13T00:48:41Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
(2023-10-11T12:46:11Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the need for Transformers to incorporate contextual information to extract features dynamically has been neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
(2022-07-06T16:32:29Z)
- Multi-Prior Learning via Neural Architecture Search for Blind Face Restoration [61.27907052910136]
Blind Face Restoration (BFR) aims to recover high-quality face images from low-quality ones.
Current methods still suffer from two major difficulties: 1) how to derive a powerful network architecture without extensive hand tuning; 2) how to capture complementary information from multiple facial priors in one network to improve restoration performance.
We propose a Face Restoration Searching Network (FRSNet) to adaptively search the suitable feature extraction architecture within our specified search space.
(2022-06-28T12:29:53Z)
- Lightweight Bimodal Network for Single-Image Super-Resolution via Symmetric CNN and Recursive Transformer [27.51790638626891]
Single-image super-resolution (SISR) has achieved significant breakthroughs with the development of deep learning.
We propose a Lightweight Bimodal Network (LBNet) for SISR.
Specifically, an effective Symmetric CNN is designed for local feature extraction and coarse image reconstruction.
(2022-04-28T04:43:22Z)
- TANet: A new Paradigm for Global Face Super-resolution via Transformer-CNN Aggregation Network [72.41798177302175]
We propose a novel paradigm based on the self-attention mechanism (i.e., the core of Transformer) to fully explore the representation capacity of the facial structure feature.
Specifically, we design a Transformer-CNN aggregation network (TANet) consisting of two paths, in which one path uses CNNs responsible for restoring fine-grained facial details.
By aggregating the features from the above two paths, the consistency of global facial structure and fidelity of local facial detail restoration are strengthened simultaneously.
(2021-09-16T18:15:07Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task.
We present a novel architecture with the goal of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
(2020-03-15T11:04:30Z)