An Efficient Network Design for Face Video Super-resolution
- URL: http://arxiv.org/abs/2109.13626v1
- Date: Tue, 28 Sep 2021 11:28:34 GMT
- Title: An Efficient Network Design for Face Video Super-resolution
- Authors: Feng Yu, He Li, Sige Bian, Yongming Tang
- Abstract summary: We construct a dataset consisting entirely of face video sequences for network training and evaluation.
We use three combined strategies to optimize the network parameters, together with a simultaneous train-evaluation method to accelerate the optimization process.
The generated network reduces parameters by at least 52.4% and FLOPs by 20.7%, while achieving better PSNR and SSIM.
- Score: 3.950210498382569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face video super-resolution algorithms aim to reconstruct realistic face
details from continuous input video sequences. However, existing video
processing algorithms usually carry redundant parameters in order to handle
a variety of super-resolution scenes. In this work, we focus on super-resolution
of the face areas in the original video scenes, while the remaining areas are
interpolated. This specialized super-resolution task makes it possible to cut
redundant parameters from general video super-resolution networks. We construct
a dataset consisting entirely of face video sequences for network training and
evaluation, and conduct hyper-parameter optimization in our experiments. We use
three combined strategies to optimize the network parameters, together with a
simultaneous train-evaluation method to accelerate the optimization process.
Results show that the simultaneous train-evaluation method improves the training
speed and facilitates the generation of efficient networks. The generated network
reduces parameters by at least 52.4% and FLOPs by 20.7%, while achieving better
PSNR and SSIM than state-of-the-art video super-resolution algorithms. When
processing 36x36x1x3 input video frame sequences, the efficient network delivers
real-time performance at 47.62 FPS. We name our proposal hyper-parameter
optimization for face Video Super-Resolution (HO-FVSR), which
is open-sourced at https://github.com/yphone/efficient-network-for-face-VSR.
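The core scenario described in the abstract, super-resolving only the face region while interpolating the rest of the frame, can be illustrated with a minimal sketch. Everything below (the FaceSRNet placeholder, the fixed 4x scale, the hard-coded face box, and all function names) is an illustrative assumption, not the released HO-FVSR implementation; see the linked repository for the actual code.

```python
# Illustrative sketch only: super-resolve a face crop with a small network and
# bicubically interpolate the rest of the frame, then composite the two paths.
# `FaceSRNet` and the face box are placeholders, not part of the HO-FVSR release.
import torch
import torch.nn as nn
import torch.nn.functional as F

SCALE = 4  # assumed upscaling factor

class FaceSRNet(nn.Module):
    """Placeholder face super-resolution network (stand-in for HO-FVSR)."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, channels * SCALE**2, 3, padding=1),
            nn.PixelShuffle(SCALE),          # learned upsampling of the face crop
        )

    def forward(self, x):
        return self.body(x)

def upscale_frame(frame_lr, face_box, model):
    """frame_lr: (1, C, H, W) low-res frame; face_box: (y0, y1, x0, x1) in LR coords."""
    y0, y1, x0, x1 = face_box
    # Cheap path: bicubic interpolation for the whole frame.
    frame_hr = F.interpolate(frame_lr, scale_factor=SCALE, mode="bicubic",
                             align_corners=False)
    # Expensive path: run the learned SR network only on the face crop.
    face_hr = model(frame_lr[:, :, y0:y1, x0:x1])
    # Composite the super-resolved face back into the interpolated frame.
    frame_hr[:, :, y0 * SCALE:y1 * SCALE, x0 * SCALE:x1 * SCALE] = face_hr
    return frame_hr

if __name__ == "__main__":
    model = FaceSRNet(channels=1)
    lr = torch.rand(1, 1, 36, 36)     # matches the 36x36x1 input size quoted above
    hr = upscale_frame(lr, (8, 28, 8, 28), model)
    print(hr.shape)                   # torch.Size([1, 1, 144, 144])
```

In this sketch only the face crop passes through the learned network, while the rest of the frame takes the cheap interpolation path, which is the division of labour the abstract describes.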
Related papers
- Scale-Adaptive Feature Aggregation for Efficient Space-Time Video
Super-Resolution [14.135298731079164]
We propose a novel Scale-Adaptive Feature Aggregation (SAFA) network that adaptively selects sub-networks with different processing scales for individual samples.
Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by an average of over 0.5 dB in PSNR, while requiring less than half the parameters and only one third of the computational cost.
arXiv Detail & Related papers (2023-10-26T10:18:51Z) - Differentiable Resolution Compression and Alignment for Efficient Video
Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with a Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z) - Deep Unsupervised Key Frame Extraction for Efficient Video
Classification [63.25852915237032]
This work presents an unsupervised method to retrieve the key frames, which combines a Convolutional Neural Network (CNN) with Temporal Segment Density Peaks Clustering (TSDPC).
The proposed TSDPC is a generic and powerful framework with two advantages over previous works; one is that it can determine the number of key frames automatically.
Furthermore, a Long Short-Term Memory network (LSTM) is added on top of the CNN to further improve the classification performance.
arXiv Detail & Related papers (2022-11-12T20:45:35Z) - Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR).
Most deep CNN-based SR models require massive computation to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task (a minimal pixel-unshuffle sketch appears after this list).
arXiv Detail & Related papers (2022-03-16T20:10:41Z) - Real-Time Super-Resolution System of 4K-Video Based on Deep Learning [6.182364004551161]
Video super-resolution (VSR) technology excels in reconstructing low-quality video, avoiding the unpleasant blur effect caused by interpolation-based algorithms.
This paper explores the possibility of a real-time VSR system and designs an efficient generic VSR network, termed EGVSR.
Compared with TecoGAN, the most advanced VSR network at present, we achieve an 84% reduction in computation density and a 7.92x performance speedup.
arXiv Detail & Related papers (2021-07-12T10:35:05Z) - Video Rescaling Networks with Joint Optimization Strategies for
Downscaling and Upscaling [15.630742638440998]
We present two joint optimization approaches based on invertible neural networks with coupling layers.
Our Long Short-Term Memory Video Rescaling Network (LSTM-VRN) leverages temporal information in the low-resolution video to form an explicit prediction of the missing high-frequency information for upscaling.
Our Multi-input Multi-output Video Rescaling Network (MIMO-VRN) proposes a new strategy for downscaling and upscaling a group of video frames simultaneously.
arXiv Detail & Related papers (2021-03-27T09:35:38Z) - A Deep-Unfolded Reference-Based RPCA Network For Video
Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z) - Deep Space-Time Video Upsampling Networks [47.62807427163614]
Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems.
We propose an end-to-end framework for the space-time video upsampling by efficiently merging VSR and FI into a joint framework.
The proposed framework produces better results both quantitatively and qualitatively, while reducing the runtime (7x faster) and the number of parameters (30%) compared to baselines.
arXiv Detail & Related papers (2020-04-06T07:04:21Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal synthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z) - Video Face Super-Resolution with Motion-Adaptive Feedback Cell [90.73821618795512]
Video super-resolution (VSR) methods have recently achieved remarkable success due to the development of deep convolutional neural networks (CNNs).
In this paper, we propose a Motion-Adaptive Feedback Cell (MAFC), a simple but effective block, which can efficiently capture the motion compensation and feed it back to the network in an adaptive way.
arXiv Detail & Related papers (2020-02-15T13:14:10Z)
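The Hybrid Pixel-Unshuffled Network entry above refers to pixel-unshuffle downsampling. The snippet below only demonstrates that standard operation (space-to-depth) as exposed by PyTorch's nn.PixelUnshuffle; it is not the authors' HPUN module, and the tensor shapes are arbitrary examples.

```python
# Demonstration of the standard pixel-unshuffle (space-to-depth) operation
# referenced by the HPUN summary above. Not the HPUN architecture itself;
# the shapes here are arbitrary examples.
import torch
import torch.nn as nn

x = torch.rand(1, 3, 64, 64)           # (N, C, H, W) feature map

unshuffle = nn.PixelUnshuffle(downscale_factor=2)
y = unshuffle(x)                        # spatial 2x downsampling, channels grow 4x
print(y.shape)                          # torch.Size([1, 12, 32, 32])

# The inverse operation, pixel shuffle, restores the original resolution.
shuffle = nn.PixelShuffle(upscale_factor=2)
z = shuffle(y)
print(torch.allclose(x, z))             # True: the pair is lossless
```

Because pixel-unshuffle is a pure rearrangement, it reduces spatial resolution without discarding information, which is what makes it attractive as a lightweight downsampling module.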
This list is automatically generated from the titles and abstracts of the papers on this site.