An Efficient Network Design for Face Video Super-resolution
- URL: http://arxiv.org/abs/2109.13626v1
- Date: Tue, 28 Sep 2021 11:28:34 GMT
- Title: An Efficient Network Design for Face Video Super-resolution
- Authors: Feng Yu, He Li, Sige Bian, Yongming Tang
- Abstract summary: We construct a dataset consisting entirely of face video sequences for network training and evaluation.
We use three combined strategies to optimize the network parameters, together with a simultaneous train-evaluation method to accelerate the optimization process.
The generated network reduces parameters by at least 52.4% and FLOPs by 20.7%, while achieving better PSNR and SSIM.
- Score: 3.950210498382569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face video super-resolution algorithms aim to reconstruct realistic face
details from continuous input video sequences. However, existing video
processing algorithms usually carry redundant parameters in order to handle
a variety of super-resolution scenes. In this work, we focus on super-resolution
of the face areas in the original video scenes, while the remaining areas are
interpolated. This specialized super-resolution task makes it possible to cut
redundant parameters from general video super-resolution networks. We construct
a dataset consisting entirely of face video sequences for network training and
evaluation, and conduct hyper-parameter optimization in our experiments. We use
three combined strategies to optimize the network parameters, together with a
simultaneous train-evaluation method to accelerate the optimization process.
Results show that the simultaneous train-evaluation method improves the training
speed and facilitates the generation of efficient networks. The generated network
reduces parameters by at least 52.4% and FLOPs by 20.7%, while achieving better
PSNR and SSIM than state-of-the-art video super-resolution algorithms. When
processing 36x36x1x3 input video frame sequences, the efficient network delivers
real-time performance at 47.62 FPS. We name our proposal hyper-parameter
optimization for face Video Super-Resolution (HO-FVSR), which
is open-sourced at https://github.com/yphone/efficient-network-for-face-VSR.
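The core scenario described in the abstract, super-resolving only the face region while interpolating the rest of the frame, can be illustrated with a minimal sketch. Everything below (the FaceSRNet placeholder, the fixed 4x scale, the hard-coded face box, and all function names) is an illustrative assumption, not the released HO-FVSR implementation; see the linked repository for the actual code.

```python
# Illustrative sketch only: super-resolve a face crop with a small network and
# bicubically interpolate the rest of the frame, then composite the two paths.
# `FaceSRNet` and the face box are placeholders, not part of the HO-FVSR release.
import torch
import torch.nn as nn
import torch.nn.functional as F

SCALE = 4  # assumed upscaling factor

class FaceSRNet(nn.Module):
    """Placeholder face super-resolution network (stand-in for HO-FVSR)."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, channels * SCALE**2, 3, padding=1),
            nn.PixelShuffle(SCALE),          # learned upsampling of the face crop
        )

    def forward(self, x):
        return self.body(x)

def upscale_frame(frame_lr, face_box, model):
    """frame_lr: (1, C, H, W) low-res frame; face_box: (y0, y1, x0, x1) in LR coords."""
    y0, y1, x0, x1 = face_box
    # Cheap path: bicubic interpolation for the whole frame.
    frame_hr = F.interpolate(frame_lr, scale_factor=SCALE, mode="bicubic",
                             align_corners=False)
    # Expensive path: run the learned SR network only on the face crop.
    face_hr = model(frame_lr[:, :, y0:y1, x0:x1])
    # Composite the super-resolved face back into the interpolated frame.
    frame_hr[:, :, y0 * SCALE:y1 * SCALE, x0 * SCALE:x1 * SCALE] = face_hr
    return frame_hr

if __name__ == "__main__":
    model = FaceSRNet(channels=1)
    lr = torch.rand(1, 1, 36, 36)     # matches the 36x36x1 input size quoted above
    hr = upscale_frame(lr, (8, 28, 8, 28), model)
    print(hr.shape)                   # torch.Size([1, 1, 144, 144])
```

In this sketch only the face crop passes through the learned network, while the rest of the frame takes the cheap interpolation path, which is the division of labour the abstract describes.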
Related papers
- Scale-Adaptive Feature Aggregation for Efficient Space-Time Video
Super-Resolution [14.135298731079164]
We propose a novel Scale-Adaptive Feature Aggregation (SAFA) network that adaptively selects sub-networks with different processing scales for individual samples.
Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by an average of over 0.5 dB in PSNR, while requiring less than half the parameters and only one third of the computational cost.
arXiv Detail & Related papers (2023-10-26T10:18:51Z) - Differentiable Resolution Compression and Alignment for Efficient Video
Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with a Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z) - Deep Unsupervised Key Frame Extraction for Efficient Video
Classification [63.25852915237032]
This work presents an unsupervised method to retrieve the key frames, which combines a Convolutional Neural Network (CNN) with Temporal Segment Density Peaks Clustering (TSDPC).
The proposed TSDPC is a generic and powerful framework with two advantages over previous works; one is that it can determine the number of key frames automatically.
Furthermore, a Long Short-Term Memory network (LSTM) is added on top of the CNN to further improve the classification performance.
arXiv Detail & Related papers (2022-11-12T20:45:35Z) - Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR).
Most deep CNN-based SR models require massive computation to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task (a minimal pixel-unshuffle sketch appears after this list).
arXiv Detail & Related papers (2022-03-16T20:10:41Z) - Real-Time Super-Resolution System of 4K-Video Based on Deep Learning [6.182364004551161]
Video super-resolution (VSR) technology excels in reconstructing low-quality video, avoiding the unpleasant blur effect caused by interpolation-based algorithms.
This paper explores the possibility of a real-time VSR system and designs an efficient generic VSR network, termed EGVSR.
Compared with TecoGAN, the most advanced VSR network at present, we achieve an 84% reduction in computation density and a 7.92x performance speedup.
arXiv Detail & Related papers (2021-07-12T10:35:05Z) - Video Rescaling Networks with Joint Optimization Strategies for
Downscaling and Upscaling [15.630742638440998]
We present two joint optimization approaches based on invertible neural networks with coupling layers.
Our Long Short-Term Memory Video Rescaling Network (LSTM-VRN) leverages temporal information in the low-resolution video to form an explicit prediction of the missing high-frequency information for upscaling.
Our Multi-input Multi-output Video Rescaling Network (MIMO-VRN) proposes a new strategy for downscaling and upscaling a group of video frames simultaneously.
arXiv Detail & Related papers (2021-03-27T09:35:38Z) - A Deep-Unfolded Reference-Based RPCA Network For Video
Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z) - Deep Space-Time Video Upsampling Networks [47.62807427163614]
Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems.
We propose an end-to-end framework for the space-time video upsampling by efficiently merging VSR and FI into a joint framework.
The proposed framework produces better results both quantitatively and qualitatively, while reducing the runtime (7x faster) and the number of parameters (30%) compared to baselines.
arXiv Detail & Related papers (2020-04-06T07:04:21Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal synthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z) - Video Face Super-Resolution with Motion-Adaptive Feedback Cell [90.73821618795512]
Video super-resolution (VSR) methods have recently achieved remarkable success due to the development of deep convolutional neural networks (CNNs).
In this paper, we propose a Motion-Adaptive Feedback Cell (MAFC), a simple but effective block, which can efficiently capture the motion compensation and feed it back to the network in an adaptive way.
arXiv Detail & Related papers (2020-02-15T13:14:10Z)
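The Hybrid Pixel-Unshuffled Network entry above refers to pixel-unshuffle downsampling. The snippet below only demonstrates that standard operation (space-to-depth) as exposed by PyTorch's nn.PixelUnshuffle; it is not the authors' HPUN module, and the tensor shapes are arbitrary examples.

```python
# Demonstration of the standard pixel-unshuffle (space-to-depth) operation
# referenced by the HPUN summary above. Not the HPUN architecture itself;
# the shapes here are arbitrary examples.
import torch
import torch.nn as nn

x = torch.rand(1, 3, 64, 64)           # (N, C, H, W) feature map

unshuffle = nn.PixelUnshuffle(downscale_factor=2)
y = unshuffle(x)                        # spatial 2x downsampling, channels grow 4x
print(y.shape)                          # torch.Size([1, 12, 32, 32])

# The inverse operation, pixel shuffle, restores the original resolution.
shuffle = nn.PixelShuffle(upscale_factor=2)
z = shuffle(y)
print(torch.allclose(x, z))             # True: the pair is lossless
```

Because pixel-unshuffle is a pure rearrangement, it reduces spatial resolution without discarding information, which is what makes it attractive as a lightweight downsampling module.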
This list is automatically generated from the titles and abstracts of the papers on this site.