Perceptually-inspired super-resolution of compressed videos
- URL: http://arxiv.org/abs/2106.08147v1
- Date: Tue, 15 Jun 2021 13:50:24 GMT
- Title: Perceptually-inspired super-resolution of compressed videos
- Authors: Di Ma, Mariana Afonso, Fan Zhang and David R. Bull
- Abstract summary: Spatial resolution adaptation is a technique which has often been employed in video compression to enhance coding efficiency.
Recent work has employed advanced super-resolution methods based on convolutional neural networks (CNNs) to further improve reconstruction quality.
In this paper, a perceptually-inspired super-resolution approach (M-SRGAN) is proposed for spatial up-sampling of compressed video using a modified CNN model.
- Score: 18.72040343193715
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Spatial resolution adaptation is a technique which has often been employed in
video compression to enhance coding efficiency. This approach encodes a lower
resolution version of the input video and reconstructs the original resolution
during decoding. Instead of using conventional up-sampling filters, recent work
has employed advanced super-resolution methods based on convolutional neural
networks (CNNs) to further improve reconstruction quality. These approaches are
usually trained to minimise pixel-based losses such as Mean-Squared Error
(MSE), despite the fact that this type of loss metric does not correlate well
with subjective opinions. In this paper, a perceptually-inspired
super-resolution approach (M-SRGAN) is proposed for spatial up-sampling of
compressed video using a modified CNN model, which has been trained using a
generative adversarial network (GAN) on compressed content with perceptual loss
functions. The proposed method was integrated with HEVC HM 16.20, and has been
evaluated on the JVET Common Test Conditions (UHD test sequences) using the
Random Access configuration. The results show evident perceptual quality
improvement over the original HM 16.20, with an average bitrate saving of 35.6%
(Bj{\o}ntegaard Delta measurement) based on a perceptual quality metric, VMAF.
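The reported 35.6% saving is a Bjøntegaard Delta (BD) rate: the average bitrate difference between two rate-quality curves over their overlapping quality range. A minimal sketch of the standard BD-rate computation (cubic polynomial fit to log-rate versus quality, then integration) is shown below; the bitrates and VMAF scores are purely illustrative and are not taken from the paper.

```python
import numpy as np

def bd_rate(rate_anchor, q_anchor, rate_test, q_test):
    """Bjontegaard Delta rate: average bitrate difference (%) between two
    rate-quality curves over their shared quality interval."""
    # Fit cubic polynomials to log-rate as a function of quality.
    p_a = np.polyfit(q_anchor, np.log(rate_anchor), 3)
    p_t = np.polyfit(q_test, np.log(rate_test), 3)
    # Integrate both fits over the overlapping quality range.
    lo = max(min(q_anchor), min(q_test))
    hi = min(max(q_anchor), max(q_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# Synthetic example: the test codec needs half the bitrate of the anchor
# at every quality point, so the BD-rate is -50% (a 50% bitrate saving).
vmaf = [70.0, 80.0, 88.0, 93.0]
rates_anchor = [1000.0, 2000.0, 4000.0, 8000.0]  # kbps, illustrative
rates_test = [r / 2 for r in rates_anchor]
print(round(bd_rate(rates_anchor, vmaf, rates_test, vmaf), 2))  # -50.0
```

A negative BD-rate means the test codec achieves the same quality at a lower bitrate; the paper's -35.6% figure uses VMAF as the quality axis rather than PSNR.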
Related papers
- Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference.
We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data [18.877077302923713]
We present a video compression-based degradation model to synthesize low-resolution image data in the blind SISR task.
Our proposed image synthesizing method is widely applicable to existing image datasets.
By introducing video coding artifacts to SISR degradation models, neural networks can super-resolve images with the ability to restore video compression degradations.
arXiv Detail & Related papers (2023-11-02T05:24:19Z)
- Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment [60.57703721744873]
The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA).
In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS) to get a novel type of sample, named fragments.
With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks.
arXiv Detail & Related papers (2022-10-11T11:38:07Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Video Coding for Machines with Feature-Based Rate-Distortion Optimization [7.804710977378487]
With the steady improvement of neural networks, more and more multimedia data is not observed by humans anymore.
We propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance.
We compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO.
arXiv Detail & Related papers (2022-03-11T12:49:50Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- Super-Resolving Compressed Video in Coding Chain [27.994055823226848]
We present a mixed-resolution coding framework, which cooperates with a reference-based DCNN.
In this novel coding chain, the reference-based DCNN learns the direct mapping from low-resolution (LR) compressed video to its high-resolution (HR) clean version at the decoder side.
arXiv Detail & Related papers (2021-03-26T03:39:54Z)
- Video Quality Enhancement Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients in MPEG I-frames [0.0]
We propose an MPEG video decoder based on frequency-to-frequency domain prediction.
It reads the quantized DCT coefficients received from a low-quality I-frames bitstream and, using a deep learning-based model, predicts the missing coefficients in order to recompose the same frames with enhanced quality.
arXiv Detail & Related papers (2020-10-09T16:41:18Z)
- Video compression with low complexity CNN-based spatial resolution adaptation [15.431248645312309]
Spatial resolution adaptation can be integrated within video compression to improve overall coding performance.
A novel framework is proposed which supports the flexible allocation of complexity between the encoder and decoder.
arXiv Detail & Related papers (2020-07-29T10:20:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.