Deep Learning based Full-reference and No-reference Quality Assessment
Models for Compressed UGC Videos
- URL: http://arxiv.org/abs/2106.01111v1
- Date: Wed, 2 Jun 2021 12:23:16 GMT
- Title: Deep Learning based Full-reference and No-reference Quality Assessment
Models for Compressed UGC Videos
- Authors: Wei Sun, Tao Wang, Xiongkuo Min, Fuwang Yi, Guangtao Zhai
- Abstract summary: The framework consists of three modules: the feature extraction module, the quality regression module, and the quality pooling module.
For the feature extraction module, we fuse the features from intermediate layers of the convolutional neural network (CNN) into the final quality-aware representation.
For the quality regression module, we use a fully connected (FC) layer to regress the quality-aware features into frame-level scores.
- Score: 34.761412637585266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a deep learning based video quality assessment
(VQA) framework to evaluate the quality of compressed user generated content (UGC)
videos. The proposed VQA framework consists of three modules: the feature
extraction module, the quality regression module, and the quality pooling module.
For the feature extraction module, we fuse the features from intermediate layers
of a convolutional neural network (CNN) into the final quality-aware feature
representation, which enables the model to make
full use of visual information from low-level to high-level. Specifically, the
structure and texture similarities of feature maps extracted from all
intermediate layers are calculated as the feature representation for the full
reference (FR) VQA model, and the global mean and standard deviation of the
final feature maps fused by intermediate feature maps are calculated as the
feature representation for the no reference (NR) VQA model. For the quality
regression module, we use a fully connected (FC) layer to regress the
quality-aware features into frame-level scores. Finally, a
subjectively-inspired temporal pooling strategy is adopted to pool frame-level
scores into the video-level score. The proposed model achieves the best
performance among the state-of-the-art FR and NR VQA models on the Compressed
UGC VQA database and also achieves competitive performance on the in-the-wild
UGC VQA databases.
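As a rough illustration of the pipeline described above, the sketch below assumes a ResNet-50 backbone, computes an SSIM-style texture/structure similarity per intermediate stage for the FR branch and global mean/std statistics for the NR branch, regresses frame-level scores with a single FC layer, and uses plain average pooling as a stand-in for the subjectively-inspired temporal pooling. The backbone, formulas, and dimensions are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class QualityAwareFeatures(nn.Module):
    """Collects feature maps from the intermediate stages of a CNN backbone."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet50()  # no pretrained weights needed for the sketch
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])

    def forward(self, frames):                       # frames: (N, 3, H, W)
        x = self.stem(frames)
        maps = []
        for stage in self.stages:
            x = stage(x)
            maps.append(x)                           # one map per intermediate stage
        return maps


def fr_features(ref_maps, dist_maps, eps=1e-8):
    """FR branch: SSIM-style texture/structure similarity per intermediate stage."""
    sims = []
    for r, d in zip(ref_maps, dist_maps):
        mu_r, mu_d = r.mean(dim=(2, 3)), d.mean(dim=(2, 3))
        var_r, var_d = r.var(dim=(2, 3)), d.var(dim=(2, 3))
        cov = ((r - mu_r[..., None, None]) * (d - mu_d[..., None, None])).mean(dim=(2, 3))
        texture = (2 * mu_r * mu_d + eps) / (mu_r ** 2 + mu_d ** 2 + eps)
        structure = (2 * cov + eps) / (var_r + var_d + eps)
        sims.append(torch.cat([texture, structure], dim=1))
    return torch.cat(sims, dim=1)                    # (N, feature_dim)


def nr_features(dist_maps):
    """NR branch: global mean and std statistics (deepest maps only, a simplification)."""
    fused = dist_maps[-1]
    return torch.cat([fused.mean(dim=(2, 3)), fused.std(dim=(2, 3))], dim=1)


class FrameRegressor(nn.Module):
    """FC layer regressing quality-aware features into frame-level scores."""

    def __init__(self, in_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, 1)

    def forward(self, feats):
        return self.fc(feats).squeeze(-1)            # (N,) frame-level scores


def video_score(frame_scores):
    """Average pooling stand-in for the paper's subjectively-inspired temporal pooling."""
    return frame_scores.mean()
```

The texture and structure terms mirror SSIM-like mean and covariance statistics, which is one plausible reading of the "structure and texture similarities" in the abstract; the paper's actual fusion of intermediate feature maps for the NR branch is richer than the deepest-stage shortcut used here.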
Related papers
- ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment [35.00766551093652] (arXiv 2024-07-16)
We propose ReLaX-VQA, a novel No-Reference Video Quality Assessment (NR-VQA) model.
ReLaX-VQA uses fragments of residual frames and optical flow, along with different expressions of spatial features of the sampled frames, to enhance motion and spatial perception.
We will open source the code and trained models to facilitate further research and applications of NR-VQA.
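A minimal sketch of the residual-frame fragment idea, based only on the abstract (not the ReLaX-VQA code): frame differences act as a cheap motion cue, and small patches of the residual are kept as fragments. The patch size and random sampling scheme are hypothetical.

```python
import numpy as np


def residual_fragments(frames, patch=32, n_patches=16, rng=None):
    """frames: (T, H, W, C) uint8 array; returns residual patches for each frame pair."""
    rng = rng or np.random.default_rng(0)
    fragments = []
    for t in range(1, len(frames)):
        # signed difference between consecutive frames as a simple motion residual
        residual = frames[t].astype(np.int16) - frames[t - 1].astype(np.int16)
        h, w = residual.shape[:2]
        patches = []
        for _ in range(n_patches):
            y = rng.integers(0, h - patch + 1)
            x = rng.integers(0, w - patch + 1)
            patches.append(residual[y:y + patch, x:x + patch])
        fragments.append(np.stack(patches))          # (n_patches, patch, patch, C)
    return fragments
```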
- CLIPVQA: Video Quality Assessment via CLIP [56.94085651315878] (arXiv 2024-07-06)
We propose CLIPVQA, an efficient CLIP-based Transformer method for the VQA problem.
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
- Enhancing Blind Video Quality Assessment with Rich Quality-aware Features [79.18772373737724] (arXiv 2024-05-14)
We present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos.
We explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQA models as auxiliary features.
Experimental results demonstrate that the proposed model achieves the best performance on three public social media VQA datasets.
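The auxiliary-feature idea can be sketched as feature concatenation ahead of the regression head; the feature dimensions and the two-layer regressor below are placeholders, not the BIQA/BVQA extractors actually used in the paper.

```python
import torch
import torch.nn as nn


class AuxiliaryFusionBVQA(nn.Module):
    """Concatenates base BVQA features with auxiliary quality-aware features."""

    def __init__(self, base_dim=2048, aux_dims=(512, 768)):
        super().__init__()
        self.regressor = nn.Sequential(
            nn.Linear(base_dim + sum(aux_dims), 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, base_feats, aux_feats):
        # base_feats: (N, base_dim); aux_feats: list of (N, aux_dim_i) tensors
        fused = torch.cat([base_feats, *aux_feats], dim=1)
        return self.regressor(fused).squeeze(-1)      # quality scores
```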
- Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment [60.57703721744873] (arXiv 2022-10-11)
The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA).
In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS) to get a novel type of sample, named fragments.
With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks.
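A rough sketch of grid mini-cube sampling as described in the abstract (the official FAST-VQA implementation may differ): each frame is split into a uniform grid, one small patch is taken per cell with an offset shared across time, and the patches are stitched into a compact fragment.

```python
import numpy as np


def grid_fragment(clip, grid=7, patch=32, rng=None):
    """clip: (T, H, W, C) array; returns a (T, grid*patch, grid*patch, C) fragment."""
    rng = rng or np.random.default_rng(0)
    t, h, w, c = clip.shape
    cell_h, cell_w = h // grid, w // grid
    assert cell_h >= patch and cell_w >= patch, "grid cells must be at least patch-sized"
    out = np.zeros((t, grid * patch, grid * patch, c), dtype=clip.dtype)
    for i in range(grid):
        for j in range(grid):
            # one offset per cell, shared across time to keep temporal coherence
            y = i * cell_h + rng.integers(0, cell_h - patch + 1)
            x = j * cell_w + rng.integers(0, cell_w - patch + 1)
            out[:, i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = \
                clip[:, y:y + patch, x:x + patch]
    return out
```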
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826] (arXiv 2022-06-29)
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
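Self-supervised quality representation learning of this kind is usually driven by a contrastive objective over two views of the same clip; the NT-Xent style loss below is a generic sketch, not CONVIQT's exact formulation or augmentation scheme.

```python
import torch
import torch.nn.functional as F


def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same N video clips."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                    # (2N, D)
    sim = z @ z.t() / temperature                     # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                 # exclude self-similarity
    n = z1.size(0)
    # positive pair for row i is row i+N (and vice versa)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```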
- A Deep Learning based No-reference Quality Assessment Model for UGC Videos [44.00578772367465] (arXiv 2022-04-29)
Previous video quality assessment (VQA) studies either use image recognition models or image quality assessment (IQA) models to extract frame-level features of videos for quality regression.
We propose a very simple but effective VQA model, which trains an end-to-end spatial feature extraction network to learn the quality-aware spatial feature representation from raw pixels of the video frames.
With the better quality-aware features, we only use a simple multilayer perceptron (MLP) network to regress them into chunk-level quality scores, and then a temporal average pooling strategy is adopted to obtain the video-level quality score.
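A condensed sketch of this chunk-then-pool design, with a placeholder backbone and layer sizes: an end-to-end spatial network produces quality-aware features, an MLP maps them to chunk-level scores, and temporal average pooling yields the video-level score.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class SimpleNRVQA(nn.Module):
    """End-to-end spatial feature extractor plus MLP regressor (placeholder sizes)."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet18()
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the FC head
        self.mlp = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, chunks):                      # chunks: (B, 3, H, W) sampled frames
        feats = self.features(chunks).flatten(1)    # (B, 512) quality-aware features
        return self.mlp(feats).squeeze(-1)          # chunk-level quality scores


def video_level_score(chunk_scores):
    return chunk_scores.mean()                      # temporal average pooling
```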
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478] (arXiv 2020-05-29)
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we are able to extract 60 of the 763 statistical features used by the leading models.
Our experimental results show that the resulting model, VIDEVAL, achieves state-of-the-art performance at considerably lower computational cost than other leading models.
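The feature-selection step can be illustrated with a generic importance-based selector; the snippet below is only a stand-in, since the actual VIDEVAL selection procedure differs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def select_top_features(X, mos, k=60):
    """X: (n_videos, n_features) statistical features; mos: (n_videos,) subjective scores."""
    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, mos)
    order = np.argsort(model.feature_importances_)[::-1]   # most important first
    keep = order[:k]                                        # indices of the top-k features
    return X[:, keep], keep
```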