A Deep Learning based No-reference Quality Assessment Model for UGC
Videos
- URL: http://arxiv.org/abs/2204.14047v1
- Date: Fri, 29 Apr 2022 12:45:21 GMT
- Title: A Deep Learning based No-reference Quality Assessment Model for UGC
Videos
- Authors: Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai
- Abstract summary: Previous video quality assessment (VQA) studies either use image recognition models or image quality assessment (IQA) models to extract frame-level features of videos for quality regression.
We propose a very simple but effective VQA model that trains an end-to-end spatial feature extraction network to learn a quality-aware spatial feature representation from the raw pixels of the video frames.
With these better quality-aware features, we use only a simple multilayer perceptron (MLP) network to regress them into chunk-level quality scores, and then adopt a temporal average pooling strategy to obtain the video-level quality score.
- Score: 44.00578772367465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quality assessment for User Generated Content (UGC) videos plays an important role in ensuring the viewing experience of end-users. Previous UGC video quality assessment (VQA) studies either use image recognition models or image quality assessment (IQA) models to extract frame-level features of UGC videos for quality regression, which is sub-optimal because of the domain shift between these tasks and the UGC VQA task. In this paper, we propose a very simple but effective UGC VQA model that addresses this problem by training an end-to-end spatial feature extraction network to learn the quality-aware spatial feature representation directly from the raw pixels of the video frames. We also extract motion features to measure the temporal distortions that the spatial features cannot model. The proposed model uses very sparse frames to extract spatial features and dense frames (i.e., the video chunk) at a very low spatial resolution to extract motion features, and therefore has low computational complexity. With these better quality-aware features, we use only a simple multilayer perceptron (MLP) network to regress them into chunk-level quality scores, and then adopt a temporal average pooling strategy to obtain the video-level quality score. We further introduce a multi-scale quality fusion strategy to handle VQA across different spatial resolutions, where the multi-scale weights are derived from the contrast sensitivity function of the human visual system. Experimental results show that the proposed model achieves the best performance on five popular UGC VQA databases, demonstrating its effectiveness. The code will be made publicly available.
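As a rough sketch of the pipeline described above (not the authors' implementation), the following Python/PyTorch snippet regresses concatenated quality-aware spatial and motion features into chunk-level scores with an MLP, applies temporal average pooling to get the video-level score, and fuses per-resolution scores with assumed CSF-derived weights; all module names, feature dimensions, and weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChunkQualityRegressor(nn.Module):
    """Sketch of the MLP regression + temporal pooling stage (feature dimensions assumed)."""

    def __init__(self, spatial_dim=2048, motion_dim=256, hidden_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(spatial_dim + motion_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, spatial_feats, motion_feats):
        # spatial_feats: (num_chunks, spatial_dim), from a few full-resolution key frames
        # motion_feats:  (num_chunks, motion_dim), from dense but low-resolution frames
        x = torch.cat([spatial_feats, motion_feats], dim=-1)
        chunk_scores = self.mlp(x).squeeze(-1)  # one quality score per chunk
        return chunk_scores.mean()              # temporal average pooling -> video-level score


def multi_scale_fusion(scores_per_scale, csf_weights):
    """Weighted average of per-resolution scores; the CSF-derived weights are assumed values."""
    w = torch.tensor(csf_weights, dtype=torch.float32)
    s = torch.tensor(scores_per_scale, dtype=torch.float32)
    return float((w * s).sum() / w.sum())
```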
Related papers
- ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment [35.00766551093652]
We propose ReLaX-VQA, a novel No-Reference Video Quality Assessment (NR-VQA) model.
ReLaX-VQA uses fragments of residual frames and optical flow, along with different expressions of spatial features of the sampled frames, to enhance motion and spatial perception.
We will open source the code and trained models to facilitate further research and applications of NR-VQA.
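The residual-frame idea can be pictured with plain frame differencing; ReLaX-VQA's actual fragment and layer-stack extraction is more elaborate, so the snippet below is only a hedged illustration.

```python
import numpy as np

def residual_frames(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W, C) uint8 video; returns (T-1, H, W, C) absolute frame differences."""
    f = frames.astype(np.int16)                    # avoid uint8 wrap-around
    return np.abs(f[1:] - f[:-1]).astype(np.uint8)
```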
arXiv Detail & Related papers (2024-07-16T08:33:55Z) - CLIPVQA: Video Quality Assessment via CLIP [56.94085651315878]
We propose an efficient CLIP-based Transformer method for the VQA problem (CLIPVQA).
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
arXiv Detail & Related papers (2024-07-06T02:32:28Z) - Enhancing Blind Video Quality Assessment with Rich Quality-aware Features [79.18772373737724]
We present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos.
We explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQA models as auxiliary features.
Experimental results demonstrate that the proposed model achieves the best performance on three public social media VQA datasets.
arXiv Detail & Related papers (2024-05-14T16:32:11Z) - Neighbourhood Representative Sampling for Efficient End-to-end Video
Quality Assessment [60.57703721744873]
The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA).
In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS), to obtain a novel type of sample named fragments.
With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks.
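A hedged sketch of grid-based fragment sampling in this spirit: cut each frame into a uniform grid, crop a small patch per cell, and splice the patches into a compact fragment. The grid size, patch size, and random placement below are illustrative, not the exact St-GMS procedure.

```python
import numpy as np

def sample_fragment(frame: np.ndarray, grid: int = 7, patch: int = 32) -> np.ndarray:
    """frame: (H, W, C) array; returns a (grid*patch, grid*patch, C) spliced fragment."""
    h, w, c = frame.shape
    out = np.empty((grid * patch, grid * patch, c), dtype=frame.dtype)
    for i in range(grid):
        for j in range(grid):
            # pick a random patch inside grid cell (i, j), clamped to the frame borders
            y0 = min(i * h // grid + np.random.randint(0, max(1, h // grid - patch + 1)), h - patch)
            x0 = min(j * w // grid + np.random.randint(0, max(1, w // grid - patch + 1)), w - patch)
            out[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = frame[y0:y0 + patch, x0:x0 + patch]
    return out
```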
arXiv Detail & Related papers (2022-10-11T11:38:07Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
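Such self-supervised representation learning typically relies on a contrastive objective; the snippet below is a generic InfoNCE-style loss over two augmented views of the same clips, not CONVIQT's exact objective or augmentation pipeline.

```python
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z_a, z_b: (N, D) embeddings of two views of the same N video clips."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature                   # (N, N) cosine-similarity logits
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)                 # matching views sit on the diagonal
```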
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Deep Learning based Full-reference and No-reference Quality Assessment
Models for Compressed UGC Videos [34.761412637585266]
The framework consists of three modules: the feature extraction module, the quality regression module, and the quality pooling module.
For the feature extraction module, we fuse features from intermediate layers of the convolutional neural network (CNN) into the final quality-aware representation.
For the quality regression module, we use a fully connected (FC) layer to regress the quality-aware features into frame-level scores.
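A minimal sketch of such a three-module design, assuming a torchvision ResNet-50 backbone and global average pooling of each stage's feature maps (the paper's actual backbone and fusion details may differ):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class FrameQualityModel(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet50(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(256 + 512 + 1024 + 2048, 1)  # quality regression module

    def forward(self, frames):                    # frames: (T, 3, H, W)
        x, feats = self.stem(frames), []
        for stage in self.stages:                 # feature extraction module: fuse stage outputs
            x = stage(x)
            feats.append(self.pool(x).flatten(1))
        frame_scores = self.fc(torch.cat(feats, dim=1)).squeeze(-1)
        return frame_scores.mean()                # quality pooling module: frame scores -> video score
```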
arXiv Detail & Related papers (2021-06-02T12:23:16Z) - RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated
Content [44.03188436272383]
We introduce an effective and efficient video quality model for UGC content, which we dub the Rapid and Accurate Video Quality Evaluator (RAPIQUE).
RAPIQUE combines and leverages the advantages of both quality-aware scene statistics features and semantics-aware deep convolutional features.
Our experimental results on recent large-scale video quality databases show that RAPIQUE delivers top performances on all the datasets at a considerably lower computational expense.
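The combination can be pictured as concatenating the two feature sets and fitting a classical regressor on MOS labels; the feature extractors and the SVR choice below are assumptions, not RAPIQUE's exact recipe.

```python
import numpy as np
from sklearn.svm import SVR

def fit_quality_regressor(nss_feats: np.ndarray, deep_feats: np.ndarray, mos: np.ndarray) -> SVR:
    """nss_feats: (N, d1) scene-statistics features; deep_feats: (N, d2) CNN features; mos: (N,) labels."""
    X = np.concatenate([nss_feats, deep_feats], axis=1)  # fuse the two feature families
    return SVR(kernel="rbf", C=1.0).fit(X, mos)
```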
arXiv Detail & Related papers (2021-01-26T17:23:46Z) - UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated
Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of the leading VQA models' features, we are able to select 60 of the 763 statistical features used by those models and fuse them into a new BVQA model, dubbed VIDEVAL.
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
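One hedged way to picture such a selection step is a model-based importance ranking that keeps the top 60 of the 763 features; VIDEVAL's actual selection procedure may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def select_top_features(X: np.ndarray, y: np.ndarray, k: int = 60) -> np.ndarray:
    """X: (videos, 763) feature matrix, y: MOS labels; returns indices of the k most important features."""
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    return np.argsort(rf.feature_importances_)[::-1][:k]
```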
arXiv Detail & Related papers (2020-05-29T00:39:20Z)