Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos
- URL: http://arxiv.org/abs/2312.15425v1
- Date: Sun, 24 Dec 2023 07:32:03 GMT
- Title: Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos
- Authors: Shankhanil Mitra and Rajiv Soundararajan
- Abstract summary: We design a self-supervised framework to generate robust quality-aware features for videos.
We then propose a dual-model based Semi-Supervised Learning (SSL) method specifically designed for the Video Quality Assessment (VQA) task.
Our SSL-VQA method uses the ST-VQRL backbone to achieve robust performance across various VQA datasets.
Our model improves state-of-the-art performance by around 10% when trained with only limited data, and by around 15% when unlabelled data is also used in SSL.
- Score: 9.681456357957819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perceptual quality assessment of user-generated content (UGC) videos is
challenging due to the requirement of large-scale human-annotated videos for
training. In this work, we address this challenge by first designing a
self-supervised Spatio-Temporal Visual Quality Representation Learning
(ST-VQRL) framework to generate robust quality-aware features for videos. Then,
we propose a dual-model based Semi-Supervised Learning (SSL) method
specifically designed for the Video Quality Assessment task (SSL-VQA), through
a novel knowledge transfer of quality predictions between the two models. Our
SSL-VQA method uses the ST-VQRL backbone to achieve robust performance across
various VQA datasets, including cross-database settings, despite being learned
with limited human-annotated videos. Our model improves state-of-the-art
performance by around 10% when trained with only limited data, and by around
15% when unlabelled data is also used in SSL. Source code and checkpoints are
available at https://github.com/Shankhanil006/SSL-VQA.
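To make the dual-model knowledge transfer concrete, the following is a minimal PyTorch-style sketch of one semi-supervised training step: both models regress MOS on labelled clips, and each model's detached prediction on unlabelled clips serves as a pseudo-label for the other. Names such as ssl_vqa_step and consistency_weight are illustrative and not the authors' actual API; see the linked repository for the real implementation.

```python
import torch.nn.functional as F

def ssl_vqa_step(model_a, model_b, labelled, mos, unlabelled,
                 opt_a, opt_b, consistency_weight=1.0):
    """One semi-supervised step: supervised MOS regression on labelled clips,
    plus mutual knowledge transfer of quality predictions on unlabelled clips."""
    # Supervised loss: both models regress the human MOS labels.
    sup_loss = (F.mse_loss(model_a(labelled).squeeze(-1), mos) +
                F.mse_loss(model_b(labelled).squeeze(-1), mos))

    # Knowledge transfer: each model's detached prediction on unlabelled
    # clips acts as a pseudo-label for the other model.
    ua = model_a(unlabelled).squeeze(-1)
    ub = model_b(unlabelled).squeeze(-1)
    transfer_loss = F.mse_loss(ua, ub.detach()) + F.mse_loss(ub, ua.detach())

    loss = sup_loss + consistency_weight * transfer_loss
    opt_a.zero_grad()
    opt_b.zero_grad()
    loss.backward()
    opt_a.step()
    opt_b.step()
    return loss.item()
```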
Related papers
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that visual question answering can markedly enhance low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatial-temporal quality details in videos.
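The token interleaving described above can be pictured with a short sketch; the shapes and the helper name interleave_tokens are assumptions for illustration, not the VQA2 codebase.

```python
import torch

def interleave_tokens(visual, motion):
    """Interleave per-frame visual tokens with motion tokens along the
    sequence axis so a transformer sees spatial and temporal cues together.
    visual, motion: (batch, num_frames, dim)."""
    b, t, d = visual.shape
    paired = torch.stack([visual, motion], dim=2)  # (b, t, 2, d)
    return paired.reshape(b, 2 * t, d)             # v1, m1, v2, m2, ...
```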
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
- CLIPVQA: Video Quality Assessment via CLIP [56.94085651315878]
We propose an efficient CLIP-based Transformer method for the VQA problem (CLIPVQA).
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
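As a rough picture of a CLIP-based VQA pipeline, the sketch below pools frozen per-frame CLIP embeddings with a small temporal transformer and regresses a quality score. The class ClipStyleVQA and its frame_encoder argument are hypothetical stand-ins, not the CLIPVQA architecture.

```python
import torch
import torch.nn as nn

class ClipStyleVQA(nn.Module):
    """Frozen per-frame image encoder (a CLIP visual tower in spirit),
    a small temporal transformer, and a linear quality head."""
    def __init__(self, frame_encoder, dim=512, nhead=8, nlayers=2):
        super().__init__()
        self.frame_encoder = frame_encoder.eval()  # kept frozen
        layer = nn.TransformerEncoderLayer(dim, nhead, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(dim, 1)

    def forward(self, video):  # video: (batch, frames, channels, h, w)
        b, t = video.shape[:2]
        with torch.no_grad():
            feats = self.frame_encoder(video.flatten(0, 1))  # (b*t, dim)
        feats = self.temporal(feats.view(b, t, -1))
        return self.head(feats.mean(dim=1)).squeeze(-1)  # one score per video
```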
arXiv Detail & Related papers (2024-07-06T02:32:28Z)
- PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild [27.195339506769457]
Video quality assessment (VQA) is a challenging problem due to the numerous factors that can affect the perceptual quality of a video.
Annotating the mean opinion score (MOS) for videos is expensive and time-consuming, which limits the scale of VQA datasets.
We propose a VQA method named PTM-VQA, which leverages PreTrained Models to transfer knowledge from models pretrained on various pre-tasks.
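The pretrained-transfer idea can be sketched as extracting features from several frozen backbones and fusing them in a light regressor; PTMStyleRegressor and its arguments are illustrative names, not PTM-VQA's interface.

```python
import torch
import torch.nn as nn

class PTMStyleRegressor(nn.Module):
    """Pool clip features from several frozen pretrained backbones,
    concatenate them, and regress MOS with a small MLP."""
    def __init__(self, backbones, feat_dims, hidden=256):
        super().__init__()
        self.backbones = nn.ModuleList(b.eval() for b in backbones)
        self.mlp = nn.Sequential(nn.Linear(sum(feat_dims), hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, clip):
        with torch.no_grad():  # backbones stay frozen
            feats = [b(clip) for b in self.backbones]  # each (batch, dim_i)
        return self.mlp(torch.cat(feats, dim=-1)).squeeze(-1)
```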
arXiv Detail & Related papers (2024-05-28T02:37:29Z)
- Enhancing Blind Video Quality Assessment with Rich Quality-aware Features [79.18772373737724]
We present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos.
We explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQA models as auxiliary features.
Experimental results demonstrate that the proposed model achieves the best performance on three public social media VQA datasets.
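One plausible reading of the auxiliary-feature idea is temporal pooling of per-frame BIQA features followed by concatenation with the base BVQA feature; the shapes assumed below are illustrative, not the paper's interface.

```python
import torch

def fuse_quality_features(base_feats, biqa_frame_feats):
    """Average per-frame BIQA features over time and concatenate them with
    the base BVQA video feature before the regression head.
    base_feats: (batch, d_base); biqa_frame_feats: (batch, frames, d_iqa)."""
    aux = biqa_frame_feats.mean(dim=1)           # temporal average pooling
    return torch.cat([base_feats, aux], dim=-1)  # (batch, d_base + d_iqa)
```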
arXiv Detail & Related papers (2024-05-14T16:32:11Z)
- KVQ: Kwai Video Quality Assessment for Short-form Videos [24.5291786508361]
We establish the first large-scale Kaleidoscope short Video database for Quality assessment, KVQ, which comprises 600 user-uploaded short videos and 3600 processed videos.
We propose the first short-form video quality evaluator, KSVQE, which identifies the quality-determined semantics using the content understanding of large vision-language models.
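A hedged sketch of coupling content understanding with quality prediction: a content embedding (for example, from a vision-language model) gates the quality features before regression. The gating form and dimensions below are assumptions, not KSVQE's design.

```python
import torch.nn as nn

class SemanticGatedQuality(nn.Module):
    """A content embedding gates the quality features before regression,
    so content-relevant quality cues are emphasized."""
    def __init__(self, content_dim=512, quality_dim=256):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(content_dim, quality_dim),
                                  nn.Sigmoid())
        self.head = nn.Linear(quality_dim, 1)

    def forward(self, quality_feats, content_emb):
        gated = quality_feats * self.gate(content_emb)
        return self.head(gated).squeeze(-1)
```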
arXiv Detail & Related papers (2024-02-11T14:37:54Z)
- Disentangling Aesthetic and Technical Effects for Video Quality Assessment of User Generated Content [54.31355080688127]
The mechanisms of human quality perception in the YouTube-VQA problem are yet to be explored.
We propose a scheme where two separate evaluators are trained with views specifically designed for each issue.
Our blind subjective studies prove that the separate evaluators in DOVER can effectively match human perception on respective disentangled quality issues.
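The two-branch idea might look like the sketch below, where an aesthetic branch sees downsampled frames and a technical branch sees local crops that preserve artifacts; the branch networks and view sizes are placeholders rather than DOVER's actual components.

```python
import torch.nn as nn
import torch.nn.functional as F

class DualViewEvaluator(nn.Module):
    """An aesthetic branch sees heavily downsampled frames (composition,
    semantics); a technical branch sees full-resolution local crops that
    preserve artifacts."""
    def __init__(self, aesthetic_net, technical_net):
        super().__init__()
        self.aesthetic_net = aesthetic_net
        self.technical_net = technical_net

    def forward(self, frames):  # frames: (batch, channels, h, w)
        aes_view = F.interpolate(frames, size=(224, 224),
                                 mode='bilinear', align_corners=False)
        tech_view = frames[..., :224, :224]  # local crop, artifacts intact
        return self.aesthetic_net(aes_view), self.technical_net(tech_view)
```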
arXiv Detail & Related papers (2022-11-09T13:55:50Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
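Self-supervised quality representation learning of this kind typically relies on a contrastive objective; the generic NT-Xent loss below illustrates the idea and is not CONVIQT's exact formulation.

```python
import torch
import torch.nn.functional as F

def ntxent_loss(z1, z2, temperature=0.1):
    """Embeddings of two views of the same clip (z1[i], z2[i]) are pulled
    together; other clips in the batch act as negatives."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)  # matched pairs on the diagonal
```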
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- Unified Quality Assessment of In-the-Wild Videos with Mixed Datasets Training [20.288424566444224]
We focus on automatically assessing the quality of in-the-wild videos in computer vision applications.
To improve the performance of quality assessment models, we borrow intuitions from human perception.
We propose a mixed datasets training strategy for training a single VQA model with multiple datasets.
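Because MOS scales differ across datasets, one plausible ingredient of mixed-datasets training is to compare predictions only within a dataset, for example with a pairwise ranking loss as sketched below; the margin-ranking form is an illustrative choice, not necessarily the paper's loss.

```python
import torch
import torch.nn.functional as F

def within_dataset_ranking_loss(preds, mos, dataset_ids, margin=0.0):
    """Compare predictions only for pairs drawn from the same dataset, so
    differing MOS scales across datasets never conflict."""
    i, j = torch.triu_indices(len(preds), len(preds), offset=1)
    same = dataset_ids[i] == dataset_ids[j]   # within-dataset pairs only
    sign = torch.sign(mos[i] - mos[j])[same]  # which clip should rank higher
    return F.margin_ranking_loss(preds[i][same], preds[j][same],
                                 sign, margin=margin)
```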
arXiv Detail & Related papers (2020-11-09T09:22:57Z)
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we extract 60 of the 763 statistical features used by these models to create a new fusion-based model, VIDEVAL.
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
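A VIDEVAL-style pipeline might be sketched with scikit-learn: rank pooled statistical features by importance, keep a small subset, and fit a regressor on the selection. The selector and regressor choices below are illustrative, not the paper's exact procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVR

# Placeholder data standing in for pooled statistical features and MOS labels.
X = np.random.rand(200, 763)  # (videos, features)
y = np.random.rand(200)       # MOS labels

# Keep the 60 most important features, then fit a support vector regressor.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold=-np.inf, max_features=60)
X_sel = selector.fit_transform(X, y)
svr = SVR(kernel="rbf").fit(X_sel, y)
print(X_sel.shape, svr.predict(X_sel[:3]))
```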
arXiv Detail & Related papers (2020-05-29T00:39:20Z)