Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos
- URL: http://arxiv.org/abs/2312.15425v1
- Date: Sun, 24 Dec 2023 07:32:03 GMT
- Title: Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos
- Authors: Shankhanil Mitra and Rajiv Soundararajan
- Abstract summary: We design a self-supervised framework to generate robust quality-aware features for videos.
We then propose a dual-model based Semi-Supervised Learning (SSL) method specifically designed for the Video Quality Assessment (VQA) task.
Our SSL-VQA method uses the ST-VQRL backbone to achieve robust performance across various VQA datasets.
Our model improves state-of-the-art performance by around 10% when trained with only limited data, and by around 15% when unlabelled data is also used in SSL.
- Score: 9.681456357957819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perceptual quality assessment of user-generated content (UGC) videos is
challenging due to the requirement of large-scale human-annotated videos for
training. In this work, we address this challenge by first designing a
self-supervised Spatio-Temporal Visual Quality Representation Learning
(ST-VQRL) framework to generate robust quality-aware features for videos. Then,
we propose a dual-model based Semi-Supervised Learning (SSL) method
specifically designed for the Video Quality Assessment task (SSL-VQA), through
a novel knowledge transfer of quality predictions between the two models. Our
SSL-VQA method uses the ST-VQRL backbone to achieve robust performance across
various VQA datasets, including cross-database settings, despite being learned
with limited human-annotated videos. Our model improves state-of-the-art
performance by around 10% when trained with only limited data, and by around
15% when unlabelled data is also used in SSL. Source code and checkpoints are
available at https://github.com/Shankhanil006/SSL-VQA.
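To make the dual-model knowledge transfer concrete, the following is a minimal PyTorch-style sketch of one semi-supervised training step: both models regress MOS on labelled clips, and each model's detached prediction on unlabelled clips serves as a pseudo-label for the other. Names such as ssl_vqa_step and consistency_weight are illustrative and not the authors' actual API; see the linked repository for the real implementation.

```python
import torch.nn.functional as F

def ssl_vqa_step(model_a, model_b, labelled, mos, unlabelled,
                 opt_a, opt_b, consistency_weight=1.0):
    """One semi-supervised step: supervised MOS regression on labelled clips,
    plus mutual knowledge transfer of quality predictions on unlabelled clips."""
    # Supervised loss: both models regress the human MOS labels.
    sup_loss = (F.mse_loss(model_a(labelled).squeeze(-1), mos) +
                F.mse_loss(model_b(labelled).squeeze(-1), mos))

    # Knowledge transfer: each model's detached prediction on unlabelled
    # clips acts as a pseudo-label for the other model.
    ua = model_a(unlabelled).squeeze(-1)
    ub = model_b(unlabelled).squeeze(-1)
    transfer_loss = F.mse_loss(ua, ub.detach()) + F.mse_loss(ub, ua.detach())

    loss = sup_loss + consistency_weight * transfer_loss
    opt_a.zero_grad()
    opt_b.zero_grad()
    loss.backward()
    opt_a.step()
    opt_b.step()
    return loss.item()
```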
Related papers
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that visual question answering can markedly enhance low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatial-temporal quality details in videos.
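The token interleaving described above can be pictured with a short sketch; the shapes and the helper name interleave_tokens are assumptions for illustration, not the VQA2 codebase.

```python
import torch

def interleave_tokens(visual, motion):
    """Interleave per-frame visual tokens with motion tokens along the
    sequence axis so a transformer sees spatial and temporal cues together.
    visual, motion: (batch, num_frames, dim)."""
    b, t, d = visual.shape
    paired = torch.stack([visual, motion], dim=2)  # (b, t, 2, d)
    return paired.reshape(b, 2 * t, d)             # v1, m1, v2, m2, ...
```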
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
- CLIPVQA: Video Quality Assessment via CLIP [56.94085651315878]
We propose an efficient CLIP-based Transformer method for the VQA problem (CLIPVQA).
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
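As a rough picture of a CLIP-based VQA pipeline, the sketch below pools frozen per-frame CLIP embeddings with a small temporal transformer and regresses a quality score. The class ClipStyleVQA and its frame_encoder argument are hypothetical stand-ins, not the CLIPVQA architecture.

```python
import torch
import torch.nn as nn

class ClipStyleVQA(nn.Module):
    """Frozen per-frame image encoder (a CLIP visual tower in spirit),
    a small temporal transformer, and a linear quality head."""
    def __init__(self, frame_encoder, dim=512, nhead=8, nlayers=2):
        super().__init__()
        self.frame_encoder = frame_encoder.eval()  # kept frozen
        layer = nn.TransformerEncoderLayer(dim, nhead, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(dim, 1)

    def forward(self, video):  # video: (batch, frames, channels, h, w)
        b, t = video.shape[:2]
        with torch.no_grad():
            feats = self.frame_encoder(video.flatten(0, 1))  # (b*t, dim)
        feats = self.temporal(feats.view(b, t, -1))
        return self.head(feats.mean(dim=1)).squeeze(-1)  # one score per video
```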
arXiv Detail & Related papers (2024-07-06T02:32:28Z)
- PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild [27.195339506769457]
Video quality assessment (VQA) is a challenging problem due to the numerous factors that can affect the perceptual quality of a video.
Annotating the mean opinion score (MOS) for videos is expensive and time-consuming, which limits the scale of VQA datasets.
We propose a VQA method named PTM-VQA, which leverages PreTrained Models to transfer knowledge from models pretrained on various pre-tasks.
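The pretrained-transfer idea can be sketched as extracting features from several frozen backbones and fusing them in a light regressor; PTMStyleRegressor and its arguments are illustrative names, not PTM-VQA's interface.

```python
import torch
import torch.nn as nn

class PTMStyleRegressor(nn.Module):
    """Pool clip features from several frozen pretrained backbones,
    concatenate them, and regress MOS with a small MLP."""
    def __init__(self, backbones, feat_dims, hidden=256):
        super().__init__()
        self.backbones = nn.ModuleList(b.eval() for b in backbones)
        self.mlp = nn.Sequential(nn.Linear(sum(feat_dims), hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, clip):
        with torch.no_grad():  # backbones stay frozen
            feats = [b(clip) for b in self.backbones]  # each (batch, dim_i)
        return self.mlp(torch.cat(feats, dim=-1)).squeeze(-1)
```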
arXiv Detail & Related papers (2024-05-28T02:37:29Z)
- Enhancing Blind Video Quality Assessment with Rich Quality-aware Features [79.18772373737724]
We present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos.
We explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQA models as auxiliary features.
Experimental results demonstrate that the proposed model achieves the best performance on three public social media VQA datasets.
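One plausible reading of the auxiliary-feature idea is temporal pooling of per-frame BIQA features followed by concatenation with the base BVQA feature; the shapes assumed below are illustrative, not the paper's interface.

```python
import torch

def fuse_quality_features(base_feats, biqa_frame_feats):
    """Average per-frame BIQA features over time and concatenate them with
    the base BVQA video feature before the regression head.
    base_feats: (batch, d_base); biqa_frame_feats: (batch, frames, d_iqa)."""
    aux = biqa_frame_feats.mean(dim=1)           # temporal average pooling
    return torch.cat([base_feats, aux], dim=-1)  # (batch, d_base + d_iqa)
```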
arXiv Detail & Related papers (2024-05-14T16:32:11Z)
- KVQ: Kwai Video Quality Assessment for Short-form Videos [24.5291786508361]
We establish the first large-scale Kaleidoscope short Video database for Quality assessment, KVQ, which comprises 600 user-uploaded short videos and 3600 processed videos.
We propose the first short-form video quality evaluator, KSVQE, which identifies the quality-determined semantics using the content understanding of large vision-language models.
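A hedged sketch of coupling content understanding with quality prediction: a content embedding (for example, from a vision-language model) gates the quality features before regression. The gating form and dimensions below are assumptions, not KSVQE's design.

```python
import torch.nn as nn

class SemanticGatedQuality(nn.Module):
    """A content embedding gates the quality features before regression,
    so content-relevant quality cues are emphasized."""
    def __init__(self, content_dim=512, quality_dim=256):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(content_dim, quality_dim),
                                  nn.Sigmoid())
        self.head = nn.Linear(quality_dim, 1)

    def forward(self, quality_feats, content_emb):
        gated = quality_feats * self.gate(content_emb)
        return self.head(gated).squeeze(-1)
```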
arXiv Detail & Related papers (2024-02-11T14:37:54Z)
- Disentangling Aesthetic and Technical Effects for Video Quality Assessment of User Generated Content [54.31355080688127]
The mechanisms of human quality perception in the YouTube-VQA problem are yet to be explored.
We propose a scheme where two separate evaluators are trained with views specifically designed for each issue.
Our blind subjective studies prove that the separate evaluators in DOVER can effectively match human perception on respective disentangled quality issues.
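The two-branch idea might look like the sketch below, where an aesthetic branch sees downsampled frames and a technical branch sees local crops that preserve artifacts; the branch networks and view sizes are placeholders rather than DOVER's actual components.

```python
import torch.nn as nn
import torch.nn.functional as F

class DualViewEvaluator(nn.Module):
    """An aesthetic branch sees heavily downsampled frames (composition,
    semantics); a technical branch sees full-resolution local crops that
    preserve artifacts."""
    def __init__(self, aesthetic_net, technical_net):
        super().__init__()
        self.aesthetic_net = aesthetic_net
        self.technical_net = technical_net

    def forward(self, frames):  # frames: (batch, channels, h, w)
        aes_view = F.interpolate(frames, size=(224, 224),
                                 mode='bilinear', align_corners=False)
        tech_view = frames[..., :224, :224]  # local crop, artifacts intact
        return self.aesthetic_net(aes_view), self.technical_net(tech_view)
```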
arXiv Detail & Related papers (2022-11-09T13:55:50Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
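Self-supervised quality representation learning of this kind typically relies on a contrastive objective; the generic NT-Xent loss below illustrates the idea and is not CONVIQT's exact formulation.

```python
import torch
import torch.nn.functional as F

def ntxent_loss(z1, z2, temperature=0.1):
    """Embeddings of two views of the same clip (z1[i], z2[i]) are pulled
    together; other clips in the batch act as negatives."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)  # matched pairs on the diagonal
```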
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- Unified Quality Assessment of In-the-Wild Videos with Mixed Datasets Training [20.288424566444224]
We focus on automatically assessing the quality of in-the-wild videos in computer vision applications.
To improve the performance of quality assessment models, we borrow intuitions from human perception.
We propose a mixed datasets training strategy for training a single VQA model with multiple datasets.
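Because MOS scales differ across datasets, one plausible ingredient of mixed-datasets training is to compare predictions only within a dataset, for example with a pairwise ranking loss as sketched below; the margin-ranking form is an illustrative choice, not necessarily the paper's loss.

```python
import torch
import torch.nn.functional as F

def within_dataset_ranking_loss(preds, mos, dataset_ids, margin=0.0):
    """Compare predictions only for pairs drawn from the same dataset, so
    differing MOS scales across datasets never conflict."""
    i, j = torch.triu_indices(len(preds), len(preds), offset=1)
    same = dataset_ids[i] == dataset_ids[j]   # within-dataset pairs only
    sign = torch.sign(mos[i] - mos[j])[same]  # which clip should rank higher
    return F.margin_ranking_loss(preds[i][same], preds[j][same],
                                 sign, margin=margin)
```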
arXiv Detail & Related papers (2020-11-09T09:22:57Z)
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we extract 60 of the 763 statistical features used by these models to create a new fusion-based model, VIDEVAL.
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
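A VIDEVAL-style pipeline might be sketched with scikit-learn: rank pooled statistical features by importance, keep a small subset, and fit a regressor on the selection. The selector and regressor choices below are illustrative, not the paper's exact procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVR

# Placeholder data standing in for pooled statistical features and MOS labels.
X = np.random.rand(200, 763)  # (videos, features)
y = np.random.rand(200)       # MOS labels

# Keep the 60 most important features, then fit a support vector regressor.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold=-np.inf, max_features=60)
X_sel = selector.fit_transform(X, y)
svr = SVR(kernel="rbf").fit(X_sel, y)
print(X_sel.shape, svr.predict(X_sel[:3]))
```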
arXiv Detail & Related papers (2020-05-29T00:39:20Z)