Disentangling Aesthetic and Technical Effects for Video Quality
Assessment of User Generated Content
- URL: http://arxiv.org/abs/2211.04894v1
- Date: Wed, 9 Nov 2022 13:55:50 GMT
- Title: Disentangling Aesthetic and Technical Effects for Video Quality
Assessment of User Generated Content
- Authors: Haoning Wu, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu
Sun, Qiong Yan, Weisi Lin
- Abstract summary: The mechanisms of human quality perception in the UGC-VQA problem have yet to be explored.
We propose a scheme where two separate evaluators are trained with views specifically designed for each issue.
Our blind subjective studies prove that the separate evaluators in DOVER can effectively match human perception on respective disentangled quality issues.
- Score: 54.31355080688127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: User-generated-content (UGC) videos have dominated the Internet in recent
years. While many methods attempt to objectively assess the quality of these
UGC videos, the mechanisms of human quality perception in the UGC-VQA problem
have yet to be explored. To better explain the quality perception
mechanisms and learn more robust representations, we aim to disentangle the
effects of aesthetic quality issues and technical quality issues arising from the
complicated video generation processes in the UGC-VQA problem. To overcome the
absence of respective supervision during disentanglement, we propose the
Limited View Biased Supervisions (LVBS) scheme, where two separate evaluators
are trained with decomposed views specifically designed for each issue.
Composed of an Aesthetic Quality Evaluator (AQE) and a Technical Quality
Evaluator (TQE) under the LVBS scheme, the proposed Disentangled Objective
Video Quality Evaluator (DOVER) reaches excellent performance (0.91 SRCC for
KoNViD-1k, 0.89 SRCC for LSVQ, 0.88 SRCC for YouTube-UGC) in the UGC-VQA
problem. More importantly, our blind subjective studies confirm that the separate
evaluators in DOVER can effectively match human perception on the respective
disentangled quality issues. Code and demos are released at
https://github.com/teowu/dover.
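The abstract's key design is the pair of evaluators trained on decomposed views: an aesthetic branch that sees an aesthetics-oriented view of the video and a technical branch that sees a technically-oriented view, with the two scores fused into an overall prediction and validated against human opinion via SRCC. The sketch below is a minimal illustration of that flow, assuming frame-batch inputs, placeholder branch networks, downsampling/cropping as the two views, and an equal-weight fusion; the released implementation at the linked repository is the authoritative reference.

```python
# A minimal, hypothetical sketch of a DOVER-style two-branch evaluator.
# The view decompositions, branch networks, and fusion weights are
# illustrative assumptions, not the released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import spearmanr


class TwoBranchEvaluator(nn.Module):
    def __init__(self, aesthetic_net: nn.Module, technical_net: nn.Module,
                 w_aesthetic: float = 0.5, w_technical: float = 0.5):
        super().__init__()
        self.aesthetic_net = aesthetic_net  # scores the aesthetics-oriented view
        self.technical_net = technical_net  # scores the technically-oriented view
        self.w_a, self.w_t = w_aesthetic, w_technical

    def forward(self, frames: torch.Tensor) -> dict:
        # frames: (batch, channels, height, width), a batch of sampled video frames.
        # Aesthetic view: heavily downsampled frames, keeping composition and content.
        aesthetic_view = F.interpolate(frames, scale_factor=0.25, mode="bilinear",
                                       align_corners=False)
        # Technical view: a full-resolution crop, keeping local distortions intact.
        technical_view = frames[..., :224, :224]
        s_aes = self.aesthetic_net(aesthetic_view)
        s_tec = self.technical_net(technical_view)
        return {"aesthetic": s_aes, "technical": s_tec,
                "overall": self.w_a * s_aes + self.w_t * s_tec}


def srcc(predicted_scores, mean_opinion_scores):
    """Spearman rank correlation between model predictions and human MOS."""
    return spearmanr(predicted_scores, mean_opinion_scores).correlation
```

Under this reading, the reported numbers (e.g., 0.91 SRCC on KoNViD-1k) are this kind of rank correlation computed between the fused predictions and the dataset's mean opinion scores.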
Related papers
- Advancing Video Quality Assessment for AIGC [17.23281750562252]
We propose a novel loss function that combines mean absolute error with cross-entropy loss to mitigate inter-frame quality inconsistencies (a hedged sketch of such a combined loss appears after this list).
We also introduce the innovative S2CNet technique to retain critical content, while leveraging adversarial training to enhance the model's generalization capabilities.
arXiv Detail & Related papers (2024-09-23T10:36:22Z)
- Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model [54.69882562863726]
We try to systematically investigate the AIGC-VQA problem from both subjective and objective quality assessment perspectives.
We evaluate the perceptual quality of AIGC videos from three dimensions: spatial quality, temporal quality, and text-to-video alignment.
We propose a Unified Generated Video Quality assessment (UGVQ) model to comprehensively and accurately evaluate the quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z)
- KVQ: Kwai Video Quality Assessment for Short-form Videos [24.5291786508361]
We establish the first large-scale Kaleidoscope short Video database for Quality assessment, KVQ, which comprises 600 user-uploaded short videos and 3600 processed videos.
We propose the first short-form video quality evaluator, i.e., KSVQE, which enables the quality evaluator to identify the quality-determined semantics with the content understanding of large vision language models.
arXiv Detail & Related papers (2024-02-11T14:37:54Z)
- StarVQA+: Co-training Space-Time Attention for Video Quality Assessment [56.548364244708715]
The self-attention-based Transformer has achieved great success in many computer vision tasks.
However, its application to video quality assessment (VQA) has not been satisfactory so far.
This paper presents a co-trained Space-Time Attention network for the VQA problem, termed StarVQA+.
arXiv Detail & Related papers (2023-06-21T14:27:31Z)
- Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach [52.07084862209754]
We collect over two million opinions on 4,543 in-the-wild videos on 13 dimensions of quality-related factors.
Specifically, we ask subjects to choose among a positive, a negative, and a neutral label for each dimension.
These explanation-level opinions allow us to measure the relationships between specific quality factors and abstract subjective quality ratings.
arXiv Detail & Related papers (2023-05-22T05:20:23Z)
- Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video Quality Assessment [54.31355080688127]
We introduce a text-prompted Semantic Affinity Quality Index (SAQI) and its localized version (SAQI-Local) using Contrastive Language-Image Pre-training (CLIP); a hedged sketch of a CLIP-based affinity index appears after this list.
BVQI-Local demonstrates unprecedented performance, surpassing existing zero-shot indices by at least 24% on all datasets.
We conduct comprehensive analyses to investigate different quality concerns of distinct indices, demonstrating the effectiveness and rationality of our design.
arXiv Detail & Related papers (2023-04-28T08:06:05Z)
- MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos [39.06800945430703]
We build a first-of-a-kind subjective Live VQA database and develop an effective evaluation tool.
MD-VQA achieves state-of-the-art performance on both our Live VQA database and existing compressed VQA databases.
arXiv Detail & Related papers (2023-03-27T06:17:10Z)
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we extract 60 of the 763 statistical features used by the leading models and fuse them into a new model, VIDEVAL (a hedged feature-selection sketch appears after this list).
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
arXiv Detail & Related papers (2020-05-29T00:39:20Z)
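For the "Advancing Video Quality Assessment for AIGC" entry above, the summary mentions a loss that combines mean absolute error with cross-entropy to penalize inter-frame quality inconsistencies. A minimal sketch of one way such a combination could look is below; the weighting, the discretization of scores into bins, and the tensor shapes are assumptions for illustration, not the paper's definition.

```python
# Hypothetical combined MAE + cross-entropy VQA loss (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MAECrossEntropyLoss(nn.Module):
    def __init__(self, num_bins: int = 5, ce_weight: float = 0.5):
        super().__init__()
        self.num_bins = num_bins    # quality scores discretized into coarse bins (assumed)
        self.ce_weight = ce_weight  # balance between the two terms (assumed)

    def forward(self, frame_scores: torch.Tensor, bin_logits: torch.Tensor,
                mos: torch.Tensor) -> torch.Tensor:
        # frame_scores: (batch, frames) per-frame quality predictions in [0, 1]
        # bin_logits:   (batch, num_bins) logits over discretized quality levels
        # mos:          (batch,) ground-truth mean opinion scores in [0, 1]
        mae = (frame_scores.mean(dim=1) - mos).abs().mean()
        target_bins = (mos * (self.num_bins - 1)).round().long()
        ce = F.cross_entropy(bin_logits, target_bins)
        return mae + self.ce_weight * ce
```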
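For the text-prompted Semantic Affinity Quality Index entry above, the core idea is to compare frame embeddings against antonymic text prompts with CLIP and read the affinity toward the positive prompt as a quality score. The sketch below uses the public openai/CLIP package with generic prompts as assumptions; it is not the SAQI/BVQI implementation.

```python
# Hypothetical CLIP-based semantic affinity index (illustrative only).
import torch
import clip  # https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Antonymic prompt pair; the actual prompt design is an assumption here.
prompts = clip.tokenize(["a high quality photo", "a low quality photo"]).to(device)

def semantic_affinity_index(frame: Image.Image) -> float:
    """Affinity of a frame toward the positive prompt, in [0, 1]."""
    image = preprocess(frame).unsqueeze(0).to(device)
    with torch.no_grad():
        image_feat = model.encode_image(image)
        text_feat = model.encode_text(prompts)
        image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
        logits = 100.0 * image_feat @ text_feat.T  # similarities to the two prompts
        probs = logits.softmax(dim=-1)             # positive vs. negative affinity
    return probs[0, 0].item()

# A video-level index could average the per-frame values over sampled frames.
```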
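For the UGC-VQA/VIDEVAL entry above, the described recipe is to pool statistical features from leading BVQA models, select a compact subset (60 of 763), and regress MOS from that subset. A generic sketch with scikit-learn is below; the selector, regressor, and feature matrix here are stand-ins, not the released VIDEVAL pipeline.

```python
# Hypothetical feature-selection + regression pipeline in the spirit of VIDEVAL.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from scipy.stats import spearmanr

# Assumed inputs: a (videos x 763) matrix of pooled statistical features
# from existing BVQA models, and the corresponding mean opinion scores.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 763))  # placeholder feature matrix
mos = rng.uniform(1, 5, size=200)       # placeholder opinion scores

pipeline = make_pipeline(
    StandardScaler(),
    SelectKBest(score_func=f_regression, k=60),  # keep 60 of the 763 features
    SVR(kernel="rbf"),                           # regress MOS from the subset
)
pipeline.fit(features, mos)
predicted = pipeline.predict(features)
print("SRCC:", spearmanr(predicted, mos).correlation)
```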
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.