Audio-Visual Quality Assessment for User Generated Content: Database and Method
- URL: http://arxiv.org/abs/2303.02392v2
- Date: Wed, 27 Dec 2023 06:54:22 GMT
- Title: Audio-Visual Quality Assessment for User Generated Content: Database and Method
- Authors: Yuqin Cao, Xiongkuo Min, Wei Sun, Xiaoping Zhang, Guangtao Zhai
- Abstract summary: Most existing VQA studies only focus on the visual distortions of videos, ignoring that the user's QoE also depends on the accompanying audio signals.
We construct the first AVQA database named the SJTU-UAV database, which includes 520 in-the-wild audio and video (A/V) sequences.
We also design a family of AVQA models, which fuse the popular VQA methods and audio features via support vector regressor (SVR).
The experimental results show that with the help of audio signals, the VQA models can evaluate the quality more accurately.
- Score: 61.970768267688086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the explosive increase of User Generated Content (UGC), UGC video
quality assessment (VQA) becomes more and more important for improving users'
Quality of Experience (QoE). However, most existing UGC VQA studies only focus
on the visual distortions of videos, ignoring that the user's QoE also depends
on the accompanying audio signals. In this paper, we conduct the first study to
address the problem of UGC audio and video quality assessment (AVQA).
Specifically, we construct the first UGC AVQA database named the SJTU-UAV
database, which includes 520 in-the-wild UGC audio and video (A/V) sequences,
and conduct a user study to obtain the mean opinion scores of the A/V
sequences. The content of the SJTU-UAV database is then analyzed from both the
audio and video aspects to show the database characteristics. We also design a
family of AVQA models, which fuse the popular VQA methods and audio features
via support vector regressor (SVR). We validate the effectiveness of the
proposed models on three AVQA databases. The experimental results show that with
the help of audio signals, the VQA models can evaluate the perceptual quality
more accurately. The database will be released to facilitate further research.
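The fusion described in the abstract is straightforward to prototype. Below is a minimal sketch, assuming synthetic stand-ins for the video features (which the paper draws from existing VQA methods) and the audio features; the feature dimensions, SVR hyperparameters, and train/test split are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-ins: in practice video_feats would come from a VQA model
# and audio_feats from the accompanying audio track (both are assumptions
# here, not the paper's actual features).
n, d_video, d_audio = 520, 128, 32           # 520 A/V sequences, as in SJTU-UAV
video_feats = rng.normal(size=(n, d_video))
audio_feats = rng.normal(size=(n, d_audio))
mos = rng.uniform(1.0, 5.0, size=n)          # synthetic mean opinion scores

# Feature-level fusion: concatenate the two modalities per sequence.
X = np.hstack([video_feats, audio_feats])
X_tr, X_te, y_tr, y_te = train_test_split(X, mos, test_size=0.2, random_state=0)

# Standardize, then regress the fused features onto MOS with an SVR.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

srcc = spearmanr(y_te, pred)[0]              # rank correlation with MOS
print(f"SRCC on held-out sequences: {srcc:.3f}")
```

On real features, SRCC and PLCC against subjective MOS are the usual figures of merit; on the random features above the correlation is of course near zero.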
Related papers
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment originally focused on quantitative video quality scoring.
It is now evolving towards more comprehensive visual quality understanding tasks.
We introduce the first visual question answering instruction dataset that focuses entirely on video quality assessment.
We conduct extensive experiments on both video quality scoring and video quality understanding tasks.
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
- Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model [54.69882562863726]
We try to systematically investigate the AIGC-VQA problem from both subjective and objective quality assessment perspectives.
We evaluate the perceptual quality of AIGC videos from three dimensions: spatial quality, temporal quality, and text-to-video alignment.
We propose a Unify Generated Video Quality assessment (UGVQ) model to comprehensively and accurately evaluate the quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z)
- Perceptual Quality Assessment of Omnidirectional Audio-visual Signals [37.73157112698111]
Most existing quality assessment studies for omnidirectional videos (ODVs) only focus on the visual distortions of videos.
In this paper, we first establish a large-scale audio-visual quality assessment dataset for ODVs.
Then, we design three baseline methods for full-reference omnidirectional audio-visual quality assessment (OAVQA).
arXiv Detail & Related papers (2023-07-20T12:21:26Z)
- StarVQA+: Co-training Space-Time Attention for Video Quality Assessment [56.548364244708715]
Self-attention based Transformer has achieved great success in many computer vision tasks.
However, its application to video quality assessment (VQA) has not been satisfactory so far.
This paper presents a co-trained Space-Time Attention network for the VQA problem, termed StarVQA+.
arXiv Detail & Related papers (2023-06-21T14:27:31Z)
- MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos [39.06800945430703]
We build a first-of-a-kind subjective Live VQA database and develop an effective evaluation tool.
MD-VQA achieves state-of-the-art performance on both our Live VQA database and existing compressed VQA databases.
arXiv Detail & Related papers (2023-03-27T06:17:10Z)
- Learning to Answer Questions in Dynamic Audio-Visual Scenarios [81.19017026999218]
We focus on the Audio-Visual Question Answering (AVQA) task, which aims to answer questions regarding different visual objects, sounds, and their associations in videos.
Our dataset contains more than 45K question-answer pairs spanning different modalities and question types.
Our results demonstrate that AVQA benefits from multisensory perception and our model outperforms recent A-SIC, V-SIC, and AVQA approaches.
arXiv Detail & Related papers (2022-03-26T13:03:42Z)
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we select 60 of the 763 statistical features used by the leading models to build a fusion-based model, VIDEVAL.
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
arXiv Detail & Related papers (2020-05-29T00:39:20Z)
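For a concrete picture of the VIDEVAL-style pipeline above, here is a minimal sketch of selecting a compact subset from a large pool of statistical features before regression. The importance-based ranking and all dimensions are illustrative assumptions; the snippet above does not specify VIDEVAL's actual selection strategy.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for the 763 statistical features pooled from leading
# VQA models; real features would be computed from the videos themselves.
n_videos, n_features, k = 200, 763, 60
X = rng.normal(size=(n_videos, n_features))
y = rng.uniform(1.0, 5.0, size=n_videos)     # synthetic quality scores

# Rank features by random-forest importance and keep the top k = 60.
# (Illustrative choice; the paper's published selection strategy may differ.)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
top_k = np.argsort(forest.feature_importances_)[::-1][:k]

# Train the final quality regressor on the compact 60-feature representation.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
model.fit(X[:, top_k], y)
```

Keeping only a small feature subset is what yields the "considerably lower computational cost" claimed above: the final regressor sees 60 inputs instead of 763.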