Perceptual Quality Assessment of Omnidirectional Audio-visual Signals
- URL: http://arxiv.org/abs/2307.10813v1
- Date: Thu, 20 Jul 2023 12:21:26 GMT
- Title: Perceptual Quality Assessment of Omnidirectional Audio-visual Signals
- Authors: Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li
Chen, Xiongkuo Min, Guangtao Zhai
- Abstract summary: Most existing quality assessment studies for omnidirectional videos (ODVs) only focus on the visual distortions of videos.
In this paper, we first establish a large-scale audio-visual quality assessment dataset for ODVs.
Then, we design three baseline methods for full-reference omnidirectional audio-visual quality assessment (OAVQA).
- Score: 37.73157112698111
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Omnidirectional videos (ODVs) play an increasingly important role in
application fields such as medicine, education, advertising, and tourism. Assessing
the quality of ODVs is important for service providers seeking to improve the user's
Quality of Experience (QoE). However, most existing quality assessment studies
for ODVs only focus on the visual distortions of videos, while ignoring that
the overall QoE also depends on the accompanying audio signals. In this paper,
we first establish a large-scale audio-visual quality assessment dataset for
omnidirectional videos, which includes 375 distorted omnidirectional
audio-visual (A/V) sequences generated from 15 high-quality pristine
omnidirectional A/V contents, and the corresponding perceptual audio-visual
quality scores. Then, we design three baseline methods for full-reference
omnidirectional audio-visual quality assessment (OAVQA), which combine existing
state-of-the-art single-mode audio and video QA models via multimodal fusion
strategies. We validate the effectiveness of the A/V multimodal fusion method
for OAVQA on our dataset, which provides a new benchmark for omnidirectional
QoE evaluation. Our dataset is available at https://github.com/iamazxl/OAVQA.
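For a concrete picture of what such multimodal fusion can look like, here is a minimal sketch (not the paper's actual baselines) that combines hypothetical per-modality full-reference scores with three common simple strategies; the function name, the weight, and the score values are illustrative assumptions.

```python
# Minimal sketch of audio-visual score fusion (illustrative only, not the
# paper's baselines). Per-modality scores are stand-ins for outputs of
# existing full-reference audio and video quality models, normalized to [0, 1].

def fuse_av_quality(video_score: float, audio_score: float,
                    strategy: str = "weighted", w: float = 0.7) -> float:
    """Fuse per-modality quality scores into one audio-visual estimate."""
    if strategy == "weighted":   # linear combination; w weights the video term
        return w * video_score + (1.0 - w) * audio_score
    if strategy == "product":    # multiplicative fusion
        return video_score * audio_score
    if strategy == "min":        # worst modality dominates perceived quality
        return min(video_score, audio_score)
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical single-mode scores for one distorted A/V sequence.
v, a = 0.82, 0.55
for s in ("weighted", "product", "min"):
    print(s, round(fuse_av_quality(v, a, s), 3))
```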
Related papers
- Perceptual Depth Quality Assessment of Stereoscopic Omnidirectional Images [10.382801621282228]
We develop an objective quality assessment model named depth quality index (DQI) for efficient no-reference (NR) depth quality assessment of stereoscopic omnidirectional images.
Motivated by the perceptual characteristics of the human visual system (HVS), the proposed DQI is built upon multi-color-channel, adaptive viewport selection, and interocular discrepancy features.
arXiv Detail & Related papers (2024-08-19T16:28:05Z)
- How Does Audio Influence Visual Attention in Omnidirectional Videos? Database and Model [50.15552768350462]
This paper comprehensively investigates audio-visual attention in omnidirectional videos (ODVs) from both subjective and objective perspectives.
To advance the research on audio-visual saliency prediction for ODVs, we establish a new benchmark based on the AVS-ODV database.
arXiv Detail & Related papers (2024-08-10T02:45:46Z)
- Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model [54.69882562863726]
We systematically investigate the AIGC-VQA problem from both subjective and objective quality assessment perspectives.
We evaluate the perceptual quality of AIGC videos from three dimensions: spatial quality, temporal quality, and text-to-video alignment.
We propose a Unify Generated Video Quality assessment (UGVQ) model to comprehensively and accurately evaluate the quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z)
- Audio-visual Saliency for Omnidirectional Videos [58.086575606742116]
We first establish the largest audio-visual saliency dataset for omnidirectional videos (AVS-ODV).
We analyze the visual attention behavior of the observers under various omnidirectional audio modalities and visual scenes based on the AVS-ODV dataset.
arXiv Detail & Related papers (2023-11-09T08:03:40Z)
- Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach [52.07084862209754]
We collect over two million opinions on 4,543 in-the-wild videos on 13 dimensions of quality-related factors.
Specifically, we ask the subjects to choose among a positive, a negative, and a neutral label for each dimension.
These explanation-level opinions allow us to measure the relationships between specific quality factors and abstract subjective quality ratings.
arXiv Detail & Related papers (2023-05-22T05:20:23Z)
- Audio-Visual Quality Assessment for User Generated Content: Database and Method [61.970768267688086]
Most existing VQA studies only focus on the visual distortions of videos, ignoring that the user's QoE also depends on the accompanying audio signals.
We construct the first AVQA database named the SJTU-UAV database, which includes 520 in-the-wild audio and video (A/V) sequences.
We also design a family of AVQA models, which fuse popular VQA methods and audio features via a support vector regressor (SVR); a minimal sketch of this fusion idea follows this list.
The experimental results show that with the help of audio signals, the VQA models can evaluate the quality more accurately.
arXiv Detail & Related papers (2023-03-04T11:49:42Z)
- A Comprehensive Survey on Video Saliency Detection with Auditory Information: the Audio-visual Consistency Perceptual is the Key! [25.436683033432086]
Video saliency detection (VSD) aims to quickly locate the most attractive objects, things, or patterns in a given video clip.
This paper provides an extensive review to bridge the gap between audio-visual fusion and saliency detection.
arXiv Detail & Related papers (2022-06-20T07:25:13Z)
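As a companion to the SVR-based fusion described in the SJTU-UAV entry above, the following is a minimal sketch of feature-level audio-visual fusion with scikit-learn's SVR. The feature dimensions, the random features, and the MOS labels are synthetic stand-ins, not data or features from the actual database.

```python
# Minimal sketch of SVR-based audio-visual fusion (synthetic stand-in data).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_seqs = 520                                   # size of the SJTU-UAV database
video_feats = rng.normal(size=(n_seqs, 64))    # stand-in VQA-method features
audio_feats = rng.normal(size=(n_seqs, 16))    # stand-in audio features
mos = rng.uniform(1.0, 5.0, size=n_seqs)       # stand-in subjective scores

# Feature-level fusion: concatenate modalities, then regress onto MOS.
X = np.hstack([video_feats, audio_feats])
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
model.fit(X[:400], mos[:400])                  # simple index-based split
pred = model.predict(X[400:])
print("predicted quality of first held-out sequence:", round(pred[0], 2))
```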