Perceptual Quality Assessment of 360$^\circ$ Images Based on Generative
Scanpath Representation
- URL: http://arxiv.org/abs/2309.03472v1
- Date: Thu, 7 Sep 2023 04:10:30 GMT
- Title: Perceptual Quality Assessment of 360$^\circ$ Images Based on Generative
Scanpath Representation
- Authors: Xiangjie Sui, Hanwei Zhu, Xuelin Liu, Yuming Fang, Shiqi Wang, Zhou
Wang
- Abstract summary: We introduce a unique generative scanpath representation (GSR) for effective quality inference of 360$^\circ$ images.
GSR aggregates varied perceptual experiences of multi-hypothesis users under a predefined viewing condition.
We then propose an efficient OIQA computational framework by learning the quality maps of GSR.
- Score: 40.00063797833765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite substantial efforts dedicated to the design of heuristic models for
omnidirectional (i.e., 360$^\circ$) image quality assessment (OIQA), a
conspicuous gap remains due to the lack of consideration for the diversity of
viewing behaviors that leads to the varying perceptual quality of 360$^\circ$
images. Two critical aspects underline this oversight: the neglect of viewing
conditions that significantly sway user gaze patterns and the overreliance on a
single viewport sequence from the 360$^\circ$ image for quality inference. To
address these issues, we introduce a unique generative scanpath representation
(GSR) for effective quality inference of 360$^\circ$ images, which aggregates
varied perceptual experiences of multi-hypothesis users under a predefined
viewing condition. More specifically, given a viewing condition characterized
by the starting point of viewing and exploration time, a set of scanpaths
consisting of dynamic visual fixations can be produced using an apt scanpath
generator. Following this vein, we use the scanpaths to convert the 360$^\circ$
image into the unique GSR, which provides a global overview of gaze-focused
contents derived from scanpaths. As such, the quality inference of the
360$^\circ$ image is swiftly transformed to that of GSR. We then propose an
efficient OIQA computational framework by learning the quality maps of GSR.
Comprehensive experimental results validate that the predictions of the
proposed framework are highly consistent with human perception in the
spatiotemporal domain, especially in the challenging context of locally
distorted 360$^\circ$ images under varied viewing conditions. The code will be
released at https://github.com/xiangjieSui/GSR
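For a concrete picture of the pipeline the abstract describes, here is a minimal sketch: sample scanpaths under a viewing condition, gather gaze-focused viewports into a GSR-like frame sequence, and regress quality from it. This is not the authors' implementation (that lives in the linked repository): `generate_scanpaths` and `quality_regressor` are hypothetical placeholders, and the viewport crop is a crude equirectangular window rather than a proper gnomonic projection.

```python
import numpy as np

def viewport_crop(img, lat, lon, fov=90.0):
    """Crop a window around a fixation (lat, lon in degrees) from an
    equirectangular image. Crude illustration only: a faithful viewport
    would use a rectilinear (gnomonic) projection."""
    h, w = img.shape[:2]
    cy = int((90.0 - lat) / 180.0 * h)          # latitude  -> row
    cx = int((lon + 180.0) / 360.0 * w)         # longitude -> column
    half = int(w * fov / 360.0) // 2
    rows = np.clip(np.arange(cy - half, cy + half), 0, h - 1)
    cols = np.arange(cx - half, cx + half) % w  # wrap around the azimuth
    return img[np.ix_(rows, cols)]

def build_gsr(img, scanpaths):
    """Aggregate the gaze-focused viewports of several hypothetical
    users into one frame sequence standing in for the GSR."""
    frames = [viewport_crop(img, lat, lon)
              for path in scanpaths            # one path per user
              for (lat, lon, _t) in path]      # dynamic visual fixations
    return np.stack(frames)

# Usage sketch (all names hypothetical):
# scanpaths = generate_scanpaths(img, start=(0, 0), seconds=15, users=8)
# gsr = build_gsr(img, scanpaths)      # quality inference of the 360 image
# score = quality_regressor(gsr)       # is transformed to that of the GSR
```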
Related papers
- 360VOT: A New Benchmark Dataset for Omnidirectional Visual Object
Tracking [10.87309734945868]
360$^\circ$ images provide an omnidirectional field of view, which is important for stable and long-term scene perception.
In this paper, we explore 360$^\circ$ images for visual object tracking and identify new challenges caused by large distortion.
We propose a new large-scale omnidirectional tracking benchmark dataset, 360VOT, in order to facilitate future research.
arXiv Detail & Related papers (2023-07-27T05:32:01Z) - Assessor360: Multi-sequence Network for Blind Omnidirectional Image
Quality Assessment [50.82681686110528]
Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs).
Quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipeline lacks modeling of the observer's browsing process.
We propose a novel multi-sequence network for BOIQA called Assessor360, which is derived from the realistic multi-assessor ODI quality assessment procedure.
arXiv Detail & Related papers (2023-05-18T13:55:28Z) - ST360IQ: No-Reference Omnidirectional Image Quality Assessment with
Spherical Vision Transformers [17.48330099000856]
We present a method for no-reference 360$^\circ$ image quality assessment.
Our approach predicts a quality score for an omnidirectional image that correlates with human-perceived image quality.
arXiv Detail & Related papers (2023-03-13T07:48:46Z) - Blind Omnidirectional Image Quality Assessment: Integrating Local
Statistics and Global Semantics [14.586878663223832]
We propose a blind/no-reference OIQA method named S$^2$ that bridges the gap between low-level statistics and high-level semantics of omnidirectional images.
A quality regression with a weighting process then maps the extracted quality-aware features to a perceptual quality prediction.
arXiv Detail & Related papers (2023-02-24T01:47:13Z) - Panoramic Vision Transformer for Saliency Detection in 360{\deg} Videos [48.54829780502176]
We present a new framework named Panoramic Vision Transformer (PAVER).
We design the encoder using Vision Transformer with deformable convolution, which enables us to plug pretrained models from normal videos into our architecture without additional modules or finetuning.
We demonstrate the utility of our saliency prediction model with the omnidirectional video quality assessment task in VQA-ODV, where we consistently improve performance without any form of supervision.
arXiv Detail & Related papers (2022-09-19T12:23:34Z) - ScanGAN360: A Generative Model of Realistic Scanpaths for 360$^{\circ}$
Images [92.8211658773467]
We present ScanGAN360, a new generative adversarial approach to generating scanpaths for 360$^\circ$ images.
We accomplish this by leveraging a spherical adaptation of dynamic time warping as a loss function (a minimal sketch of such a spherical DTW distance follows this list).
The quality of our scanpaths outperforms competing approaches by a large margin and is almost on par with the human baseline.
arXiv Detail & Related papers (2021-03-25T15:34:18Z) - Perceptual Quality Assessment of Omnidirectional Images as Moving Camera
Videos [49.217528156417906]
Two types of VR viewing conditions are crucial in determining the viewing behaviors of users and the perceived quality of the panorama.
We first transform an omnidirectional image to several video representations using different user viewing behaviors under different viewing conditions.
We then leverage advanced 2D full-reference video quality models to compute the perceived quality.
arXiv Detail & Related papers (2020-05-21T10:03:40Z) - Visual Question Answering on 360{\deg} Images [96.00046925811515]
VQA 360$^\circ$ is a novel task of visual question answering on 360$^\circ$ images.
We collect the first VQA 360 dataset, containing around 17,000 real-world image-question-answer triplets for a variety of question types.
arXiv Detail & Related papers (2020-01-10T08:18:21Z)