Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model
- URL: http://arxiv.org/abs/2502.15271v1
- Date: Fri, 21 Feb 2025 07:54:00 GMT
- Title: Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model
- Authors: Jiebin Yan, Ziwen Tan, Yuming Fang, Junjie Chen, Wenhui Jiang, Zhou Wang
- Abstract summary: We conduct the largest study so far on omnidirectional image quality assessment (OIQA) using a large-scale database called OIQ-10K. A comprehensive psychophysical study is conducted to collect human opinions for each omnidirectional image. We propose a novel adaptive feature-tailoring OIQA model named IQCaption360, which is capable of generating a quality caption for an omnidirectional image.
- Score: 35.232181599179306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fast-growing application of omnidirectional images calls for effective approaches for omnidirectional image quality assessment (OIQA). Existing OIQA methods have been developed and tested on homogeneously distorted omnidirectional images, but it is hard to transfer their success directly to heterogeneously distorted omnidirectional images. In this paper, we conduct the largest study so far on OIQA, where we establish a large-scale database called OIQ-10K containing 10,000 omnidirectional images with both homogeneous and heterogeneous distortions. A comprehensive psychophysical study is conducted to collect human opinions for each omnidirectional image, together with the spatial distributions (within local regions or globally) of distortions, and the head and eye movements of the subjects. Furthermore, we propose a novel multitask-derived adaptive feature-tailoring OIQA model named IQCaption360, which is capable of generating a quality caption for an omnidirectional image in the form of a textual template. Extensive experiments demonstrate the effectiveness of IQCaption360, which outperforms state-of-the-art methods by a significant margin on the proposed OIQ-10K database. The OIQ-10K database and the related source code are available at https://github.com/WenJuing/IQCaption360.
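The abstract describes IQCaption360 as producing a quality caption through a textual template. As an illustration only (the function name, score bins, and wording below are hypothetical, not taken from the paper), a template-based caption step might look like:

```python
# Hypothetical sketch of template-based quality captioning.
# The thresholds, labels, and wording are illustrative assumptions,
# not the mapping used by IQCaption360.

def quality_caption(score: float, distortion_extent: str) -> str:
    """Map a MOS-like score in [1, 5] and a distortion extent
    ('local' or 'global') to a templated caption."""
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must lie in [1, 5]")
    if distortion_extent not in ("local", "global"):
        raise ValueError("distortion_extent must be 'local' or 'global'")
    # Coarse score-to-adjective mapping (assumed bins).
    if score >= 4.0:
        level = "excellent"
    elif score >= 3.0:
        level = "good"
    elif score >= 2.0:
        level = "fair"
    else:
        level = "poor"
    return (f"The omnidirectional image has {level} quality "
            f"(score {score:.2f}) with {distortion_extent} distortions.")

print(quality_caption(3.7, "local"))
# The omnidirectional image has good quality (score 3.70) with local distortions.
```

In the actual model, the fields filled into the template would come from learned predictors rather than fixed thresholds; the sketch only shows how a score and a distortion-extent label can be rendered as text.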
Related papers
- Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption [21.811319260270732]
Full-reference image quality assessment (FR-IQA) generally assumes that reference images are of perfect quality.
Recent generative enhancement methods are capable of producing images of higher quality than their originals.
We present a generalized FR-IQA model, namely Adaptive Fidelity-Naturalness Evaluator (A-FINE), to accurately assess and adaptively combine the fidelity and naturalness of a test image.
arXiv Detail & Related papers (2025-03-14T09:12:03Z)
- Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention [30.688264840230755]
We propose a novel and effective blind omnidirectional image quality assessment model with multi-axis attention (Max360IQ).
Max360IQ can proficiently measure not only the quality of uniformly distorted omnidirectional images but also the quality of non-uniformly distorted omnidirectional images.
arXiv Detail & Related papers (2025-02-26T11:01:03Z)
- Subjective and Objective Quality Assessment of Non-Uniformly Distorted Omnidirectional Images [33.10692798685548]
We propose a perception-guided OIQA model for non-uniform distortion by adaptively simulating users' viewing behavior. Experimental results demonstrate that the proposed model outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-01-20T14:39:50Z)
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
Blind image quality assessment (IQA) in the wild presents significant challenges.
Given the difficulty in collecting large-scale training data, leveraging limited data to develop a model with strong generalization remains an open problem.
Motivated by the robust image perception capabilities of pre-trained text-to-image (T2I) diffusion models, we propose a novel IQA method, diffusion priors-based IQA.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
- Cross-IQA: Unsupervised Learning for Image Quality Assessment [3.2287957986061038]
We propose a no-reference image quality assessment (NR-IQA) method termed Cross-IQA based on the vision transformer (ViT) model.
The proposed Cross-IQA method can learn image quality features from unlabeled image data.
Experimental results show that Cross-IQA can achieve state-of-the-art performance in assessing the low-frequency degradation information.
arXiv Detail & Related papers (2024-04-26T05:51:57Z)
- Image Quality Assessment With Compressed Sampling [5.76395285614395]
We propose two networks for NR-IQA with Compressive Sampling (dubbed CL-IQA and CS-IQA).
They consist of four components, including: (1) the Compressed Sampling Module (CSM) to sample the image; (2) the Adaptive Embedding Module (AEM) to extract high-level features.
Experiments show that our proposed methods outperform other methods on various datasets with less data usage.
arXiv Detail & Related papers (2024-04-01T10:08:23Z)
- AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI-generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z)
- When No-Reference Image Quality Models Meet MAP Estimation in Diffusion Latents [92.45867913876691]
No-reference image quality assessment (NR-IQA) models can effectively quantify perceived image quality.
We show that NR-IQA models can be plugged into the maximum a posteriori (MAP) estimation framework for image enhancement.
arXiv Detail & Related papers (2024-03-11T03:35:41Z)
- Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z)
- Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop [113.75573175709573]
We make one of the first attempts to examine the perceptual robustness of NR-IQA models.
We test one knowledge-driven and three data-driven NR-IQA methods under four full-reference IQA models.
We find that all four NR-IQA models are vulnerable to the proposed perceptual attack.
arXiv Detail & Related papers (2022-10-03T13:47:16Z)
- Perceptual Quality Assessment of Omnidirectional Images [81.76416696753947]
We first establish an omnidirectional IQA (OIQA) database, which includes 16 source images and 320 distorted images degraded by 4 commonly encountered distortion types.
Then a subjective quality evaluation study is conducted on the OIQA database in the VR environment.
The original and distorted omnidirectional images, subjective quality ratings, and the head and eye movement data together constitute the OIQA database.
arXiv Detail & Related papers (2022-07-06T13:40:38Z)
- MUSIQ: Multi-scale Image Quality Transformer [22.908901641767688]
Current state-of-the-art IQA methods are based on convolutional neural networks (CNNs).
We design a multi-scale image quality Transformer (MUSIQ) to process native resolution images with varying sizes and aspect ratios.
With a multi-scale image representation, our proposed method can capture image quality at different granularities.
arXiv Detail & Related papers (2021-08-12T23:36:22Z)
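The MUSIQ entry above mentions capturing quality at different granularities through a multi-scale image representation. As a minimal, hypothetical sketch (not MUSIQ itself, whose multi-scale tokenization and Transformer are more involved), a pyramid of progressively downsampled views can be built with 2x2 average pooling:

```python
# Illustrative sketch: a simple multi-scale pyramid via 2x2 average pooling,
# so that features can later be extracted at several granularities.
# Function name and interface are assumptions for illustration only.
import numpy as np

def multi_scale_pyramid(image: np.ndarray, num_scales: int = 3) -> list:
    """Return [full, half, quarter, ...] resolution views of an (H, W, C)
    image. Assumes H and W are divisible by 2**(num_scales - 1)."""
    pyramid = [image]
    for _ in range(num_scales - 1):
        h, w = pyramid[-1].shape[:2]
        # Group pixels into 2x2 blocks and average each block.
        blocks = pyramid[-1].reshape(h // 2, 2, w // 2, 2, -1)
        pooled = blocks.mean(axis=(1, 3)).reshape(h // 2, w // 2, *image.shape[2:])
        pyramid.append(pooled)
    return pyramid

img = np.random.rand(64, 64, 3)
scales = multi_scale_pyramid(img)
print([s.shape for s in scales])  # [(64, 64, 3), (32, 32, 3), (16, 16, 3)]
```

MUSIQ additionally handles native resolutions and arbitrary aspect ratios via patch tokens and hash-based positional encodings; the sketch only illustrates the multi-scale idea itself.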
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.