Enhancing image quality prediction with self-supervised visual masking
- URL: http://arxiv.org/abs/2305.19858v2
- Date: Wed, 17 Jan 2024 16:41:01 GMT
- Title: Enhancing image quality prediction with self-supervised visual masking
- Authors: Uğur Çoğalan, Mojtaba Bemana, Hans-Peter Seidel, Karol Myszkowski
- Abstract summary: Full-reference image quality metrics (FR-IQMs) aim to measure the visual differences between a pair of reference and distorted images.
We propose to predict a visual masking model that modulates reference and distorted images in a way that penalizes the visual errors based on their visibility.
Our approach results in enhanced FR-IQM metrics that align more closely with human judgments, both visually and quantitatively.
- Score: 20.190853812320395
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Full-reference image quality metrics (FR-IQMs) aim to measure the visual
differences between a pair of reference and distorted images, with the goal of
accurately predicting human judgments. However, existing FR-IQMs, including
traditional ones like PSNR and SSIM and even perceptual ones such as HDR-VDP,
LPIPS, and DISTS, still fall short in capturing the complexities and nuances of
human perception. In this work, rather than devising a novel IQM model, we seek
to improve upon the perceptual quality of existing FR-IQM methods. We achieve
this by considering visual masking, an important characteristic of the human
visual system that changes its sensitivity to distortions as a function of
local image content. Specifically, for a given FR-IQM metric, we propose to
predict a visual masking model that modulates reference and distorted images in
a way that penalizes the visual errors based on their visibility. Since the
ground truth visual masks are difficult to obtain, we demonstrate how they can
be derived in a self-supervised manner solely based on mean opinion scores
(MOS) collected from an FR-IQM dataset. Our approach results in enhanced FR-IQM
metrics that align more closely with human judgments, both visually and
quantitatively.
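
To make the pipeline described above concrete, here is a minimal, hypothetical PyTorch-style sketch: a small network predicts a visual mask from the reference/distorted pair, the mask modulates both images, and an existing differentiable FR-IQM (e.g., LPIPS) scores the modulated pair; the mask network is trained using only MOS from an FR-IQM dataset. The class and function names, the architecture, and the loss below are assumptions for illustration, not the authors' specification.

```python
import torch
import torch.nn as nn

class MaskNet(nn.Module):
    """Hypothetical per-pixel visual-masking predictor (architecture is assumed)."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, reference, distorted):
        # Predict a visibility mask from the concatenated reference/distorted pair.
        return self.net(torch.cat([reference, distorted], dim=1))

def masked_quality(reference, distorted, mask_net, base_iqm):
    """Modulate both images with the predicted mask, then apply an existing FR-IQM."""
    mask = mask_net(reference, distorted)
    return base_iqm(reference * mask, distorted * mask)

def train_step(batch, mask_net, base_iqm, optimizer, loss_fn=nn.MSELoss()):
    # Self-supervised in the sense of the abstract: no ground-truth masks,
    # only mean opinion scores (MOS) supervise the mask network.
    reference, distorted, mos = batch
    score = masked_quality(reference, distorted, mask_net, base_iqm)
    loss = loss_fn(score, mos)  # align the metric's output with human judgments
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the base metric must be differentiable, and its raw output typically needs a fitted mapping onto the MOS scale before a regression loss is meaningful; those details are omitted in this sketch.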
Related papers
- ExIQA: Explainable Image Quality Assessment Using Distortion Attributes [0.3683202928838613]
We propose an explainable approach for distortion identification based on attribute learning.
We generate a dataset consisting of 100,000 images for efficient training.
Our approach achieves state-of-the-art (SOTA) performance across multiple datasets in both PLCC and SRCC metrics.
arXiv Detail & Related papers (2024-09-10T20:28:14Z)
- Sliced Maximal Information Coefficient: A Training-Free Approach for Image Quality Assessment Enhancement [12.628718661568048]
We aim to explore a generalized human visual attention estimation strategy to mimic the process of human quality rating.
In particular, we model human attention generation by measuring the statistical dependency between the degraded image and the reference image.
Experimental results verify the performance of existing IQA models can be consistently improved when our attention module is incorporated.
arXiv Detail & Related papers (2024-08-19T11:55:32Z)
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
Blind image quality assessment (IQA) in the wild presents significant challenges.
Given the difficulty in collecting large-scale training data, leveraging limited data to develop a model with strong generalization remains an open problem.
Motivated by the robust image perception capabilities of pre-trained text-to-image (T2I) diffusion models, we propose a novel IQA method, diffusion priors-based IQA.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
- Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA).
Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
arXiv Detail & Related papers (2024-05-29T06:09:34Z)
- Reference-Free Image Quality Metric for Degradation and Reconstruction Artifacts [2.5282283486446753]
We develop a reference-free quality evaluation network, dubbed "Quality Factor (QF) Predictor"
Our QF Predictor is a lightweight, fully convolutional network comprising seven layers.
It receives a JPEG-compressed image patch with a random QF as input and is trained to accurately predict the corresponding QF (a minimal, hypothetical sketch is given after this list).
arXiv Detail & Related papers (2024-05-01T22:28:18Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop [113.75573175709573]
We make one of the first attempts to examine the perceptual robustness of NR-IQA models.
We test one knowledge-driven and three data-driven NR-IQA methods under four full-reference IQA models.
We find that all four NR-IQA models are vulnerable to the proposed perceptual attack.
arXiv Detail & Related papers (2022-10-03T13:47:16Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
- Conformer and Blind Noisy Students for Improved Image Quality Assessment [80.57006406834466]
Learning-based approaches for perceptual image quality assessment (IQA) usually require both the distorted and reference image for measuring the perceptual quality accurately.
In this work, we explore the performance of transformer-based full-reference IQA models.
We also propose a method for IQA based on semi-supervised knowledge distillation from full-reference teacher models into blind student models.
arXiv Detail & Related papers (2022-04-27T10:21:08Z)
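
For the reference-free "Quality Factor (QF) Predictor" entry above, the sketch below shows what a seven-layer fully convolutional QF regressor could look like; the layer widths, kernel sizes, strides, and training loss are assumptions for illustration, not that paper's actual configuration.

```python
import torch
import torch.nn as nn

class QFPredictor(nn.Module):
    """Hypothetical 7-layer fully convolutional QF regressor (widths/strides assumed)."""
    def __init__(self):
        super().__init__()
        widths = [3, 16, 32, 32, 64, 64, 128]
        layers = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):  # six strided 3x3 conv layers
            layers += [nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(128, 1, 1)]  # seventh layer: 1x1 regression head
        self.net = nn.Sequential(*layers)

    def forward(self, patch):
        # Fully convolutional: average per-location predictions into one QF estimate.
        return self.net(patch).mean(dim=(2, 3)).squeeze(1)

# Training target: the (known) random quality factor used to JPEG-compress each patch.
model = QFPredictor()
patches = torch.rand(8, 3, 64, 64)        # stand-in for compressed patches
qf = torch.randint(10, 96, (8,)).float()  # random QFs in [10, 95]
loss = nn.functional.mse_loss(model(patches), qf)
```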
This list is automatically generated from the titles and abstracts of the papers on this site.