Mitigating Perception Bias: A Training-Free Approach to Enhance LMM for Image Quality Assessment
- URL: http://arxiv.org/abs/2411.12791v1
- Date: Tue, 19 Nov 2024 15:00:59 GMT
- Title: Mitigating Perception Bias: A Training-Free Approach to Enhance LMM for Image Quality Assessment
- Authors: Siyi Pan, Baoliang Chen, Danni Huang, Hanwei Zhu, Lingyu Zhu, Xiangjie Sui, Shiqi Wang
- Abstract summary: We propose a training-free debiasing framework for image quality assessment.
We first explore several semantic-preserving distortions that can significantly degrade image quality.
We then apply these specific distortions to the query or test images.
During quality inference, both a query image and its corresponding degraded version are fed to the LMM.
All degraded images are consistently rated as poor quality, regardless of their semantic difference.
- Score: 18.622560025505233
- License:
- Abstract: Despite the impressive performance of large multimodal models (LMMs) in high-level visual tasks, their capacity for image quality assessment (IQA) remains limited. One main reason is that LMMs are primarily trained for high-level tasks (e.g., image captioning), emphasizing unified image semantics extraction under varied quality. Such semantic-aware yet quality-insensitive perception bias inevitably leads to a heavy reliance on image semantics when those LMMs are forced to rate quality. In this paper, instead of costly retraining or tuning of an LMM, we propose a training-free debiasing framework in which the image quality prediction is rectified by mitigating the bias caused by image semantics. Specifically, we first explore several semantic-preserving distortions that can significantly degrade image quality while maintaining identifiable semantics. By applying these specific distortions to the query or test images, we ensure that the degraded images are recognized as poor quality while their semantics remain. During quality inference, both a query image and its corresponding degraded version are fed to the LMM along with a prompt indicating that the query image quality should be inferred under the condition that the degraded one is deemed poor quality. This prior condition effectively aligns the LMM's quality perception, as all degraded images are consistently rated as poor quality, regardless of their semantic difference. Finally, the quality scores of the query image inferred under different prior conditions (degraded versions) are aggregated using a conditional probability model. Extensive experiments on various IQA datasets show that our debiasing framework consistently enhances LMM performance; the code will be publicly available.
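The inference loop the abstract describes can be sketched in a few lines. Everything below is a hypothetical stand-in: `mock_lmm_score`, `apply_distortion`, the distortion names, and the plain averaging at the end are illustrative placeholders, not the paper's actual model or aggregation (the paper uses a conditional probability model and a real LMM).

```python
# Minimal sketch of the training-free debiasing pipeline from the abstract.
# All names here (apply_distortion, mock_lmm_score, the image dict fields)
# are hypothetical; a real system would call an actual multimodal model with
# the image pair and the conditioning prompt.

import random

def apply_distortion(image, kind):
    """Hypothetical semantic-preserving distortion (e.g. heavy blur or noise).
    Here we only tag the image so the mock scorer can treat it as degraded."""
    return {"pixels": image["pixels"], "degraded_by": kind}

def mock_lmm_score(query, anchor, prompt):
    """Stand-in for the LMM's conditional quality score in [0, 1].
    This mock ignores the anchor image and prompt; a real LMM would
    condition its rating on both."""
    base = query["true_quality"]  # mock ground truth, for illustration only
    bias = 0.1 if query.get("degraded_by") is None else -0.2
    return max(0.0, min(1.0, base + bias * random.uniform(0.0, 0.5)))

def debiased_score(image, distortions):
    """Score the query image under each degraded anchor and aggregate.
    The paper aggregates with a conditional probability model; a plain
    average is used here as a simplification."""
    scores = []
    for kind in distortions:
        anchor = apply_distortion(image, kind)
        prompt = (f"Rate the quality of the first image, given that the "
                  f"second ({kind}) is poor quality.")
        scores.append(mock_lmm_score(image, anchor, prompt))
    return sum(scores) / len(scores)
```

A usage example: `debiased_score({"pixels": [...], "true_quality": 0.8}, ["blur", "noise"])` queries the scorer once per degraded anchor and averages the conditional scores, mirroring the "infer under different prior conditions, then aggregate" structure of the method.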
Related papers
- Dual-Representation Interaction Driven Image Quality Assessment with Restoration Assistance [11.983231834400698]
No-Reference Image Quality Assessment for distorted images has always been a challenging problem due to image content variance and distortion diversity.
Previous IQA models mostly encode explicit single-quality features of synthetic images to obtain quality-aware representations for quality score prediction.
We introduce the DRI method to obtain degradation vectors and quality vectors of images, which separately model the degradation and quality information of low-quality images.
arXiv Detail & Related papers (2024-11-26T12:48:47Z)
- Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA)
We utilize two backbone networks (i.e., Swin Transformer-B) to extract quality-aware features from the entire portrait image and from the facial image cropped from it.
We leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z) - Reference-Free Image Quality Metric for Degradation and Reconstruction Artifacts [2.5282283486446753]
We develop a reference-free quality evaluation network, dubbed "Quality Factor (QF) Predictor"
Our QF Predictor is a lightweight, fully convolutional network comprising seven layers.
It receives a JPEG-compressed image patch with a random QF as input and is trained to accurately predict the corresponding QF.
arXiv Detail & Related papers (2024-05-01T22:28:18Z) - VisualCritic: Making LMMs Perceive Visual Quality Like Humans [65.59779450136399]
We present VisualCritic, the first LMM for broad-spectrum image subjective quality assessment.
VisualCritic can be used across diverse data right out of the box, without any dataset-specific adaptation.
arXiv Detail & Related papers (2024-03-19T15:07:08Z) - Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity [55.399230250413986]
We propose a Quality-Aware Feature Matching IQA Metric (QFM-IQM) to remove harmful semantic noise features from the upstream task.
Our approach achieves superior performance to the state-of-the-art NR-IQA methods on eight standard IQA datasets.
arXiv Detail & Related papers (2023-12-11T06:50:27Z) - Blind Multimodal Quality Assessment: A Brief Survey and A Case Study of
Low-light Images [73.27643795557778]
Blind image quality assessment (BIQA) aims at automatically and accurately forecasting objective scores for visual signals.
Recent developments in this field are dominated by unimodal solutions inconsistent with human subjective rating patterns.
We present a unique blind multimodal quality assessment (BMQA) of low-light images from subjective evaluation to objective score.
arXiv Detail & Related papers (2023-03-18T09:04:55Z) - UNO-QA: An Unsupervised Anomaly-Aware Framework with Test-Time
Clustering for OCTA Image Quality Assessment [4.901218498977952]
We propose an unsupervised anomaly-aware framework with test-time clustering for optical coherence tomography angiography (OCTA) image quality assessment.
A feature-embedding-based low-quality representation module is proposed to quantify the quality of OCTA images.
We perform dimension reduction and clustering of multi-scale image features extracted by the trained OCTA quality representation network.
arXiv Detail & Related papers (2022-12-20T18:48:04Z) - Feedback is Needed for Retakes: An Explainable Poor Image Notification
Framework for the Visually Impaired [6.0158981171030685]
Our framework first determines the quality of images and then generates captions using only those images that are determined to be of high quality.
If image quality is low, the user is notified of the detected flaws and prompted to retake the photo, and this cycle is repeated until the input image is deemed to be of high quality.
arXiv Detail & Related papers (2022-11-17T09:22:28Z) - Learning Conditional Knowledge Distillation for Degraded-Reference Image
Quality Assessment [157.1292674649519]
We propose a practical solution named degraded-reference IQA (DR-IQA).
DR-IQA exploits the inputs of IR models, degraded images, as references.
Our results can even be close to the performance of full-reference settings.
arXiv Detail & Related papers (2021-08-18T02:35:08Z) - Perceptual Image Restoration with High-Quality Priori and Degradation
Learning [28.93489249639681]
We show that our model performs well in measuring the similarity between restored and degraded images.
Our simultaneous restoration and enhancement framework generalizes well to real-world complicated degradation types.
arXiv Detail & Related papers (2021-03-04T13:19:50Z) - Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and
Wild [98.48284827503409]
We develop a unified BIQA model and an approach to training it for both synthetic and realistic distortions.
We employ the fidelity loss to optimize a deep neural network for BIQA over a large number of such image pairs.
Experiments on six IQA databases show the promise of the learned method in blindly assessing image quality in the laboratory and wild.
arXiv Detail & Related papers (2020-05-28T13:35:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.