Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
- URL: http://arxiv.org/abs/2405.19298v1
- Date: Wed, 29 May 2024 17:26:09 GMT
- Title: Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
- Authors: Hanwei Zhu, Haoning Wu, Yixuan Li, Zicheng Zhang, Baoliang Chen, Lingyu Zhu, Yuming Fang, Guangtao Zhai, Weisi Lin, Shiqi Wang
- Abstract summary: We introduce Compare2Score, an all-around LMM-based no-reference IQA model.
During training, we generate scaled-up comparative instructions by comparing images from the same IQA dataset.
Experiments on nine IQA datasets validate that Compare2Score effectively bridges text-defined comparative levels during training with converted single-image quality scores at inference.
- Score: 99.57567498494448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality ratings, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored. To address this gap, we introduce Compare2Score, an all-around LMM-based no-reference IQA (NR-IQA) model, which is capable of producing qualitatively comparative responses and effectively translating these discrete comparative levels into a continuous quality score. Specifically, during training, we propose generating scaled-up comparative instructions by comparing images from the same IQA dataset, allowing for more flexible integration of diverse IQA datasets. Utilizing the established large-scale training corpus, we develop a human-like visual quality comparator. During inference, moving beyond binary choices, we propose a soft comparison method that calculates the likelihood of the test image being preferred over multiple predefined anchor images. The quality score is further optimized by maximum a posteriori estimation with the resulting probability matrix. Extensive experiments on nine IQA datasets validate that Compare2Score effectively bridges the text-defined comparative levels used during training with the converted single-image quality scores used for inference, surpassing state-of-the-art IQA models across diverse scenarios. Moreover, we verify that the probability-matrix-based inference conversion improves not only the rating accuracy of Compare2Score but also that of zero-shot general-purpose LMMs, suggesting its intrinsic effectiveness.
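To make the inference-time conversion concrete, below is a minimal sketch of how a vector of soft-comparison probabilities against anchor images could be turned into a continuous score via MAP estimation. It assumes a Thurstone-style preference model with a Gaussian prior; the anchor scores, preference probabilities, and noise/prior parameters are all hypothetical, and this illustrates the general idea rather than the paper's exact implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Hypothetical quality scores (e.g., MOS values) of the predefined anchor images.
anchor_scores = np.array([20.0, 35.0, 50.0, 65.0, 80.0])

# Hypothetical soft-comparison outputs: p[i] is the model's probability that
# the test image is preferred over anchor i.
p = np.array([0.95, 0.85, 0.60, 0.30, 0.10])

def neg_log_posterior(q, sigma=10.0, prior_mu=50.0, prior_sigma=25.0):
    """Negative log-posterior of latent quality q under a Thurstone-style model.

    Assumes P(test preferred over anchor i) = Phi((q - q_i) / sigma) and a
    Gaussian prior on q; all parameter values here are illustrative choices.
    """
    probs = np.clip(norm.cdf((q - anchor_scores) / sigma), 1e-6, 1 - 1e-6)
    # Bernoulli cross-entropy between the observed soft preferences and the model.
    log_lik = np.sum(p * np.log(probs) + (1 - p) * np.log(1 - probs))
    log_prior = norm.logpdf(q, prior_mu, prior_sigma)
    return -(log_lik + log_prior)

res = minimize_scalar(neg_log_posterior, bounds=(0.0, 100.0), method="bounded")
print(f"MAP quality estimate: {res.x:.2f}")  # continuous score on the anchor scale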
Related papers
- Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA).
Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
arXiv Detail & Related papers (2024-05-29T06:09:34Z)
- Pairwise Comparisons Are All You Need [22.798716660911833]
Blind image quality assessment (BIQA) approaches often fall short in real-world scenarios due to their reliance on a generic quality standard applied uniformly across diverse images.
This paper introduces PICNIQ, a pairwise comparison framework designed to bypass the limitations of conventional BIQA.
By employing psychometric scaling algorithms, PICNIQ transforms pairwise comparisons into just-objectionable-difference (JOD) quality scores, offering a granular and interpretable measure of image quality (a toy scaling sketch follows this entry).
arXiv Detail & Related papers (2024-03-13T23:43:36Z)
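As a toy illustration of psychometric scaling (not PICNIQ's actual algorithm), the sketch below recovers relative quality scores from a pairwise comparison count matrix by maximum likelihood under a Thurstone Case V model; the count matrix is made up.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Hypothetical count matrix: C[i, j] = number of trials in which image i
# was preferred over image j (diagonal unused).
C = np.array([
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
])

def neg_log_likelihood(q):
    """Thurstone Case V: P(i preferred over j) = Phi(q_i - q_j)."""
    diff = q[:, None] - q[None, :]
    probs = np.clip(norm.cdf(diff), 1e-6, 1 - 1e-6)
    off_diag = ~np.eye(len(q), dtype=bool)
    return -np.sum(C[off_diag] * np.log(probs[off_diag]))

n = C.shape[0]
res = minimize(neg_log_likelihood, x0=np.zeros(n), method="L-BFGS-B")
scores = res.x - res.x[0]  # fix the gauge: scores are relative, image 0 anchored at 0
print("relative quality scores:", np.round(scores, 2))
```

JOD scaling additionally fixes the units so that a difference of 1 JOD corresponds to a fixed preference probability (commonly 75%); this sketch leaves the scale in raw Thurstone units.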
- Comparison of No-Reference Image Quality Models via MAP Estimation in Diffusion Latents [99.19391983670569]
We show that NR-IQA models can be plugged into the maximum a posteriori (MAP) estimation framework for image enhancement.
Different NR-IQA models are likely to induce different enhanced images, which are ultimately subject to psychophysical testing.
This leads to a new computational method for comparing NR-IQA models within the analysis-by-synthesis framework (a schematic MAP objective is sketched below).
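One way to read "MAP estimation in diffusion latents" is the following schematic objective, a plausible reconstruction rather than the paper's exact formulation:

```latex
\hat{\mathbf{z}} \;=\; \arg\max_{\mathbf{z}}
  \Big[\, \lambda \, Q\big(D(\mathbf{z})\big) \;+\; \log p(\mathbf{z}) \,\Big]
```

Here D(·) is the diffusion decoder mapping latents to images, Q(·) is the NR-IQA model under comparison (acting as a likelihood surrogate), p(z) is the latent prior, and λ trades off the two terms. Each choice of Q yields a different optimum and hence a different enhanced image D(ẑ), and those images are then ranked by psychophysical testing.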
arXiv Detail & Related papers (2024-03-11T03:35:41Z)
- Towards Open-ended Visual Quality Comparison [87.45004129101089]
We extend the capabilities of emerging large multi-modality models (LMMs) to advance visual quality comparison into open-ended settings.
Co-Instruct is a first-of-its-kind open-source open-ended visual quality comparer.
We demonstrate that Co-Instruct achieves on average 30% higher accuracy than state-of-the-art open-source LMMs.
arXiv Detail & Related papers (2024-02-26T15:10:56Z)
- Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models [28.194638379354252]
We introduce a Depicted image Quality Assessment method (DepictQA), overcoming the constraints of traditional score-based methods.
DepictQA allows for detailed, language-based, human-like evaluation of image quality by leveraging Multi-modal Large Language Models.
These results showcase the research potential of multi-modal IQA methods.
arXiv Detail & Related papers (2023-12-14T14:10:02Z)
- Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z)
- Content-Diverse Comparisons improve IQA [23.523537785599913]
Image quality assessment (IQA) forms a natural and often straightforward undertaking for humans, yet effective automation of the task remains challenging.
Recent metrics from the deep learning community commonly compare image pairs during training to improve upon traditional metrics such as PSNR or SSIM, but these pairs are typically restricted to distorted versions of the same underlying content.
This restricts the diversity and number of image pairs that the model is exposed to during training.
In this paper, we strive to enrich these comparisons with content diversity. Firstly, we relax the comparison constraints and compare pairs of images with differing content, which increases the variety of available comparisons (a toy sketch of content-diverse pair sampling follows this entry).
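A minimal sketch of the content-diverse pairing idea (hypothetical data and loss, not the paper's exact setup): instead of restricting training pairs to distorted versions of the same reference, sample pairs across different contents and apply a ranking loss on the predicted scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predicted quality scores for a batch of 8 images, the source
# content each image was derived from, and ground-truth mean opinion scores.
pred = rng.normal(size=8)                     # model outputs q_hat(x_i)
content = np.array([0, 0, 1, 1, 2, 2, 3, 3])  # source content of each image
mos = rng.uniform(1.0, 5.0, size=8)           # subjective quality labels

def margin_ranking_loss(pred, mos, pairs, margin=0.1):
    """Hinge loss encouraging the predicted ordering to match the MOS ordering."""
    i, j = pairs[:, 0], pairs[:, 1]
    sign = np.sign(mos[i] - mos[j])
    return np.mean(np.maximum(0.0, margin - sign * (pred[i] - pred[j])))

# Conventional same-content pairs vs. relaxed pairs over all contents.
idx = np.arange(len(pred))
all_pairs = np.array([(a, b) for a in idx for b in idx if a < b])
same = all_pairs[content[all_pairs[:, 0]] == content[all_pairs[:, 1]]]

print("same-content pairs:", len(same), "| all pairs:", len(all_pairs))
print("loss on same-content pairs:", margin_ranking_loss(pred, mos, same))
print("loss on content-diverse pairs:", margin_ranking_loss(pred, mos, all_pairs))
```

In this toy batch, relaxing the same-content constraint grows the usable pool from 4 pairs to 28, mirroring the paper's point that cross-content comparisons greatly increase training diversity.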
arXiv Detail & Related papers (2022-11-09T21:53:13Z)
- Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both full-reference (FR) and no-reference (NR) modes and allows for a joint training scheme.
arXiv Detail & Related papers (2021-12-01T13:23:00Z)
- Comparison of Image Quality Models for Optimization of Image Processing Systems [41.57409136781606]
We use eleven full-reference IQA models to train deep neural networks for four low-level vision tasks.
Subjective testing on the optimized images allows us to rank the competing models in terms of their perceptual performance.
arXiv Detail & Related papers (2020-05-04T09:26:40Z)