2AFC Prompting of Large Multimodal Models for Image Quality Assessment
- URL: http://arxiv.org/abs/2402.01162v1
- Date: Fri, 2 Feb 2024 06:05:18 GMT
- Title: 2AFC Prompting of Large Multimodal Models for Image Quality Assessment
- Authors: Hanwei Zhu, Xiangjie Sui, Baoliang Chen, Xuelin Liu, Peilin Chen,
Yuming Fang, and Shiqi Wang
- Abstract summary: Two-alternative forced choice (2AFC) prompting is widely regarded as the most reliable way of collecting human opinions of visual quality.
The global quality score of each image, as judged by a particular LMM, can then be efficiently aggregated from pairwise choices using maximum a posteriori estimation.
- Score: 38.86162365208038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While abundant research has been conducted on improving high-level visual
understanding and reasoning capabilities of large multimodal models~(LMMs),
their image quality assessment~(IQA) ability has been relatively
under-explored. Here we take initial steps towards this goal by employing the
two-alternative forced choice~(2AFC) prompting, as 2AFC is widely regarded as
the most reliable way of collecting human opinions of visual quality.
Subsequently, the global quality score of each image estimated by a particular
LMM can be efficiently aggregated using maximum a posteriori estimation.
Meanwhile, we introduce three evaluation criteria: consistency, accuracy, and
correlation, to provide comprehensive quantifications and deeper insights into
the IQA capability of five LMMs. Extensive experiments show that existing LMMs
exhibit remarkable IQA ability on coarse-grained quality comparison, but there
is room for improvement on fine-grained quality discrimination. The proposed
dataset sheds light on the future development of IQA models based on LMMs. The
codes will be made publicly available at https://github.com/h4nwei/2AFC-LMMs.
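The abstract's aggregation step, turning many pairwise 2AFC preferences into one global quality score per image via maximum a posteriori estimation, can be sketched under a Bradley-Terry-style preference model. The function below is a minimal illustration, not the authors' released code; the Gaussian prior, the learning rate, and the toy win-count matrix are all assumptions:

```python
import numpy as np

def map_quality_scores(wins, prior_var=4.0, lr=0.1, iters=2000):
    """Estimate global quality scores from pairwise 2AFC counts.

    wins[i, j] = number of times image i was preferred over image j.
    Uses a Bradley-Terry likelihood, P(i beats j) = sigmoid(s_i - s_j),
    with a Gaussian prior on the latent scores s, and finds the maximum
    a posteriori estimate by gradient ascent on the log-posterior.
    """
    n = wins.shape[0]
    s = np.zeros(n)  # latent quality scores, one per image
    for _ in range(iters):
        # pairwise win probabilities under the current scores
        p = 1.0 / (1.0 + np.exp(-(s[:, None] - s[None, :])))
        # gradient of the log-likelihood plus the Gaussian log-prior
        grad = (wins - (wins + wins.T) * p).sum(axis=1) - s / prior_var
        s += lr * grad
        s -= s.mean()  # scores are shift-invariant; fix the gauge
    return s

# toy example: 3 images, image 0 most often preferred
wins = np.array([[0, 8, 9],
                 [2, 0, 6],
                 [1, 4, 0]], dtype=float)
scores = map_quality_scores(wins)
```

The mean-centering step reflects that only score differences are identified by pairwise comparisons; the prior keeps the estimate finite even when one image wins every comparison.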
Related papers
- Q-Ground: Image Quality Grounding with Large Multi-modality Models [61.72022069880346]
We introduce Q-Ground, the first framework aimed at tackling fine-scale visual quality grounding.
Q-Ground combines large multi-modality models with detailed visual quality analysis.
Central to our contribution is the introduction of the QGround-100K dataset.
arXiv Detail & Related papers (2024-07-24T06:42:46Z)
- Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare [99.57567498494448]
We introduce Compare2Score, an all-around LMM-based no-reference IQA model.
During training, we generate scaled-up comparative instructions by comparing images from the same IQA dataset.
Experiments on nine IQA datasets validate that Compare2Score effectively bridges text-defined comparative levels during training.
arXiv Detail & Related papers (2024-05-29T17:26:09Z)
- LMM-PCQA: Assisting Point Cloud Quality Assessment with LMM [83.98966702271576]
This study aims to investigate the feasibility of imparting Point Cloud Quality Assessment (PCQA) knowledge to large multi-modality models (LMMs).
We transform quality labels into textual descriptions during the fine-tuning phase, enabling LMMs to derive quality rating logits from 2D projections of point clouds.
Our experimental results affirm the effectiveness of our approach, showcasing a novel integration of LMMs into PCQA.
arXiv Detail & Related papers (2024-04-28T14:47:09Z)
- Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision [85.6008224440157]
Multi-modality Large Language Models (MLLMs) have catalyzed a shift in computer vision from specialized models to general-purpose foundation models.
We present Q-Bench, a holistic benchmark crafted to evaluate potential abilities of MLLMs on three realms: low-level visual perception, low-level visual description, and overall visual quality assessment.
arXiv Detail & Related papers (2023-09-25T14:43:43Z)
- MMBench: Is Your Multi-modal Model an All-around Player? [114.45702807380415]
We propose MMBench, a benchmark for assessing the multi-modal capabilities of vision-language models.
MMBench is meticulously curated with well-designed quality control schemes.
MMBench incorporates multiple-choice questions in both English and Chinese versions.
arXiv Detail & Related papers (2023-07-12T16:23:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.