Related papers: Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

URL: http://arxiv.org/abs/2409.05381v2
Date: Sun, 21 Sep 2025 04:48:18 GMT
Title: Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
Authors: Xudong Li, Zihao Huang, Yan Zhang, Yunhang Shen, Ke Li, Xiawu Zheng, Liujuan Cao, Rongrong Ji
Abstract summary: The Gradient-Regulated Meta-Prompt IQA Framework (GRMP-IQA) is designed to efficiently adapt the visual-language pre-trained model, CLIP, to IQA tasks. GRMP-IQA consists of two core modules: (i) Meta-Prompt Pre-training Module and (ii) Quality-Aware Gradient Regularization.
Score: 93.91086467402323
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image Quality Assessment (IQA) remains an unresolved challenge in computer vision due to complex distortions, diverse image content, and limited data availability. Existing Blind IQA (BIQA) methods largely rely on extensive human annotations, which are labor-intensive and costly due to the demanding nature of creating IQA datasets. To reduce this dependency, we propose the Gradient-Regulated Meta-Prompt IQA Framework (GRMP-IQA), designed to efficiently adapt the visual-language pre-trained model, CLIP, to IQA tasks, achieving high accuracy even with limited data. GRMP-IQA consists of two core modules: (i) Meta-Prompt Pre-training Module and (ii) Quality-Aware Gradient Regularization. The Meta-Prompt Pre-training Module leverages a meta-learning paradigm to pre-train soft prompts with shared meta-knowledge across different distortions, enabling rapid adaptation to various IQA tasks. The Quality-Aware Gradient Regularization, in turn, adjusts the update gradients during fine-tuning, focusing the model's attention on quality-relevant features and preventing overfitting to semantic information. Extensive experiments on standard BIQA datasets demonstrate superior performance over state-of-the-art BIQA methods in the limited-data setting. Notably, using just 20% of the training data, GRMP-IQA is competitive with most existing fully supervised BIQA approaches.

Related papers

Zoom-IQA: Image Quality Assessment with Reliable Region-Aware Reasoning [32.30800226412995]
We introduce Zoom-IQA, a VLM-based IQA model to explicitly emulate key cognitive behaviors. We show that Zoom-IQA achieves improved robustness, explainability, and generalization. The application to downstream tasks, such as image restoration, further demonstrates the effectiveness of Zoom-IQA.
arXiv Detail & Related papers (2026-01-06T11:00:17Z)
AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment [69.06977852423564]
Image quality assessment (IQA) reflects both the quantification and interpretation of perceptual quality rooted in the human visual system. AgenticIQA decomposes IQA into four subtasks -- distortion detection, distortion analysis, tool selection, and tool execution. To support training and evaluation, we introduce AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents.
arXiv Detail & Related papers (2025-09-30T09:37:01Z)
Visual-Language Model Knowledge Distillation Method for Image Quality Assessment [0.9821874476902972]
Multimodal methods based on vision-language models, such as CLIP, have demonstrated exceptional generalization capabilities in IQA tasks. This study proposes a visual-language model knowledge distillation method aimed at guiding the training of models with architectural advantages using CLIP's IQA knowledge.
arXiv Detail & Related papers (2025-07-21T14:44:46Z)
TRIQA: Image Quality Assessment by Contrastive Pretraining on Ordered Distortion Triplets [31.2422359004089]
No-Reference (NR) IQA remains particularly challenging due to the absence of a reference image. We propose a novel approach that constructs a custom dataset using a limited number of reference content images. We train a quality-aware model using contrastive triplet-based learning, enabling efficient training with fewer samples.
arXiv Detail & Related papers (2025-07-16T23:43:12Z)
MetaQAP -- A Meta-Learning Approach for Quality-Aware Pretraining in Image Quality Assessment [2.578159662141357]
Image Quality Assessment (IQA) is a critical task in a wide range of applications but remains challenging due to the subjective nature of human perception and the complexity of real-world image distortions. This study proposes MetaQAP, a novel no-reference IQA model designed to address these challenges by leveraging quality-aware pre-training and meta-learning. The proposed MetaQAP model achieved exceptional performance with Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC) scores of 0.9885/0.9812 on LiveCD, 0.9702/0.9658 on Kon
arXiv Detail & Related papers (2025-06-19T21:03:47Z)
DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
Blind image quality assessment (IQA) in the wild presents significant challenges. Given the difficulty in collecting large-scale training data, leveraging limited data to develop a model with strong generalization remains an open problem. Motivated by the robust image perception capabilities of pre-trained text-to-image (T2I) diffusion models, we propose a novel IQA method, diffusion priors-based IQA.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
Feature Denoising Diffusion Model for Blind Image Quality Assessment [58.5808754919597]
Blind Image Quality Assessment (BIQA) aims to evaluate image quality in line with human perception, without reference benchmarks. Deep learning BIQA methods typically depend on using features from high-level tasks for transfer learning. In this paper, we take an initial step towards exploring the diffusion model for feature denoising in BIQA.
arXiv Detail & Related papers (2024-01-22T13:38:24Z)
Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment [25.5501280406614]
Video quality assessment (VQA) has attracted growing attention in recent years. The great expense of annotating large-scale VQA datasets has become the main obstacle for current deep-learning methods. An Adaptive Diverse Quality-aware feature Acquisition (Ada-DQA) framework is proposed to capture desired quality-related features.
arXiv Detail & Related papers (2023-08-01T16:04:42Z)
Data-Efficient Image Quality Assessment with Attention-Panel Decoder [19.987556370430806]
Blind Image Quality Assessment (BIQA) is a fundamental task in computer vision, which remains unresolved due to the complex distortion conditions and diversified image contents. We propose a novel BIQA pipeline based on the Transformer architecture, which achieves an efficient quality-aware feature representation with much fewer data.
arXiv Detail & Related papers (2023-04-11T03:52:17Z)
Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA). The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability. We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score. The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
Continual Learning for Blind Image Quality Assessment [80.55119990128419]
Blind image quality assessment (BIQA) models fail to continually adapt to subpopulation shift. Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets. We formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets.
arXiv Detail & Related papers (2021-02-19T03:07:01Z)
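
The MetaQAP entry above reports PLCC and SROCC, the two correlation measures most IQA papers use to compare predicted scores with human ratings. As a rough illustration only, the following Python sketch computes both from two score lists using the standard library. The function names and sample values are hypothetical and not taken from any listed paper, and some papers fit a nonlinear mapping to the predictions before computing PLCC, which this sketch omits.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted quality scores and human mean opinion scores.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
human = [1.0, 4.0, 9.0, 16.0, 25.0]
plcc = pearson(predicted, human)
srocc = spearman(predicted, human)
```

In this toy data the human scores rise monotonically but not linearly with the predictions, so SROCC reaches 1 while PLCC stays below 1, which is why papers usually report the two together.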

This list is automatically generated from the titles and abstracts of the papers on this site.