Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
- URL: http://arxiv.org/abs/2303.14968v1
- Date: Mon, 27 Mar 2023 07:58:09 GMT
- Title: Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
- Authors: Weixia Zhang and Guangtao Zhai and Ying Wei and Xiaokang Yang and Kede Ma
- Abstract summary: Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
- Score: 93.56647950778357
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim at advancing blind image quality assessment (BIQA), which predicts the
human perception of image quality without any reference information. We develop
a general and automated multitask learning scheme for BIQA to exploit auxiliary
knowledge from other tasks, such that model parameter sharing and loss
weighting are determined automatically. Specifically, we first describe
all candidate label combinations (from multiple tasks) using a textual
template, and compute the joint probability from the cosine similarities of the
visual-textual embeddings. Predictions of each task can be inferred from the
joint distribution, and optimized by carefully designed loss functions. Through
comprehensive experiments on learning three tasks - BIQA, scene classification,
and distortion type identification, we verify that the proposed BIQA method 1)
benefits from the scene classification and distortion type identification tasks
and outperforms the state-of-the-art on multiple IQA datasets, 2) is more
robust in the group maximum differentiation competition, and 3) realigns the
quality annotations from different IQA datasets more effectively. The source
code is available at https://github.com/zwx8981/LIQE.
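The abstract's core inference step can be sketched in a few lines: encode the image and one textual template per candidate label combination, turn the cosine similarities into a joint distribution, and marginalize to obtain per-task predictions. This is a minimal illustration with random stand-in embeddings and hypothetical dimensions, not the LIQE implementation (which uses CLIP-style encoders; see the linked repository).

```python
import numpy as np

# Hypothetical task sizes: scenes, distortion types, and discrete quality
# levels; `dim` is a stand-in embedding width.
rng = np.random.default_rng(0)
n_scenes, n_distortions, n_levels, dim = 3, 4, 5, 16

# Stand-ins for encoder outputs: one visual embedding for the input image
# and one text embedding per (scene, distortion, quality) combination,
# each combination described by a textual template in the real method.
img_emb = rng.normal(size=dim)
txt_embs = rng.normal(size=(n_scenes, n_distortions, n_levels, dim))

def normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

img_emb = normalize(img_emb)
txt_embs = normalize(txt_embs)

# Cosine similarities over all label combinations, converted to a joint
# distribution via a temperature-scaled softmax (0.07 is a CLIP-style
# temperature, chosen here for illustration).
logits = (txt_embs @ img_emb) / 0.07
flat = logits.reshape(-1)
joint = np.exp(flat - flat.max())
joint = (joint / joint.sum()).reshape(n_scenes, n_distortions, n_levels)

# Predictions for each task are marginals of the joint distribution.
p_scene = joint.sum(axis=(1, 2))    # scene classification
p_dist = joint.sum(axis=(0, 2))     # distortion type identification
p_quality = joint.sum(axis=(0, 1))  # quality levels

# A scalar quality score as the expectation over discrete levels
# (e.g. 1 = "bad" ... 5 = "perfect").
score = float((p_quality * np.arange(1, n_levels + 1)).sum())
```

Marginalizing the joint distribution, rather than training three independent heads, is what lets the quality prediction borrow evidence from the scene and distortion axes.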
Related papers
- Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild [66.40314964321557]
We propose a novel IQA method named RichIQA to explore the rich subjective rating information beyond MOS to predict image quality in the wild.
RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network which exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of the human brain.
RichIQA outperforms state-of-the-art competitors on multiple large-scale in-the-wild IQA databases with rich subjective rating labels.
arXiv Detail & Related papers (2024-09-09T12:00:17Z)
- Vision-Language Consistency Guided Multi-modal Prompt Learning for Blind AI Generated Image Quality Assessment [57.07360640784803]
We propose vision-language consistency guided multi-modal prompt learning for blind AI-generated image quality assessment (AGIQA).
Specifically, we introduce learnable textual and visual prompts in language and vision branches of Contrastive Language-Image Pre-training (CLIP) models.
We design a text-to-image alignment quality prediction task, whose learned vision-language consistency knowledge is used to guide the optimization of the above multi-modal prompts.
arXiv Detail & Related papers (2024-06-24T13:45:31Z)
- UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment [23.48816491333345]
Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate human subjective perception of image visual quality and aesthetic appeal.
Existing methods typically address these tasks independently due to distinct learning objectives.
We propose Unified vision-language pre-training of Quality and Aesthetics (UniQA) to learn general perceptions of two tasks, thereby benefiting them simultaneously.
arXiv Detail & Related papers (2024-06-03T07:40:10Z)
- Descriptive Image Quality Assessment in the Wild [25.503311093471076]
VLM-based Image Quality Assessment (IQA) seeks to describe image quality linguistically to align with human expression.
We introduce Depicted image Quality Assessment in the Wild (DepictQA-Wild).
Our method includes a multi-functional IQA task paradigm that encompasses both assessment and comparison tasks, brief and detailed responses, and full-reference and non-reference scenarios.
arXiv Detail & Related papers (2024-05-29T07:49:15Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address the mismatch between generic image-text pre-training and IQA using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment [7.291687946822539]
A major drawback of state-of-the-art NR-IQA techniques is their reliance on a large number of human annotations.
We enable the learning of low-level quality features related to distortion types by introducing a novel quality-aware contrastive loss.
We design zero-shot quality predictions from both pathways in a completely blind setting.
arXiv Detail & Related papers (2023-12-08T05:24:21Z)
- Adaptable image quality assessment using meta-reinforcement learning of task amenability [2.499394199589254]
Modern deep learning algorithms rely on subjective (human-based) image quality assessment (IQA).
To predict task amenability, an IQA agent is trained using reinforcement learning (RL) with a simultaneously optimised task predictor.
In this work, we develop transfer learning or adaptation strategies to increase the adaptability of both the IQA agent and the task predictor.
arXiv Detail & Related papers (2021-07-31T11:29:37Z)
- Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
- Continual Learning for Blind Image Quality Assessment [80.55119990128419]
Blind image quality assessment (BIQA) models fail to continually adapt to subpopulation shift.
Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets.
We formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets.
arXiv Detail & Related papers (2021-02-19T03:07:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.