Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
- URL: http://arxiv.org/abs/2303.14968v1
- Date: Mon, 27 Mar 2023 07:58:09 GMT
- Title: Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
- Authors: Weixia Zhang and Guangtao Zhai and Ying Wei and Xiaokang Yang and Kede Ma
- Abstract summary: Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
- Score: 93.56647950778357
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim at advancing blind image quality assessment (BIQA), which predicts the
human perception of image quality without any reference information. We develop
a general and automated multitask learning scheme for BIQA to exploit auxiliary
knowledge from other tasks, such that model parameter sharing and loss
weighting are determined automatically. Specifically, we first describe
all candidate label combinations (from multiple tasks) using a textual
template, and compute the joint probability from the cosine similarities of the
visual-textual embeddings. Predictions of each task can be inferred from the
joint distribution, and optimized by carefully designed loss functions. Through
comprehensive experiments on learning three tasks - BIQA, scene classification,
and distortion type identification, we verify that the proposed BIQA method 1)
benefits from the scene classification and distortion type identification tasks
and outperforms the state-of-the-art on multiple IQA datasets, 2) is more
robust in the group maximum differentiation competition, and 3) realigns the
quality annotations from different IQA datasets more effectively. The source
code is available at https://github.com/zwx8981/LIQE.
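The abstract's core inference step can be sketched in a few lines: encode the image and one textual template per candidate label combination, turn the cosine similarities into a joint distribution, and marginalize to obtain per-task predictions. This is a minimal illustration with random stand-in embeddings and hypothetical dimensions, not the LIQE implementation (which uses CLIP-style encoders; see the linked repository).

```python
import numpy as np

# Hypothetical task sizes: scenes, distortion types, and discrete quality
# levels; `dim` is a stand-in embedding width.
rng = np.random.default_rng(0)
n_scenes, n_distortions, n_levels, dim = 3, 4, 5, 16

# Stand-ins for encoder outputs: one visual embedding for the input image
# and one text embedding per (scene, distortion, quality) combination,
# each combination described by a textual template in the real method.
img_emb = rng.normal(size=dim)
txt_embs = rng.normal(size=(n_scenes, n_distortions, n_levels, dim))

def normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

img_emb = normalize(img_emb)
txt_embs = normalize(txt_embs)

# Cosine similarities over all label combinations, converted to a joint
# distribution via a temperature-scaled softmax (0.07 is a CLIP-style
# temperature, chosen here for illustration).
logits = (txt_embs @ img_emb) / 0.07
flat = logits.reshape(-1)
joint = np.exp(flat - flat.max())
joint = (joint / joint.sum()).reshape(n_scenes, n_distortions, n_levels)

# Predictions for each task are marginals of the joint distribution.
p_scene = joint.sum(axis=(1, 2))    # scene classification
p_dist = joint.sum(axis=(0, 2))     # distortion type identification
p_quality = joint.sum(axis=(0, 1))  # quality levels

# A scalar quality score as the expectation over discrete levels
# (e.g. 1 = "bad" ... 5 = "perfect").
score = float((p_quality * np.arange(1, n_levels + 1)).sum())
```

Marginalizing the joint distribution, rather than training three independent heads, is what lets the quality prediction borrow evidence from the scene and distortion axes.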
Related papers
- Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild [66.40314964321557]
We propose a novel IQA method named RichIQA to explore the rich subjective rating information beyond MOS to predict image quality in the wild.
RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network which exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of the human brain.
RichIQA outperforms state-of-the-art competitors on multiple large-scale in-the-wild IQA databases with rich subjective rating labels.
arXiv Detail & Related papers (2024-09-09T12:00:17Z)
- Vision-Language Consistency Guided Multi-modal Prompt Learning for Blind AI Generated Image Quality Assessment [57.07360640784803]
We propose vision-language consistency guided multi-modal prompt learning for blind AI-generated image quality assessment (AGIQA).
Specifically, we introduce learnable textual and visual prompts in language and vision branches of Contrastive Language-Image Pre-training (CLIP) models.
We design a text-to-image alignment quality prediction task, whose learned vision-language consistency knowledge is used to guide the optimization of the above multi-modal prompts.
arXiv Detail & Related papers (2024-06-24T13:45:31Z)
- UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment [23.48816491333345]
Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate human subjective perception of image visual quality and aesthetic appeal.
Existing methods typically address these tasks independently due to distinct learning objectives.
We propose Unified vision-language pre-training of Quality and Aesthetics (UniQA) to learn general perceptions of two tasks, thereby benefiting them simultaneously.
arXiv Detail & Related papers (2024-06-03T07:40:10Z)
- Descriptive Image Quality Assessment in the Wild [25.503311093471076]
VLM-based Image Quality Assessment (IQA) seeks to describe image quality linguistically to align with human expression.
We introduce Depicted image Quality Assessment in the Wild (DepictQA-Wild).
Our method includes a multi-functional IQA task paradigm that encompasses both assessment and comparison tasks, brief and detailed responses, and full-reference and non-reference scenarios.
arXiv Detail & Related papers (2024-05-29T07:49:15Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address the mismatch between generic image-text pre-training and IQA using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment [7.291687946822539]
A major drawback of state-of-the-art NR-IQA techniques is their reliance on a large number of human annotations.
We enable the learning of low-level quality features related to distortion types by introducing a novel quality-aware contrastive loss.
We design zero-shot quality predictions from both pathways in a completely blind setting.
arXiv Detail & Related papers (2023-12-08T05:24:21Z)
- Adaptable image quality assessment using meta-reinforcement learning of task amenability [2.499394199589254]
Modern deep learning algorithms rely on subjective (human-based) image quality assessment (IQA).
To predict task amenability, an IQA agent is trained using reinforcement learning (RL) with a simultaneously optimised task predictor.
In this work, we develop transfer learning or adaptation strategies to increase the adaptability of both the IQA agent and the task predictor.
arXiv Detail & Related papers (2021-07-31T11:29:37Z)
- Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
- Continual Learning for Blind Image Quality Assessment [80.55119990128419]
Blind image quality assessment (BIQA) models fail to continually adapt to subpopulation shift.
Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets.
We formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets.
arXiv Detail & Related papers (2021-02-19T03:07:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.