Learning Transformer Features for Image Quality Assessment
- URL: http://arxiv.org/abs/2112.00485v1
- Date: Wed, 1 Dec 2021 13:23:00 GMT
- Title: Learning Transformer Features for Image Quality Assessment
- Authors: Chao Zeng and Sam Kwong
- Abstract summary: We propose a unified IQA framework that utilizes CNN backbone and transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
- Score: 53.51379676690971
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective image quality evaluation is a challenging task, which aims to
measure the quality of a given image automatically. According to the
availability of the reference images, there are Full-Reference and No-Reference
IQA tasks, respectively. Most deep learning approaches use regression from deep
features extracted by Convolutional Neural Networks. For the FR task, another
option is conducting a statistical comparison on deep features. For all these
methods, non-local information is usually neglected. In addition, the
relationship between FR and NR tasks is less explored. Motivated by the recent
success of transformers in modeling contextual information, we propose a
unified IQA framework that utilizes CNN backbone and transformer encoder to
extract features. The proposed framework is compatible with both FR and NR
modes and allows for a joint training scheme. Evaluation experiments on three
standard IQA datasets, i.e., LIVE, CSIQ and TID2013, and KONIQ-10K, show that
our proposed model can achieve state-of-the-art FR performance. In addition,
comparable NR performance is achieved in extensive experiments, and the results
show that the NR performance can be leveraged by the joint training scheme.
Related papers
- Diffusion Model Based Visual Compensation Guidance and Visual Difference
Analysis for No-Reference Image Quality Assessment [82.13830107682232]
We propose a novel class of state-of-the-art (SOTA) generative model, which exhibits the capability to model intricate relationships.
We devise a new diffusion restoration network that leverages the produced enhanced image and noise-containing images.
Two visual evaluation branches are designed to comprehensively analyze the obtained high-level feature information.
arXiv Detail & Related papers (2024-02-22T09:39:46Z) - A Lightweight Parallel Framework for Blind Image Quality Assessment [7.9562077122537875]
We propose a lightweight parallel framework (LPF) for blind image quality assessment (BIQA)
First, we extract the visual features using a pre-trained feature extraction network. Furthermore, we construct a simple yet effective feature embedding network (FEN) to transform the visual features.
We present two novel self-supervised subtasks, including a sample-level category prediction task and a batch-level quality comparison task.
arXiv Detail & Related papers (2024-02-19T10:56:58Z) - Transformer-based No-Reference Image Quality Assessment via Supervised
Contrastive Learning [36.695247860715874]
We propose a novel Contrastive Learning (SCL) and Transformer-based NR-IQA model SaTQA.
We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels.
To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB) by combining the CNN inductive bias and Transformer long-term dependence modeling capability.
Experimental results on seven standard IQA datasets show that SaTQA outperforms the state-of-the-art methods for both synthetic and authentic datasets
arXiv Detail & Related papers (2023-12-12T06:01:41Z) - You Only Train Once: A Unified Framework for Both Full-Reference and No-Reference Image Quality Assessment [45.62136459502005]
We propose a network to perform full reference (FR) and no reference (NR) IQA.
We first employ an encoder to extract multi-level features from input images.
A Hierarchical Attention (HA) module is proposed as a universal adapter for both FR and NR inputs.
A Semantic Distortion Aware (SDA) module is proposed to examine feature correlations between shallow and deep layers of the encoder.
arXiv Detail & Related papers (2023-10-14T11:03:04Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) images reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z) - No-Reference Image Quality Assessment via Feature Fusion and Multi-Task
Learning [29.19484863898778]
Blind or no-reference image quality assessment (NR-IQA) is a fundamental, unsolved, and yet challenging problem.
We propose a simple and yet effective general-purpose no-reference (NR) image quality assessment framework based on multi-task learning.
Our model employs distortion types as well as subjective human scores to predict image quality.
arXiv Detail & Related papers (2020-06-06T05:04:10Z) - MetaIQA: Deep Meta-learning for No-Reference Image Quality Assessment [73.55944459902041]
This paper presents a no-reference IQA metric based on deep meta-learning.
We first collect a number of NR-IQA tasks for different distortions.
Then meta-learning is adopted to learn the prior knowledge shared by diversified distortions.
Extensive experiments demonstrate that the proposed metric outperforms the state-of-the-arts by a large margin.
arXiv Detail & Related papers (2020-04-11T23:36:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.