Perceptual Image Quality Assessment with Transformers
- URL: http://arxiv.org/abs/2104.14730v1
- Date: Fri, 30 Apr 2021 02:45:29 GMT
- Title: Perceptual Image Quality Assessment with Transformers
- Authors: Manri Cheon, Sung-Jun Yoon, Byungyeon Kang, Junwoo Lee
- Abstract summary: We propose an image quality transformer (IQT) that successfully applies a transformer architecture to a perceptual full-reference image quality assessment task.
We extract perceptual feature representations from each of the input images using a convolutional neural network backbone.
The proposed IQT was ranked first among 13 participants in the NTIRE 2021 perceptual image quality assessment challenge.
- Score: 4.005576542371173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an image quality transformer (IQT) that
successfully applies a transformer architecture to a perceptual full-reference
image quality assessment (IQA) task. Perceptual representations have become
increasingly important in image quality assessment. In this context, we extract
perceptual feature representations from each of the input images using a
convolutional neural network (CNN) backbone. The extracted feature maps are fed
into the transformer encoder and decoder to compare the reference and distorted
images. Following the approach of transformer-based vision models, we use an
extra learnable quality embedding and position embeddings. The output of the
transformer is passed to a prediction head to predict the final quality score.
The experimental results show that the proposed model achieves outstanding
performance on standard IQA datasets. On a large-scale IQA dataset containing
output images of generative models, our model also shows promising results. The
proposed IQT was ranked first among 13 participants in the NTIRE 2021 perceptual
image quality assessment challenge. We hope our work offers an opportunity to
further extend this approach to the perceptual IQA task.
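To make the described pipeline concrete, below is a minimal PyTorch sketch of an IQT-style model. It is not the authors' implementation: the ResNet-50 backbone, embedding width, layer counts, and the 224x224 input assumption behind the positional embedding are all illustrative choices.

```python
# Minimal sketch of an IQT-style full-reference IQA model (illustrative only).
import torch
import torch.nn as nn
import torchvision.models as models

class IQTSketch(nn.Module):
    def __init__(self, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        # Shared CNN backbone for reference and distorted images
        # (ResNet-50 here is an assumption, not necessarily the paper's backbone).
        resnet = models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # (B, 2048, H/32, W/32)
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)  # project to transformer width
        # Extra learnable quality embedding, analogous to a [CLS] token.
        self.quality_embed = nn.Parameter(torch.randn(1, 1, d_model))
        # Learnable position embedding; 1 + 7*7 tokens assumes 224x224 inputs.
        self.pos_embed = nn.Parameter(torch.randn(1, 1 + 49, d_model))
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        # Prediction head mapping the quality token to a scalar score.
        self.head = nn.Sequential(nn.LayerNorm(d_model), nn.Linear(d_model, 1))

    def _tokens(self, img):
        f = self.proj(self.backbone(img))            # (B, d, h, w)
        t = f.flatten(2).transpose(1, 2)             # (B, h*w, d)
        q = self.quality_embed.expand(t.size(0), -1, -1)
        return torch.cat([q, t], dim=1) + self.pos_embed

    def forward(self, ref, dist):
        # Encoder consumes reference tokens; decoder attends to distorted
        # tokens, so the model compares the two images via cross-attention.
        out = self.transformer(src=self._tokens(ref), tgt=self._tokens(dist))
        return self.head(out[:, 0]).squeeze(-1)      # score read from quality token
```

With 224x224 inputs, `IQTSketch()(ref_batch, dist_batch)` returns one predicted quality score per reference/distorted pair.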
Related papers
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
Blind image quality assessment (IQA) in the wild presents significant challenges.
Given the difficulty in collecting large-scale training data, leveraging limited data to develop a model with strong generalization remains an open problem.
Motivated by the robust image perception capabilities of pre-trained text-to-image (T2I) diffusion models, we propose DP-IQA, a novel IQA method based on diffusion priors.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
- Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA).
We utilize two backbone networks (i.e., Swin Transformer-B) to extract quality-aware features from the entire portrait image and the facial image cropped from it; a rough sketch of the dual-branch idea follows this entry.
We leverage LIQE, an image scene classification and quality assessment model, to capture quality-aware and scene-specific features as auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z)
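As a rough illustration of the dual-branch design above, and not the paper's implementation: the sketch below uses two small ResNet-18 branches instead of Swin Transformer-B, assumes the face crop is produced elsewhere, and omits the LIQE auxiliary features.

```python
# Illustrative dual-branch quality model: one branch for the full portrait,
# one for the face crop. Backbones and head sizes are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class DualBranchPIQA(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        def make_branch():
            m = models.resnet18(weights=None)
            m.fc = nn.Identity()          # expose 512-d pooled features
            return m
        self.full_branch = make_branch()  # entire portrait image
        self.face_branch = make_branch()  # facial crop (from a detector, not shown)
        self.regressor = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, full_img, face_crop):
        f_full = self.full_branch(full_img)
        f_face = self.face_branch(face_crop)
        # Concatenate the two quality-aware feature vectors and regress a score.
        return self.regressor(torch.cat([f_full, f_face], dim=1)).squeeze(-1)
```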
- Transformer-based No-Reference Image Quality Assessment via Supervised Contrastive Learning [36.695247860715874]
We propose SaTQA, a novel NR-IQA model based on Supervised Contrastive Learning (SCL) and Transformers.
We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels; a generic SCL loss is sketched after this entry.
To further extract distortion information from images, we propose a backbone network incorporating a Multi-Stream Block (MSB) that combines the inductive bias of CNNs with the long-range dependence modeling capability of Transformers.
Experimental results on seven standard IQA datasets show that SaTQA outperforms state-of-the-art methods on both synthetic and authentic datasets.
arXiv Detail & Related papers (2023-12-12T06:01:41Z)
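The SCL pretraining stage above can be approximated with a generic supervised contrastive (SupCon-style) loss over distortion labels. This is a standard formulation, not SaTQA's exact objective; treating `labels` as distortion type/level ids is an assumption.

```python
# Generic supervised contrastive loss over distortion-type labels,
# as a stand-in for the SCL pretraining stage described above.
import torch
import torch.nn.functional as F

def supcon_loss(embeddings, labels, temperature=0.1):
    """embeddings: (N, d) features; labels: (N,) distortion type/level ids."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                       # (N, N) cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))     # drop self-comparisons
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Positives share a distortion label (excluding the anchor itself).
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos.sum(1).clamp(min=1)
    # Average log-probability of positives per anchor; skip anchors without any.
    loss = -log_prob.masked_fill(~pos, 0.0).sum(1) / pos_counts
    return loss[pos.any(1)].mean()
```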
- Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token [19.67014524146261]
No-reference image quality assessment (NR-IQA) has gained increasing attention recently.
We propose a Transformer-based NR-IQA model using a predicted objective error map and a perceptual quality token.
Our proposed method outperforms the current state-of-the-art on both authentic and synthetic image databases.
arXiv Detail & Related papers (2023-05-16T11:17:54Z)
- Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing performance on quantitative metrics.
By comparing the transformer features of the recovered image and the target, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One option regards the features as vectors and computes the discrepancy between representations extracted from the recovered and target images in Euclidean space, as sketched after this entry.
arXiv Detail & Related papers (2023-03-24T14:14:25Z)
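The Euclidean-space comparison described above amounts to a perceptual loss on frozen ViT features. A minimal version follows; the timm model name and the use of `forward_features()` returning patch tokens (as in recent timm versions) are assumptions, not the paper's exact setup.

```python
# Perceptual loss on pretrained ViT features: "features as vectors,
# discrepancy in Euclidean space". Model choice is illustrative.
import torch
import torch.nn.functional as F
import timm

vit = timm.create_model('vit_base_patch16_224', pretrained=True).eval()
for p in vit.parameters():
    p.requires_grad_(False)   # frozen feature extractor

def vit_feature_loss(recovered, target):
    """Mean squared (Euclidean) discrepancy between ViT token features."""
    f_rec = vit.forward_features(recovered)   # (B, N, D) token features
    f_tgt = vit.forward_features(target)
    return F.mse_loss(f_rec, f_tgt)
```

In training, this term would be added to a standard reconstruction loss so gradients flow to the recovered image through the frozen ViT.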
- Visual Mechanisms Inspired Efficient Transformers for Image and Video Quality Assessment [5.584060970507507]
Perceptual mechanisms in the human visual system play a crucial role in forming quality perception.
This paper proposes a general framework for no-reference visual quality assessment using efficient windowed transformer architectures.
arXiv Detail & Related papers (2022-03-28T07:55:11Z)
- Towards End-to-End Image Compression and Analysis with Transformers [99.50111380056043]
We propose an end-to-end image compression and analysis model with Transformers, targeting cloud-based image classification applications.
We aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and facilitate image compression with the long-term information from the Transformer.
Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
arXiv Detail & Related papers (2021-12-17T03:28:14Z)
- Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
arXiv Detail & Related papers (2021-12-01T13:23:00Z)
- Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) with a contrastive pairwise objective to solve the auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance compared to state-of-the-art NR image quality models; a sketch of the downstream regression step follows this entry.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
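In this kind of setup, the contrastively pretrained encoder is frozen and a simple regularized linear regressor maps its features to quality scores. A scikit-learn sketch follows; `encoder`, `train_loader`, and `test_loader` are placeholders, and the Ridge penalty value is an assumption.

```python
# After contrastive pretraining, freeze the encoder and fit a regularized
# linear regressor from image features to subjective quality (MOS) scores.
import numpy as np
import torch
from sklearn.linear_model import Ridge

@torch.no_grad()
def extract_features(encoder, loader):
    feats, scores = [], []
    for imgs, mos in loader:                      # (B, 3, H, W), (B,)
        feats.append(encoder(imgs).cpu().numpy())  # frozen feature extractor
        scores.append(mos.numpy())
    return np.concatenate(feats), np.concatenate(scores)

X_train, y_train = extract_features(encoder, train_loader)
regressor = Ridge(alpha=1.0).fit(X_train, y_train)  # penalty value is illustrative

X_test, _ = extract_features(encoder, test_loader)
pred_quality = regressor.predict(X_test)
```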
- No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency [38.88541492121366]
The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations.
We propose a novel model that addresses the NR-IQA task with a hybrid approach, combining Convolutional Neural Networks (CNNs) with the self-attention mechanism of Transformers.
arXiv Detail & Related papers (2021-08-16T02:07:08Z)
- MUSIQ: Multi-scale Image Quality Transformer [22.908901641767688]
Current state-of-the-art IQA methods are based on convolutional neural networks (CNNs).
We design a multi-scale image quality Transformer (MUSIQ) to process native-resolution images with varying sizes and aspect ratios; the multi-scale patch extraction is sketched after this entry.
With a multi-scale image representation, our proposed method can capture image quality at different granularities.
arXiv Detail & Related papers (2021-08-12T23:36:22Z)
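The multi-scale representation above combines patches from the native-resolution image and resized variants into one token sequence. The sketch below covers only that extraction step; the patch size and scale choices are assumptions, and MUSIQ's hash-based spatial and scale embeddings are omitted.

```python
# Build a multi-scale token sequence from a native-resolution image, in the
# spirit of MUSIQ: patches from the full-size image plus resized variants.
import torch
import torch.nn.functional as F

def multiscale_patches(img, patch=32, scales=(1.0, 0.5, 0.25)):
    """img: (3, H, W) at native resolution -> (N_total, 3*patch*patch) tokens."""
    tokens = []
    for s in scales:
        h, w = int(img.shape[1] * s), int(img.shape[2] * s)
        x = F.interpolate(img[None], size=(h, w), mode='bilinear',
                          align_corners=False)[0]
        # Pad so height/width are divisible by the patch size.
        ph, pw = (-h) % patch, (-w) % patch
        x = F.pad(x, (0, pw, 0, ph))
        p = x.unfold(1, patch, patch).unfold(2, patch, patch)  # (3, nh, nw, p, p)
        tokens.append(p.permute(1, 2, 0, 3, 4).reshape(-1, 3 * patch * patch))
    # Variable-length sequence across scales; a transformer consumes it directly.
    return torch.cat(tokens, dim=0)
```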
This list is automatically generated from the titles and abstracts of the papers on this site.