Blind Image Quality Assessment via Transformer Predicted Error Map and
Perceptual Quality Token
- URL: http://arxiv.org/abs/2305.09353v1
- Date: Tue, 16 May 2023 11:17:54 GMT
- Title: Blind Image Quality Assessment via Transformer Predicted Error Map and
Perceptual Quality Token
- Authors: Jinsong Shi, Pan Gao, Aljosa Smolic
- Abstract summary: No-reference image quality assessment (NR-IQA) has gained increasing attention recently.
We propose a Transformer based NR-IQA model using a predicted objective error map and perceptual quality token.
Our proposed method outperforms the current state-of-the-art in both authentic and synthetic image databases.
- Score: 19.67014524146261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image quality assessment is a fundamental problem in the field of image
processing, and due to the lack of reference images in most practical
scenarios, no-reference image quality assessment (NR-IQA) has gained
increasing attention recently. With the development of deep learning
technology, many deep neural network-based NR-IQA methods have been developed,
which try to learn image quality from an understanding of the database
information. Recently, the Transformer has achieved remarkable progress in various
vision tasks. Because the characteristics of its attention mechanism match the
global perceptual impact of artifacts as perceived by humans, the Transformer is
well suited to image quality assessment tasks. In this
paper, we propose a Transformer based NR-IQA model using a predicted objective
error map and perceptual quality token. Specifically, we first generate the
predicted error map by pre-training a model consisting of a Transformer
encoder and decoder, in which the objective difference between the distorted
and the reference images is used as supervision. Then, we freeze the parameters
of the pre-trained model and design another branch using the vision Transformer
to extract the perceptual quality token for feature fusion with the predicted
error map. Finally, the fused features are regressed to the final image quality
score. Extensive experiments show that our proposed method outperforms the
current state of the art on both authentic and synthetic image databases.
Moreover, the attention map extracted via the perceptual quality token conforms
to the characteristics of the human visual system.
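
To make the two-branch design described in the abstract concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the module names, layer sizes, the collapsed linear "decoder" in the error-map branch, and the pooling-and-concatenation fusion are all illustrative assumptions inferred from the abstract alone.

```python
# Hypothetical sketch of the abstract's two-branch NR-IQA idea (not the paper's code).
# Assumptions: patch size 16, embedding dim 256, positional embeddings omitted,
# and the error-map decoder collapsed into a per-patch linear head for brevity.
import torch
import torch.nn as nn


class ErrorMapPredictor(nn.Module):
    """Branch pre-trained to predict an objective error map; frozen afterwards."""

    def __init__(self, dim=256, patch=16):
        super().__init__()
        self.patch = patch
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patchify
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.decode = nn.Linear(dim, patch * patch)  # per-patch error values

    def forward(self, x):
        b, _, h, w = x.shape                                  # h, w divisible by patch
        tokens = self.embed(x).flatten(2).transpose(1, 2)     # (B, N, dim)
        tokens = self.encoder(tokens)
        patches = self.decode(tokens)                         # (B, N, patch*patch)
        gh, gw = h // self.patch, w // self.patch
        err = patches.view(b, gh, gw, self.patch, self.patch)
        err = err.permute(0, 1, 3, 2, 4).reshape(b, 1, h, w)  # one-channel error map
        return err


class QualityTokenBranch(nn.Module):
    """ViT-style branch with a learnable perceptual quality token prepended."""

    def __init__(self, dim=256, patch=16):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.quality_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2)     # (B, N, dim)
        q = self.quality_token.expand(tokens.size(0), -1, -1)
        tokens = self.encoder(torch.cat([q, tokens], dim=1))
        return tokens[:, 0]                                   # the quality token


class TwoBranchNRIQA(nn.Module):
    """Fuses the predicted error map with the quality token and regresses a score."""

    def __init__(self, dim=256):
        super().__init__()
        self.error_branch = ErrorMapPredictor(dim)
        for p in self.error_branch.parameters():              # frozen after pre-training
            p.requires_grad = False
        self.quality_branch = QualityTokenBranch(dim)
        self.err_pool = nn.AdaptiveAvgPool2d(8)               # summarize the error map
        self.head = nn.Sequential(nn.Linear(dim + 64, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):
        err_feat = self.err_pool(self.error_branch(x)).flatten(1)  # (B, 64)
        q_feat = self.quality_branch(x)                             # (B, dim)
        return self.head(torch.cat([q_feat, err_feat], dim=1))     # quality score


# Usage: score = TwoBranchNRIQA()(torch.randn(1, 3, 224, 224))
```

In this reading, the error-map branch would first be trained with an objective target such as the absolute difference between the distorted and reference images, then frozen while the quality-token branch and regression head are trained on subjective quality scores.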
Related papers
- Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment [17.04649536069553]
No-reference image quality assessment is a challenging task that involves estimating image quality without the original reference image.
We introduce an improved mechanism to extract local and non-local information from images via different transformer encoders and CNNs.
A self-consistency approach to self-supervision is presented, explicitly addressing the degradation of no-reference image quality assessment (NR-IQA) models.
arXiv Detail & Related papers (2024-09-11T09:08:43Z) - DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer [23.70791030264281]
Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images.
We present a novel transformer-based method for GFIQA, which is aided by two unique mechanisms.
arXiv Detail & Related papers (2024-06-13T23:11:25Z) - Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA).
We utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it.
We leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z) - Transformer-based No-Reference Image Quality Assessment via Supervised
Contrastive Learning [36.695247860715874]
We propose a novel Supervised Contrastive Learning (SCL) and Transformer-based NR-IQA model, SaTQA.
We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels.
To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB) by combining the CNN inductive bias and Transformer long-term dependence modeling capability.
Experimental results on seven standard IQA datasets show that SaTQA outperforms the state-of-the-art methods on both synthetic and authentic datasets.
arXiv Detail & Related papers (2023-12-12T06:01:41Z) - Helping Visually Impaired People Take Better Quality Pictures [52.03016269364854]
We develop tools to help visually impaired users minimize occurrences of common technical distortions.
We also create a prototype feedback system that helps to guide users to mitigate quality issues.
arXiv Detail & Related papers (2023-05-14T04:37:53Z) - Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing the transformer features of the recovered image and the target one, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One approach regards the features as vectors and computes the discrepancy between the representations extracted from the recovered and target images in Euclidean space.
arXiv Detail & Related papers (2023-03-24T14:14:25Z) - Textural-Structural Joint Learning for No-Reference Super-Resolution
Image Quality Assessment [59.91741119995321]
We develop a dual stream network to jointly explore the textural and structural information for quality prediction, dubbed TSNet.
By mimicking the human vision system (HVS) that pays more attention to the significant areas of the image, we develop the spatial attention mechanism to make the visual-sensitive areas more distinguishable.
Experimental results show that the proposed TSNet predicts visual quality more accurately than state-of-the-art IQA methods and demonstrates better consistency with human perception.
arXiv Detail & Related papers (2022-05-27T09:20:06Z) - Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve an auxiliary task.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z) - No-Reference Image Quality Assessment via Transformers, Relative
Ranking, and Self-Consistency [38.88541492121366]
The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations.
We propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-attention mechanism in Transformers.
arXiv Detail & Related papers (2021-08-16T02:07:08Z) - Perceptual Image Quality Assessment with Transformers [4.005576542371173]
We propose an image quality transformer (IQT) that successfully applies a transformer architecture to a perceptual full-reference image quality assessment task.
We extract perceptual feature representations from each of the input images using a convolutional neural network backbone.
The proposed IQT was ranked first among 13 participants in the NTIRE 2021 perceptual image quality assessment challenge.
arXiv Detail & Related papers (2021-04-30T02:45:29Z) - Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and
Wild [98.48284827503409]
We develop a unified BIQA model and an approach to training it for both synthetic and realistic distortions.
We employ the fidelity loss to optimize a deep neural network for BIQA over a large number of image pairs.
Experiments on six IQA databases show the promise of the learned method in blindly assessing image quality in the laboratory and wild.
arXiv Detail & Related papers (2020-05-28T13:35:23Z)