Blind Image Quality Assessment via Transformer Predicted Error Map and
Perceptual Quality Token
- URL: http://arxiv.org/abs/2305.09353v1
- Date: Tue, 16 May 2023 11:17:54 GMT
- Title: Blind Image Quality Assessment via Transformer Predicted Error Map and
Perceptual Quality Token
- Authors: Jinsong Shi, Pan Gao, Aljosa Smolic
- Abstract summary: No-reference image quality assessment (NR-IQA) has gained increasing attention recently.
We propose a Transformer based NR-IQA model using a predicted objective error map and perceptual quality token.
Our proposed method outperforms the current state-of-the-art in both authentic and synthetic image databases.
- Score: 19.67014524146261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image quality assessment is a fundamental problem in the field of image
processing, and due to the lack of reference images in most practical
scenarios, no-reference image quality assessment (NR-IQA) has gained
increasing attention recently. With the development of deep learning
technology, many deep neural network-based NR-IQA methods have been developed,
which try to learn image quality from an understanding of the database
information. Recently, the Transformer has achieved remarkable progress in various
vision tasks. Because the characteristics of its attention mechanism match the
global perceptual impact of artifacts as perceived by humans, the Transformer is
well suited to image quality assessment tasks. In this
paper, we propose a Transformer based NR-IQA model using a predicted objective
error map and perceptual quality token. Specifically, we first generate the
predicted error map by pre-training a model consisting of a Transformer
encoder and decoder, in which the objective difference between the distorted
and the reference images is used as supervision. Then, we freeze the parameters
of the pre-trained model and design another branch using the vision Transformer
to extract the perceptual quality token for feature fusion with the predicted
error map. Finally, the fused features are regressed to the final image quality
score. Extensive experiments show that our proposed method outperforms the
current state of the art on both authentic and synthetic image databases.
Moreover, the attention map extracted via the perceptual quality token conforms
to the characteristics of the human visual system.
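
To make the two-branch design described in the abstract concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the module names, layer sizes, the collapsed linear "decoder" in the error-map branch, and the pooling-and-concatenation fusion are all illustrative assumptions inferred from the abstract alone.

```python
# Hypothetical sketch of the abstract's two-branch NR-IQA idea (not the paper's code).
# Assumptions: patch size 16, embedding dim 256, positional embeddings omitted,
# and the error-map decoder collapsed into a per-patch linear head for brevity.
import torch
import torch.nn as nn


class ErrorMapPredictor(nn.Module):
    """Branch pre-trained to predict an objective error map; frozen afterwards."""

    def __init__(self, dim=256, patch=16):
        super().__init__()
        self.patch = patch
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patchify
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.decode = nn.Linear(dim, patch * patch)  # per-patch error values

    def forward(self, x):
        b, _, h, w = x.shape                                  # h, w divisible by patch
        tokens = self.embed(x).flatten(2).transpose(1, 2)     # (B, N, dim)
        tokens = self.encoder(tokens)
        patches = self.decode(tokens)                         # (B, N, patch*patch)
        gh, gw = h // self.patch, w // self.patch
        err = patches.view(b, gh, gw, self.patch, self.patch)
        err = err.permute(0, 1, 3, 2, 4).reshape(b, 1, h, w)  # one-channel error map
        return err


class QualityTokenBranch(nn.Module):
    """ViT-style branch with a learnable perceptual quality token prepended."""

    def __init__(self, dim=256, patch=16):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.quality_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2)     # (B, N, dim)
        q = self.quality_token.expand(tokens.size(0), -1, -1)
        tokens = self.encoder(torch.cat([q, tokens], dim=1))
        return tokens[:, 0]                                   # the quality token


class TwoBranchNRIQA(nn.Module):
    """Fuses the predicted error map with the quality token and regresses a score."""

    def __init__(self, dim=256):
        super().__init__()
        self.error_branch = ErrorMapPredictor(dim)
        for p in self.error_branch.parameters():              # frozen after pre-training
            p.requires_grad = False
        self.quality_branch = QualityTokenBranch(dim)
        self.err_pool = nn.AdaptiveAvgPool2d(8)               # summarize the error map
        self.head = nn.Sequential(nn.Linear(dim + 64, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):
        err_feat = self.err_pool(self.error_branch(x)).flatten(1)  # (B, 64)
        q_feat = self.quality_branch(x)                             # (B, dim)
        return self.head(torch.cat([q_feat, err_feat], dim=1))     # quality score


# Usage: score = TwoBranchNRIQA()(torch.randn(1, 3, 224, 224))
```

In this reading, the error-map branch would first be trained with an objective target such as the absolute difference between the distorted and reference images, then frozen while the quality-token branch and regression head are trained on subjective quality scores.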
Related papers
- Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment [17.04649536069553]
No-reference image quality assessment is a challenging task that involves estimating image quality without the original reference image.
We introduce an improved mechanism to extract local and non-local information from images via different transformer encoders and CNNs.
A self-consistency approach to self-supervision is presented, explicitly addressing the degradation of no-reference image quality assessment (NR-IQA) models.
arXiv Detail & Related papers (2024-09-11T09:08:43Z) - DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer [23.70791030264281]
Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images.
We present a novel transformer-based method for GFIQA, which is aided by two unique mechanisms.
arXiv Detail & Related papers (2024-06-13T23:11:25Z) - Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA).
We utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it.
We leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z) - Transformer-based No-Reference Image Quality Assessment via Supervised
Contrastive Learning [36.695247860715874]
We propose a novel Supervised Contrastive Learning (SCL) and Transformer-based NR-IQA model, SaTQA.
We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels.
To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB) by combining the CNN inductive bias and Transformer long-term dependence modeling capability.
Experimental results on seven standard IQA datasets show that SaTQA outperforms the state-of-the-art methods on both synthetic and authentic datasets.
arXiv Detail & Related papers (2023-12-12T06:01:41Z) - Helping Visually Impaired People Take Better Quality Pictures [52.03016269364854]
We develop tools to help visually impaired users minimize occurrences of common technical distortions.
We also create a prototype feedback system that helps to guide users to mitigate quality issues.
arXiv Detail & Related papers (2023-05-14T04:37:53Z) - Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing the transformer features of the recovered image and the target one, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One approach regards the features as vectors and computes the discrepancy between the representations extracted from the recovered and target images in Euclidean space.
arXiv Detail & Related papers (2023-03-24T14:14:25Z) - Textural-Structural Joint Learning for No-Reference Super-Resolution
Image Quality Assessment [59.91741119995321]
We develop a dual stream network to jointly explore the textural and structural information for quality prediction, dubbed TSNet.
By mimicking the human vision system (HVS) that pays more attention to the significant areas of the image, we develop the spatial attention mechanism to make the visual-sensitive areas more distinguishable.
Experimental results show that the proposed TSNet predicts visual quality more accurately than state-of-the-art IQA methods and demonstrates better consistency with human perception.
arXiv Detail & Related papers (2022-05-27T09:20:06Z) - Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve an auxiliary task.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z) - No-Reference Image Quality Assessment via Transformers, Relative
Ranking, and Self-Consistency [38.88541492121366]
The goal of No-Reference Image Quality Assessment (NR-IQA) is to estimate the perceptual image quality in accordance with subjective evaluations.
We propose a novel model to address the NR-IQA task by leveraging a hybrid approach that benefits from Convolutional Neural Networks (CNNs) and self-attention mechanism in Transformers.
arXiv Detail & Related papers (2021-08-16T02:07:08Z) - Perceptual Image Quality Assessment with Transformers [4.005576542371173]
We propose an image quality transformer (IQT) that successfully applies a transformer architecture to a perceptual full-reference image quality assessment task.
We extract perceptual feature representations from each of the input images using a convolutional neural network backbone.
The proposed IQT was ranked first among 13 participants in the NTIRE 2021 perceptual image quality assessment challenge.
arXiv Detail & Related papers (2021-04-30T02:45:29Z) - Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and
Wild [98.48284827503409]
We develop a unified BIQA model and an approach to training it for both synthetic and realistic distortions.
We employ the fidelity loss to optimize a deep neural network for BIQA over a large number of image pairs.
Experiments on six IQA databases show the promise of the learned method in blindly assessing image quality in the laboratory and wild.
arXiv Detail & Related papers (2020-05-28T13:35:23Z)