MUSIQ: Multi-scale Image Quality Transformer
- URL: http://arxiv.org/abs/2108.05997v1
- Date: Thu, 12 Aug 2021 23:36:22 GMT
- Title: MUSIQ: Multi-scale Image Quality Transformer
- Authors: Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, Feng Yang
- Abstract summary: Current state-of-the-art IQA methods are based on convolutional neural networks (CNNs).
We design a multi-scale image quality Transformer (MUSIQ) to process native resolution images with varying sizes and aspect ratios.
With a multi-scale image representation, our proposed method can capture image quality at different granularities.
- Score: 22.908901641767688
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image quality assessment (IQA) is an important research topic for
understanding and improving visual experience. The current state-of-the-art IQA
methods are based on convolutional neural networks (CNNs). The performance of
CNN-based models is often compromised by the fixed shape constraint in batch
training. To accommodate this, the input images are usually resized and cropped
to a fixed shape, causing image quality degradation. To address this, we design
a multi-scale image quality Transformer (MUSIQ) to process native resolution
images with varying sizes and aspect ratios. With a multi-scale image
representation, our proposed method can capture image quality at different
granularities. Furthermore, a novel hash-based 2D spatial embedding and a scale
embedding are proposed to support the positional embedding in the multi-scale
representation. Experimental results verify that our method can achieve
state-of-the-art performance on multiple large-scale IQA datasets such as
PaQ-2-PiQ, SPAQ and KonIQ-10k.
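The abstract names a multi-scale representation, a hash-based 2D spatial embedding, and a scale embedding, but does not spell out their mechanics. The following is a minimal sketch of one plausible reading, not the authors' implementation: each patch's (row, column) index is mapped onto a fixed-size grid so that images of any size share one table of spatial embeddings, and each patch also carries the index of the scale it came from. The function names, the patch size, the resize targets, and the grid size are all assumptions chosen for illustration.

```python
import math


def hash_position(row, col, n_rows, n_cols, grid_size):
    """Map a patch's (row, col) index onto a fixed grid_size x grid_size grid.

    Assumed hashing rule: scale the index proportionally and floor it.
    """
    return (row * grid_size // n_rows, col * grid_size // n_cols)


def multiscale_tokens(height, width, patch_size=32,
                      longer_sides=(224, 384), grid_size=10):
    """List (scale_index, grid_row, grid_col) for every patch at every scale.

    Scale 0 is the native resolution. The other scales are
    aspect-ratio-preserving resizes whose longer side is taken from
    longer_sides (hypothetical values, not taken from the abstract).
    """
    sizes = [(height, width)]
    for side in longer_sides:
        ratio = side / max(height, width)
        sizes.append((max(1, round(height * ratio)),
                      max(1, round(width * ratio))))

    tokens = []
    for scale_index, (h, w) in enumerate(sizes):
        # Partial patches at the border are counted; a real pipeline
        # would pad them to patch_size.
        n_rows = math.ceil(h / patch_size)
        n_cols = math.ceil(w / patch_size)
        for row in range(n_rows):
            for col in range(n_cols):
                g_row, g_col = hash_position(row, col, n_rows, n_cols, grid_size)
                tokens.append((scale_index, g_row, g_col))
    return tokens


if __name__ == "__main__":
    tokens = multiscale_tokens(480, 640)
    print(len(tokens), tokens[0], tokens[-1])
```

In this sketch the number of tokens varies with the image size, while the range of spatial indices stays fixed at the grid size; a model would look up a learned embedding for each (grid_row, grid_col) pair and add a separate learned embedding for each scale_index.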
Related papers
- Q-Ground: Image Quality Grounding with Large Multi-modality Models [61.72022069880346]
We introduce Q-Ground, the first framework aimed at tackling fine-scale visual quality grounding.
Q-Ground combines large multi-modality models with detailed visual quality analysis.
Central to our contribution is the introduction of the QGround-100K dataset.
arXiv Detail & Related papers (2024-07-24T06:42:46Z)
- Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA).
We utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it.
We leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z)
- Transformer-based No-Reference Image Quality Assessment via Supervised Contrastive Learning [36.695247860715874]
We propose SaTQA, a novel NR-IQA model based on Supervised Contrastive Learning (SCL) and a Transformer.
We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels.
To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB) by combining the CNN inductive bias and Transformer long-term dependence modeling capability.
Experimental results on seven standard IQA datasets show that SaTQA outperforms the state-of-the-art methods for both synthetic and authentic datasets.
arXiv Detail & Related papers (2023-12-12T06:01:41Z)
- MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion [8.338999282303755]
We propose a novel algorithm based on the Swin Transformer.
It aggregates information from both local and global features to better predict the quality.
It ranks 2nd in the no-reference track of NTIRE 2022 Perceptual Image Quality Assessment Challenge.
arXiv Detail & Related papers (2022-05-20T11:34:35Z)
- Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network [20.835800149919145]
Image quality assessment (IQA) algorithms aim to quantify the human perception of image quality.
There is a performance drop when assessing distorted images generated by generative adversarial networks (GANs) with seemingly realistic textures.
We propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to deal with the challenge and get better performance on the GAN-based IQA task.
arXiv Detail & Related papers (2022-04-22T03:59:18Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
- Deep Superpixel-based Network for Blind Image Quality Assessment [4.079861933099766]
The goal of a blind image quality assessment (BIQA) model is to simulate how human eyes evaluate images.
We propose a deep adaptive superpixel-based network, namely DSN-IQA, to assess the quality of an image based on multi-scale and superpixel segmentation.
arXiv Detail & Related papers (2021-10-13T08:26:58Z)
- Multi-pooled Inception features for no-reference image quality assessment [0.0]
We propose a new approach for image quality assessment using convolutional neural networks (CNNs).
In contrast to previous methods, we do not take patches from the input image. Instead, the input image is treated as a whole and is run through a pretrained CNN body to extract resolution-independent, multi-level deep features.
We demonstrate that our best proposal - called MultiGAP-NRIQA - is able to provide state-of-the-art results on three benchmark IQA databases.
arXiv Detail & Related papers (2020-11-10T15:09:49Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.