MSLIQA: Enhancing Learning Representations for Image Quality Assessment through Multi-Scale Learning
- URL: http://arxiv.org/abs/2408.16879v2
- Date: Fri, 6 Sep 2024 17:17:16 GMT
- Title: MSLIQA: Enhancing Learning Representations for Image Quality Assessment through Multi-Scale Learning
- Authors: Nasim Jamshidi Avanaki, Abhijay Ghildyal, Nabajeet Barman, Saman Zadtootaghaj
- Abstract summary: We improve the performance of a generic lightweight NR-IQA model by introducing a novel augmentation strategy.
This augmentation strategy enables the network to better discriminate between different distortions in various parts of the image by zooming in and out.
The inclusion of test-time augmentation further enhances performance, making our lightweight network's results comparable to the current state-of-the-art models.
- Score: 6.074775040047959
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: No-Reference Image Quality Assessment (NR-IQA) remains a challenging task due to the diversity of distortions and the lack of large annotated datasets. Many studies have attempted to tackle these challenges by developing more accurate NR-IQA models, often employing complex and computationally expensive networks, or by bridging the domain gap between various distortions to enhance performance on test datasets. In our work, we improve the performance of a generic lightweight NR-IQA model by introducing a novel augmentation strategy that boosts its performance by almost 28%. This augmentation strategy enables the network to better discriminate between different distortions in various parts of the image by zooming in and out. Additionally, the inclusion of test-time augmentation further enhances performance, making our lightweight network's results comparable to the current state-of-the-art models, simply through the use of augmentations.
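The abstract does not spell out the exact transforms, but the zoom-in/zoom-out augmentation and the test-time averaging can be sketched roughly as follows. This is a minimal sketch assuming a PyTorch/torchvision setup; the crop scales, view counts, and averaging scheme are illustrative assumptions, not the paper's configuration.

```python
import torch
from torchvision import transforms

# "Zoom in": crops covering a small fraction of the image, resized back to the input size.
zoom_in = transforms.RandomResizedCrop(size=224, scale=(0.2, 0.5))
# "Zoom out": crops covering most of the image, preserving global context.
zoom_out = transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0))
to_tensor = transforms.ToTensor()  # normalization omitted for brevity

def multi_scale_views(img, n_views=4):
    """Build a batch of zoomed-in and zoomed-out views of a single PIL image."""
    views = [to_tensor(zoom_in(img)) for _ in range(n_views // 2)]
    views += [to_tensor(zoom_out(img)) for _ in range(n_views - n_views // 2)]
    return torch.stack(views)

@torch.no_grad()
def predict_with_tta(model, img, n_views=8):
    """Test-time augmentation: average the predicted quality over several random views."""
    scores = model(multi_scale_views(img, n_views))  # (n_views, 1) quality predictions
    return scores.mean().item()
```

Here `model` stands for any lightweight NR-IQA network that maps a batch of views to scalar quality scores; averaging the per-view predictions is the test-time augmentation step mentioned above.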
Related papers
- GFN: A graph feedforward network for resolution-invariant reduced operator learning in multifidelity applications [0.0]
This work presents a novel resolution-invariant model order reduction strategy for multifidelity applications.
We base our architecture on a novel neural network layer developed in this work, the graph feedforward network.
We exploit the method's capability of training and testing on different mesh sizes in an autoencoder-based reduction strategy for parametrised partial differential equations.
arXiv Detail & Related papers (2024-06-05T18:31:37Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
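As background, prompt-based IQA with a CLIP backbone is often scored by comparing an image against a pair of quality prompts (in the spirit of CLIP-IQA). The sketch below illustrates only that generic idea; the paper's multi-modal prompt learning goes further, and the prompt pair, checkpoint, and file name here are illustrative assumptions.

```python
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Antonym quality prompts; the learned prompts in the paper replace this hand-written pair.
prompts = clip.tokenize(["a good photo", "a bad photo"]).to(device)
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    image_feat = model.encode_image(image)
    text_feat = model.encode_text(prompts)
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    logits = 100.0 * image_feat @ text_feat.T        # similarity to each prompt
    quality = logits.softmax(dim=-1)[0, 0].item()    # probability of "good" as a quality score

print(f"predicted quality: {quality:.3f}")
```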
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- HANet: A Hierarchical Attention Network for Change Detection With Bitemporal Very-High-Resolution Remote Sensing Images [6.890268321645873]
We propose a progressive foreground-balanced sampling strategy that does not introduce additional change information.
This strategy helps the model accurately learn the features of the changed pixels during the early training process.
We also design a discriminative Siamese network, hierarchical attention network (HANet), which can integrate multiscale features and refine detailed features.
arXiv Detail & Related papers (2024-04-14T08:01:27Z)
- Diffusion Model Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment [82.13830107682232]
We leverage diffusion models, a novel class of state-of-the-art (SOTA) generative models capable of modeling intricate relationships.
We devise a new diffusion restoration network that leverages the produced enhanced image and noise-containing images.
Two visual evaluation branches are designed to comprehensively analyze the obtained high-level feature information.
arXiv Detail & Related papers (2024-02-22T09:39:46Z)
- A Lightweight Parallel Framework for Blind Image Quality Assessment [7.9562077122537875]
We propose a lightweight parallel framework (LPF) for blind image quality assessment (BIQA).
First, we extract visual features using a pre-trained feature extraction network; we then construct a simple yet effective feature embedding network (FEN) to transform these features.
We also present two novel self-supervised subtasks: a sample-level category prediction task and a batch-level quality comparison task.
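A rough sketch of the pipeline described above: a frozen pre-trained feature extractor, a small feature embedding network (FEN), and heads serving quality regression and the sample-level category subtask, with a simple ranking loss standing in for the batch-level quality comparison. The backbone, layer sizes, number of categories, and the specific losses are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class LightweightParallelIQA(nn.Module):
    def __init__(self, embed_dim=256, n_categories=10):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V2")
        # Frozen pre-trained feature extractor producing pooled 2048-d features.
        self.extractor = nn.Sequential(*list(backbone.children())[:-1])
        for p in self.extractor.parameters():
            p.requires_grad = False
        # Simple feature embedding network (FEN) that transforms the visual features.
        self.fen = nn.Sequential(nn.Linear(2048, embed_dim), nn.ReLU(),
                                 nn.Linear(embed_dim, embed_dim))
        self.quality_head = nn.Linear(embed_dim, 1)               # quality regression
        self.category_head = nn.Linear(embed_dim, n_categories)   # sample-level category prediction

    def forward(self, x):
        with torch.no_grad():
            feats = self.extractor(x).flatten(1)
        z = self.fen(feats)
        return self.quality_head(z), self.category_head(z)

def batch_quality_comparison_loss(pred_scores, mos):
    """Batch-level quality comparison subtask: rank image pairs within a batch
    consistently with their MOS ordering (a margin-ranking stand-in)."""
    i, j = torch.triu_indices(len(mos), len(mos), offset=1)
    target = torch.sign(mos[i] - mos[j])
    return nn.functional.margin_ranking_loss(pred_scores[i].squeeze(-1),
                                             pred_scores[j].squeeze(-1), target)
```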
arXiv Detail & Related papers (2024-02-19T10:56:58Z)
- PMT-IQA: Progressive Multi-task Learning for Blind Image Quality Assessment [6.976106651278988]
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation.
Existing BIQA methods often fail to consider multi-scale distortion patterns and image content.
We propose a Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT).
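A bare-bones illustration of pairing multi-scale feature extraction with a quality head and an auxiliary distortion head under multi-task learning. The backbone, input scales, head sizes, and the omission of the progressive task-weighting schedule are simplifying assumptions, not the PMT-IQA design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class MultiScaleIQA(nn.Module):
    def __init__(self, scales=(224, 112), n_distortions=5):
        super().__init__()
        self.scales = scales
        backbone = resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # pooled 512-d features
        feat_dim = 512 * len(scales)
        self.quality_head = nn.Linear(feat_dim, 1)                  # main task: quality score
        self.distortion_head = nn.Linear(feat_dim, n_distortions)   # auxiliary task: distortion type

    def forward(self, x):
        feats = []
        for s in self.scales:
            # Re-sample the input at each scale and extract features with a shared backbone.
            xs = F.interpolate(x, size=(s, s), mode="bilinear", align_corners=False)
            feats.append(self.features(xs).flatten(1))
        f = torch.cat(feats, dim=1)
        return self.quality_head(f), self.distortion_head(f)
```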
arXiv Detail & Related papers (2023-01-03T16:29:17Z)
- Demystify Transformers & Convolutions in Modern Image Deep Networks [82.32018252867277]
This paper aims to identify the real gains of popular convolution and attention operators through a detailed study.
We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach.
Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs.
arXiv Detail & Related papers (2022-11-10T18:59:43Z)
- Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution [64.15915577164894]
A hierarchical image super-resolution network (HSRNet) is proposed to suppress the influence of aliasing.
HSRNet achieves better quantitative and visual performance than other works, and suppresses aliasing more effectively.
arXiv Detail & Related papers (2022-06-07T14:55:32Z)
- MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion [8.338999282303755]
We propose a novel algorithm based on the Swin Transformer.
It aggregates information from both local and global features to better predict the quality.
It ranks 2nd in the no-reference track of the NTIRE 2022 Perceptual Image Quality Assessment Challenge.
arXiv Detail & Related papers (2022-05-20T11:34:35Z)
- Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
arXiv Detail & Related papers (2021-12-01T13:23:00Z)
- Flexible Example-based Image Enhancement with Task Adaptive Global Feature Self-Guided Network [162.14579019053804]
We show that our model outperforms the current state of the art in learning a single enhancement mapping.
The model achieves even higher performance on learning multiple mappings simultaneously.
arXiv Detail & Related papers (2020-05-13T22:45:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.