TOPIQ: A Top-down Approach from Semantics to Distortions for Image
Quality Assessment
- URL: http://arxiv.org/abs/2308.03060v1
- Date: Sun, 6 Aug 2023 09:08:37 GMT
- Title: TOPIQ: A Top-down Approach from Semantics to Distortions for Image
Quality Assessment
- Authors: Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu
Sun, Qiong Yan, Weisi Lin
- Abstract summary: Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks.
We propose a top-down approach that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions.
A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower level features.
- Score: 53.72721476803585
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image Quality Assessment (IQA) is a fundamental task in computer vision that
has witnessed remarkable progress with deep neural networks. Inspired by the
characteristics of the human visual system, existing methods typically use a
combination of global and local representations (i.e., multi-scale features) to
achieve superior performance. However, most of them adopt simple linear fusion
of multi-scale features, and neglect their possibly complex relationship and
interaction. In contrast, humans typically first form a global impression to
locate important regions and then focus on local details in those regions. We
therefore propose a top-down approach that uses high-level semantics to guide
the IQA network to focus on semantically important local distortion regions,
named TOPIQ. Our approach to IQA involves the design of a heuristic
coarse-to-fine network (CFANet) that leverages multi-scale features and
progressively propagates multi-level semantic information to low-level
representations in a top-down manner. A key component of our approach is the
proposed cross-scale attention mechanism, which calculates attention maps for
lower level features guided by higher level features. This mechanism emphasizes
active semantic regions for low-level distortions, thereby improving
performance. CFANet can be used for both Full-Reference (FR) and No-Reference
(NR) IQA. We use ResNet50 as its backbone and demonstrate that CFANet achieves
better or competitive performance on most public FR and NR benchmarks compared
with state-of-the-art methods based on vision transformers, while being much
more efficient (with only ~13% of the FLOPS of the current best FR method).
Code is released at https://github.com/chaofengc/IQA-PyTorch.
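The cross-scale attention idea in the abstract can be illustrated with a small sketch. This is not the CFANet implementation: the function name `cross_scale_attention`, the toy feature values, the choice of higher-level features as queries and lower-level features as keys and values, and the omission of learned projections are all assumptions made for illustration. It uses plain scaled dot-product attention with the Python standard library only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_scale_attention(high, low):
    """Hypothetical cross-scale attention (illustrative assumption only).

    high: list of higher-level (coarse) feature vectors, used as queries.
    low:  list of lower-level (fine) feature vectors, used as keys and values.
    Returns (attended, weights): one attended vector and one row of
    attention weights over the low-level positions per high-level vector.
    """
    d = len(high[0])
    scale = 1.0 / math.sqrt(d)
    attended, weights = [], []
    for q in high:
        # Similarity between this high-level query and every low-level key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in low]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of low-level values, channel by channel.
        attended.append(
            [sum(wj * v[c] for wj, v in zip(w, low)) for c in range(d)]
        )
    return attended, weights


# Toy input: 2 high-level tokens and 4 low-level tokens, 3 channels each.
high_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
low_feats = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
    [1.0, 1.0, 0.0],
]
attended, weights = cross_scale_attention(high_feats, low_feats)
```

In this toy input, each high-level token places its largest attention weight on the low-level token it aligns with most strongly, which mirrors the abstract's description of higher-level features guiding where lower-level features are emphasized.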
Related papers
- Context-Semantic Quality Awareness Network for Fine-Grained Visual Categorization [30.92656780805478]
We propose a weakly supervised Context-Semantic Quality Awareness Network (CSQA-Net) for fine-grained visual categorization (FGVC).
To model the spatial contextual relationship between rich part descriptors and global semantics, we develop a novel multi-part and multi-scale cross-attention (MPMSCA) module.
We also propose a generic multi-level semantic quality evaluation module (MLSQE) to progressively supervise and enhance hierarchical semantics from different levels of the backbone network.
arXiv Detail & Related papers (2024-03-15T13:40:44Z)
- Centralized Feature Pyramid for Object Detection [53.501796194901964]
The visual feature pyramid has shown its superiority in both effectiveness and efficiency across a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z)
- Feedback Pyramid Attention Networks for Single Image Super-Resolution [37.58180059860872]
We propose feedback pyramid attention networks (FPAN) to fully exploit the mutual dependencies of features.
In our method, the output of each layer in the first stage is also used as the input of the corresponding layer in the next state to re-update the previous low-level filters.
We introduce a pyramid non-local structure to model global contextual information in different scales and improve the discriminative representation of the network.
arXiv Detail & Related papers (2021-06-13T11:32:53Z)
- Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning
for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
- Weakly Supervised Attention Pyramid Convolutional Neural Network for
Fine-Grained Visual Classification [71.96618723152487]
We introduce the Attention Pyramid Convolutional Neural Network (AP-CNN).
AP-CNN learns both high-level semantic and low-level detailed feature representation.
It can be trained end-to-end, without the need for additional bounding box/part annotations.
arXiv Detail & Related papers (2020-02-09T12:33:23Z)
- Hybrid Multiple Attention Network for Semantic Segmentation in Aerial
Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.