Multitask Auxiliary Network for Perceptual Quality Assessment of Non-Uniformly Distorted Omnidirectional Images
- URL: http://arxiv.org/abs/2501.11512v1
- Date: Mon, 20 Jan 2025 14:41:29 GMT
- Title: Multitask Auxiliary Network for Perceptual Quality Assessment of Non-Uniformly Distorted Omnidirectional Images
- Authors: Jiebin Yan, Jiale Rao, Junjie Chen, Ziwen Tan, Weide Liu, Yuming Fang,
- Abstract summary: We propose a multitask auxiliary network for non-uniformly distorted omnidirectional images.
The proposed network mainly consists of three parts: a backbone for extracting multiscale features from the viewport sequence, a multitask feature selection module for dynamically allocating specific features to different tasks, and auxiliary sub-networks for guiding the proposed model.
Experiments conducted on two large-scale OIQA databases demonstrate that the proposed model outperforms other state-of-the-art OIQA metrics.
- Score: 31.032202503593897
- License:
- Abstract: Omnidirectional image quality assessment (OIQA) has been widely investigated in the past few years and achieved much success. However, most of existing studies are dedicated to solve the uniform distortion problem in OIQA, which has a natural gap with the non-uniform distortion problem, and their ability in capturing non-uniform distortion is far from satisfactory. To narrow this gap, in this paper, we propose a multitask auxiliary network for non-uniformly distorted omnidirectional images, where the parameters are optimized by jointly training the main task and other auxiliary tasks. The proposed network mainly consists of three parts: a backbone for extracting multiscale features from the viewport sequence, a multitask feature selection module for dynamically allocating specific features to different tasks, and auxiliary sub-networks for guiding the proposed model to capture local distortion and global quality change. Extensive experiments conducted on two large-scale OIQA databases demonstrate that the proposed model outperforms other state-of-the-art OIQA metrics, and these auxiliary sub-networks contribute to improve the performance of the proposed model. The source code is available at https://github.com/RJL2000/MTAOIQA.
Related papers
- Cascaded Multi-Scale Attention for Enhanced Multi-Scale Feature Extraction and Interaction with Low-Resolution Images [20.140898354987353]
We propose a new attention mechanism, named cascaded multi-scale attention (CMSA), to handle low-resolution inputs effectively.
This architecture allows for the effective handling of features across different scales, enhancing the model's ability to perform tasks such as human pose estimation.
Our experimental results show that the proposed method outperforms existing state-of-the-art methods in these areas with fewer parameters.
arXiv Detail & Related papers (2024-12-03T06:23:19Z) - Multi-task Feature Enhancement Network for No-Reference Image Quality Assessment [4.4150617622399055]
Multi-task strategies based No-Reference Image Quality Assessment (NR-IQA) methods encounter several challenges.
Our framework consists of three key components: a high-frequency extraction network, a quality estimation network, and a distortion-aware network.
Empirical results from five standard IQA databases confirm that our method achieves high performance and also exhibits robust generalization ability.
arXiv Detail & Related papers (2024-11-12T05:10:32Z) - Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity [55.399230250413986]
We propose a Quality-Aware Feature Matching IQA Metric (QFM-IQM) to remove harmful semantic noise features from the upstream task.
Our approach achieves superior performance to the state-of-the-art NR-IQA methods on eight standard IQA datasets.
arXiv Detail & Related papers (2023-12-11T06:50:27Z) - Multi-task Image Restoration Guided By Robust DINO Features [88.74005987908443]
We propose mboxtextbfDINO-IR, a multi-task image restoration approach leveraging robust features extracted from DINOv2.
We first propose a pixel-semantic fusion (PSF) module to dynamically fuse DINOV2's shallow features.
By formulating these modules into a unified deep model, we propose a DINO perception contrastive loss to constrain the model training.
arXiv Detail & Related papers (2023-12-04T06:59:55Z) - Local Distortion Aware Efficient Transformer Adaptation for Image
Quality Assessment [62.074473976962835]
We show that with proper injection of local distortion features, a larger pretrained and fixed foundation model performs better in IQA tasks.
Specifically, for the lack of local distortion structure and inductive bias of vision transformer (ViT), we use another pretrained convolution neural network (CNN)
We propose a local distortion extractor to obtain local distortion features from the pretrained CNN and a local distortion injector to inject the local distortion features into ViT.
arXiv Detail & Related papers (2023-08-23T08:41:21Z) - Assessor360: Multi-sequence Network for Blind Omnidirectional Image
Quality Assessment [50.82681686110528]
Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs)
The quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipeline lacks the modeling of the observer's browsing process.
We propose a novel multi-sequence network for BOIQA called Assessor360, which is derived from the realistic multi-assessor ODI quality assessment procedure.
arXiv Detail & Related papers (2023-05-18T13:55:28Z) - Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes CNN backbone and transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
arXiv Detail & Related papers (2021-12-01T13:23:00Z) - Efficient and Accurate Multi-scale Topological Network for Single Image
Dehazing [31.543771270803056]
In this paper, we pay attention to the feature extraction and utilization of the input image itself.
We propose a Multi-scale Topological Network (MSTN) to fully explore the features at different scales.
Meanwhile, we design a Multi-scale Feature Fusion Module (MFFM) and an Adaptive Feature Selection Module (AFSM) to achieve the selection and fusion of features at different scales.
arXiv Detail & Related papers (2021-02-24T08:53:14Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) images reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z) - Pareto-Optimal Bit Allocation for Collaborative Intelligence [39.11380888887304]
Collaborative intelligence (CI) has emerged as a promising framework for deployment of Artificial Intelligence (AI)-based services on mobile/edge devices.
In this paper, we study bit allocation for feature coding in multi-stream CI systems.
arXiv Detail & Related papers (2020-09-25T20:48:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.