Related papers: Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency

URL: http://arxiv.org/abs/2409.00749v1
Date: Sun, 1 Sep 2024 15:26:11 GMT
Title: Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency
Authors: Wei Sun, Weixia Zhang, Yuqin Cao, Linhan Cao, Jun Jia, Zijian Chen, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai
Abstract summary: We design a multi-branch deep neural network (DNN) to assess the quality of UHD images from three perspectives. Aesthetic features are extracted from low-resolution images downsampled from the UHD ones. Technical distortions are measured using a fragment image composed of mini-patches cropped from UHD images. The salient content of UHD images is detected and cropped to extract quality-aware features from the salient regions.
Score: 51.36674160287799
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: UHD images, typically with resolutions equal to or higher than 4K, pose a significant challenge for efficient image quality assessment (IQA) algorithms, as adopting full-resolution images as inputs leads to overwhelming computational complexity, and commonly used pre-processing methods like resizing or cropping may cause substantial loss of detail. To address this problem, we design a multi-branch deep neural network (DNN) to assess the quality of UHD images from three perspectives: global aesthetic characteristics, local technical distortions, and salient content perception. Specifically, aesthetic features are extracted from low-resolution images downsampled from the UHD ones, which lose high-frequency texture information but still preserve the global aesthetic characteristics. Technical distortions are measured using a fragment image composed of mini-patches cropped from UHD images based on the grid mini-patch sampling strategy. The salient content of UHD images is detected and cropped to extract quality-aware features from the salient regions. We adopt the Swin Transformer Tiny as the backbone network to extract features from these three perspectives. The extracted features are concatenated and regressed into quality scores by a two-layer multi-layer perceptron (MLP) network. We employ the mean square error (MSE) loss to optimize prediction accuracy and the fidelity loss to optimize prediction monotonicity. Experimental results show that the proposed model achieves the best performance on the UHD-IQA dataset while maintaining the lowest computational complexity, demonstrating its effectiveness and efficiency. Moreover, the proposed model won first prize in the ECCV AIM 2024 UHD-IQA Challenge. The code is available at https://github.com/sunwei925/UIQA.
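
Illustrative sketch (not taken from the authors' repository): the grid mini-patch sampling step named in the abstract can be pictured as splitting the image into a uniform grid, cropping one small patch at a random position inside each cell, and stitching the crops into a single "fragment" image. The function name grid_minipatch_fragment, the parameters grid and patch, and the values used below are assumptions chosen for illustration; the abstract does not state the actual grid or patch sizes.

```python
import numpy as np


def grid_minipatch_fragment(image, grid=4, patch=8, rng=None):
    """Stitch one randomly placed mini-patch per grid cell into a fragment image.

    image : array of shape (H, W) or (H, W, C)
    grid  : number of cells along each axis (hypothetical default)
    patch : side length of each square mini-patch in pixels (hypothetical default)
    rng   : optional numpy.random.Generator for reproducible sampling
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    cell_h, cell_w = height // grid, width // grid
    if cell_h < patch or cell_w < patch:
        raise ValueError("each grid cell must be at least as large as a mini-patch")

    # Output keeps the channel layout and dtype of the input.
    out_shape = (grid * patch, grid * patch) + image.shape[2:]
    fragment = np.empty(out_shape, dtype=image.dtype)

    for i in range(grid):
        for j in range(grid):
            # Random top-left corner such that the crop stays inside cell (i, j).
            y = i * cell_h + rng.integers(0, cell_h - patch + 1)
            x = j * cell_w + rng.integers(0, cell_w - patch + 1)
            fragment[i * patch:(i + 1) * patch,
                     j * patch:(j + 1) * patch] = image[y:y + patch, x:x + patch]
    return fragment
```

Under these assumptions, a 3840 x 2160 input with grid=7 and patch=32 would yield a 224 x 224 fragment. Each crop is copied at native resolution, so local distortions such as blur or noise are not smoothed away by resizing, while the input to the backbone stays small.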

Related papers

PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition [54.642714288448744]
PETALface is the first work leveraging the power of PEFT for low-resolution face recognition. We introduce two low-rank adaptation modules to the backbone, with weights adjusted based on the input image quality to account for the difference in quality between the gallery and probe images. Experiments demonstrate that the proposed method outperforms full fine-tuning on low-resolution datasets while preserving performance on high-resolution and mixed-quality datasets.
arXiv Detail & Related papers (2024-12-10T18:59:45Z)
Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration [11.866565346920781]
D2Net enables direct full-resolution inference on Ultra-high-definition (UHD) images. We show that our model achieves better quantitative and qualitative results than state-of-the-art methods.
arXiv Detail & Related papers (2024-11-10T13:05:36Z)
Perceptual-Distortion Balanced Image Super-Resolution is a Multi-Objective Optimization Problem [23.833099288826045]
Training Single-Image Super-Resolution (SISR) models using pixel-based regression losses can achieve high distortion metric scores. However, they often result in blurry images due to insufficient recovery of high-frequency details. We propose a novel method that incorporates Multi-Objective Optimization (MOO) into the training process of SISR models to balance perceptual quality and distortion.
arXiv Detail & Related papers (2024-09-05T02:14:04Z)
DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
Blind image quality assessment (IQA) in the wild presents significant challenges. Given the difficulty in collecting large-scale training data, leveraging limited data to develop a model with strong generalization remains an open problem. Motivated by the robust image perception capabilities of pre-trained text-to-image (T2I) diffusion models, we propose a novel IQA method, diffusion priors-based IQA.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
Dual-Branch Network for Portrait Image Quality Assessment [76.27716058987251]
We introduce a dual-branch network for portrait image quality assessment (PIQA). We utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it. We leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features.
arXiv Detail & Related papers (2024-05-14T12:43:43Z)
Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity. We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss. Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
Transformer-based No-Reference Image Quality Assessment via Supervised Contrastive Learning [36.695247860715874]
We propose SaTQA, a novel NR-IQA model based on Supervised Contrastive Learning (SCL) and the Transformer. We first train a model on a large-scale synthetic dataset by SCL to extract degradation features of images with various distortion types and levels. To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB) by combining the CNN inductive bias and Transformer long-term dependence modeling capability. Experimental results on seven standard IQA datasets show that SaTQA outperforms the state-of-the-art methods for both synthetic and authentic datasets.
arXiv Detail & Related papers (2023-12-12T06:01:41Z)
Deep Neural Network for Blind Visual Quality Assessment of 4K Content [37.70643043547502]
Existing blind image quality assessment (BIQA) methods are not suitable for original and upscaled 4K content. We propose a deep learning-based BIQA model for 4K content, which on the one hand can distinguish true from pseudo 4K content and on the other hand can evaluate their perceptual visual quality. The proposed model is trained in a multi-task learning manner, and we introduce an uncertainty principle to balance the losses of the classification and regression tasks.
arXiv Detail & Related papers (2022-06-09T09:10:54Z)
Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations. We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution [10.960291115491504]
We generate an artifact-free high-resolution image from a low-resolution one compressed with an arbitrary quality factor. A context-aware joint CAR and SR neural network (CAJNN) integrates both local and non-local features to solve CAR and SR in one stage. A deep reconstruction network is adopted to predict high-quality, high-resolution images.
arXiv Detail & Related papers (2020-10-18T04:17:08Z)
Gated Fusion Network for Degraded Image Super Resolution [78.67168802945069]
We propose a dual-branch convolutional neural network to extract base features and recovered features separately. By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process.
arXiv Detail & Related papers (2020-03-02T13:28:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.