Invariance of deep image quality metrics to affine transformations
- URL: http://arxiv.org/abs/2407.17927v1
- Date: Thu, 25 Jul 2024 10:24:54 GMT
- Title: Invariance of deep image quality metrics to affine transformations
- Authors: Nuria Alabau-Bosque, Paula Daudén-Oliver, Jorge Vila-Tomás, Valero Laparra, Jesús Malo
- Abstract summary: We evaluate state-of-the-art deep image quality metrics by assessing their invariance to affine transformations.
We psychophysically measure an absolute detection threshold in that common representation and express it in the physical units of each affine transform.
We find that none of the state-of-the-art metrics shows human-like results under this strong test based on invisibility thresholds.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep architectures are the current state-of-the-art in predicting subjective image quality. Usually, these models are evaluated according to their ability to correlate with human opinion in databases with a range of distortions that may appear in digital media. However, these overlook affine transformations, which may better represent the changes that actually occur in images under natural conditions. Humans can be particularly invariant to these natural transformations, as opposed to the digital ones. In this work, we evaluate state-of-the-art deep image quality metrics by assessing their invariance to affine transformations, specifically: rotation, translation, scaling, and changes in spectral illumination. We propose a methodology to assign invisibility thresholds for any perceptual metric. This methodology involves transforming the distance measured by an arbitrary metric to a common distance representation based on available subjectively rated databases. We psychophysically measure an absolute detection threshold in that common representation and express it in the physical units of each affine transform for each metric. By doing so, we allow the analyzed metrics to be directly comparable with actual human thresholds. We find that none of the state-of-the-art metrics shows human-like results under this strong test based on invisibility thresholds. This means that tuning the models exclusively to predict the visibility of generic distortions may disregard other properties of human vision, such as invariances or invisibility thresholds.
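The threshold-assignment idea in the abstract can be sketched as a small search: once a metric's detection threshold has been calibrated into the metric's own scale (the paper does this via subjectively rated databases), one scans the magnitude of an affine transform and reports the smallest magnitude at which the metric's distance exceeds that threshold, i.e. the metric's "invisibility threshold" in physical units. The sketch below is a minimal illustration, not the paper's implementation: RMSE stands in for a deep metric, integer translation stands in for the affine transform, and the threshold value `tau` is an arbitrary placeholder rather than a calibrated one.

```python
import numpy as np

def rmse(a, b):
    """Stand-in perceptual metric: root-mean-square error.
    (A real study would use a deep metric such as LPIPS or DISTS.)"""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def translate(img, shift):
    """Affine transform under test: horizontal translation in pixels."""
    return np.roll(img, shift, axis=1)

def invisibility_threshold(img, metric, transform, tau, max_mag=64):
    """Smallest transform magnitude whose metric distance exceeds tau,
    expressed in the transform's physical units (here: pixels).
    A linear scan is used for clarity."""
    for mag in range(1, max_mag + 1):
        if metric(img, transform(img, mag)) > tau:
            return mag
    return None  # the transform stays 'invisible' up to max_mag

rng = np.random.default_rng(0)
img = rng.random((32, 32))
# tau would come from calibrating the metric against a subjectively
# rated database; 0.1 here is purely illustrative.
print(invisibility_threshold(img, rmse, translate, tau=0.1))
```

A human-like metric would yield thresholds comparable to psychophysical detection thresholds in these same physical units; the paper's finding is that current deep metrics do not.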
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Describing Images $\textit{Fast and Slow}$: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes [4.518404103861656]
We study the nature of variation in visuo-linguistic signals, and find that they correlate with each other.
Given this result, we hypothesize that variation stems partly from the properties of the images, and explore whether image representations encoded by pretrained vision encoders can capture such variation.
Our results indicate that pretrained models do so to a weak-to-moderate degree, suggesting that the models lack biases about what makes a stimulus complex for humans and what leads to variations in human outputs.
arXiv Detail & Related papers (2024-02-02T12:11:16Z)
- Perceptual Scales Predicted by Fisher Information Metrics [0.6906005491572401]
Perception is often viewed as a process that transforms physical variables, external to an observer, into internal psychological variables.
The perceptual scale can be deduced from psychophysical measurements that consist in comparing the relative differences between stimuli.
Here, we demonstrate the value of measuring the perceptual scale of classical (spatial frequency, orientation) and less classical physical variables.
arXiv Detail & Related papers (2023-10-18T07:31:47Z)
- Subjective Face Transform using Human First Impressions [5.026535087391025]
This work uses generative models to find semantically meaningful edits to a face image that change perceived attributes.
We train on real and synthetic faces, evaluate for in-domain and out-of-domain images using predictive models and human ratings.
arXiv Detail & Related papers (2023-09-27T03:21:07Z)
- Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? [86.58989831070426]
We study the faithfulness of hand-crafted metrics to human perception of privacy information from reconstructed images.
We propose a learning-based measure called SemSim to evaluate the Semantic Similarity between the original and reconstructed images.
arXiv Detail & Related papers (2023-09-22T17:58:04Z)
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data [43.247597420676044]
Current perceptual similarity metrics operate at the level of pixels and patches.
These metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities and differences in image layout, object pose, and semantic content.
We develop a perceptual metric that assesses images holistically.
arXiv Detail & Related papers (2023-06-15T17:59:50Z)
- Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer [51.926868759681014]
We present a framework to predict the facial parameters based on a bone-driven face model (BDFM) under different views.
The proposed framework consists of a feature extractor, a generator, and a facial parameter regressor.
arXiv Detail & Related papers (2020-08-20T09:49:13Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
- NiLBS: Neural Inverse Linear Blend Skinning [59.22647012489496]
We introduce a method that uses a neural network parameterized by pose to invert the deformations produced by traditional skinning techniques.
The ability to invert these deformations allows values (e.g., distance function, signed distance function, occupancy) to be pre-computed at rest pose, and then efficiently queried when the character is deformed.
arXiv Detail & Related papers (2020-04-06T20:46:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.