From Global to Granular: Revealing IQA Model Performance via Correlation Surface
- URL: http://arxiv.org/abs/2601.21738v1
- Date: Thu, 29 Jan 2026 13:55:26 GMT
- Title: From Global to Granular: Revealing IQA Model Performance via Correlation Surface
- Authors: Baoliang Chen, Danni Huang, Hanwei Zhu, Lingyu Zhu, Wei Zhou, Shiqi Wang, Yuming Fang, Weisi Lin
- Abstract summary: We present Granularity-Modulated Correlation (GMC), which provides a structured, fine-grained analysis of IQA performance. GMC includes a Distribution Regulator that regularizes correlations to mitigate biases from non-uniform quality distributions. Experiments on standard benchmarks show that GMC reveals performance characteristics invisible to scalar metrics, offering a more informative and reliable paradigm for analyzing, comparing, and deploying IQA models.
- Score: 83.65597122328133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evaluation of Image Quality Assessment (IQA) models has long been dominated by global correlation metrics, such as Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank-Order Correlation Coefficient (SRCC). While widely adopted, these metrics reduce performance to a single scalar, failing to capture how ranking consistency varies across the local quality spectrum. For example, two IQA models may achieve identical SRCC values, yet one ranks high-quality images (related to high Mean Opinion Score, MOS) more reliably, while the other better discriminates image pairs with small quality/MOS differences (related to $|\Delta\text{MOS}|$). Such complementary behaviors are invisible under global metrics. Moreover, SRCC and PLCC are sensitive to test-sample quality distributions, yielding unstable comparisons across test sets. To address these limitations, we propose Granularity-Modulated Correlation (GMC), which provides a structured, fine-grained analysis of IQA performance. GMC includes: (1) a Granularity Modulator that applies Gaussian-weighted correlations conditioned on absolute MOS values and pairwise MOS differences ($|\Delta\text{MOS}|$) to examine local performance variations, and (2) a Distribution Regulator that regularizes correlations to mitigate biases from non-uniform quality distributions. The resulting correlation surface maps correlation values as a joint function of MOS and $|\Delta\text{MOS}|$, providing a 3D representation of IQA performance. Experiments on standard benchmarks show that GMC reveals performance characteristics invisible to scalar metrics, offering a more informative and reliable paradigm for analyzing, comparing, and deploying IQA models. Codes are available at https://github.com/Dniaaa/GMC.
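The idea of a correlation that varies with MOS and $|\Delta\text{MOS}|$ can be illustrated with a small sketch. This is not the authors' GMC implementation (their code is in the repository linked in the abstract). It is a simplified, Kendall-style stand-in that assumes a pair's position on the quality axis is the mean of its two MOS values, uses hypothetical function names and bandwidths (`sigma_mos`, `sigma_dmos`), and omits the Distribution Regulator entirely.

```python
import numpy as np


def local_pairwise_agreement(pred, mos, mos_center, dmos_center,
                             sigma_mos=0.5, sigma_dmos=0.5):
    """Gaussian-weighted pairwise rank agreement near (mos_center, dmos_center).

    Returns a value in [-1, 1]: +1 if every weighted pair is ranked in the
    same order by the model and by MOS, -1 if every pair is reversed.
    Illustrative only; the bandwidths are assumed, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)

    # Enumerate all unordered image pairs (i < j).
    i, j = np.triu_indices(len(mos), k=1)
    d_mos = mos[i] - mos[j]
    d_pred = pred[i] - pred[j]

    # Assumption: a pair is located at its mean MOS and its absolute MOS gap.
    pair_level = 0.5 * (mos[i] + mos[j])
    pair_gap = np.abs(d_mos)

    # Gaussian weights emphasise pairs close to the requested grid point.
    w = (np.exp(-0.5 * ((pair_level - mos_center) / sigma_mos) ** 2)
         * np.exp(-0.5 * ((pair_gap - dmos_center) / sigma_dmos) ** 2))
    w = w * (pair_gap > 0)  # pairs with tied MOS carry no ranking information

    total = w.sum()
    if total == 0:
        return float("nan")

    # +1 for concordant pairs, -1 for discordant, 0 for tied predictions.
    agreement = np.sign(d_mos) * np.sign(d_pred)
    return float(np.sum(w * agreement) / total)


def correlation_surface(pred, mos, mos_grid, dmos_grid, **kwargs):
    """Evaluate the local agreement on a (MOS, |dMOS|) grid."""
    return np.array([[local_pairwise_agreement(pred, mos, m, d, **kwargs)
                      for d in dmos_grid]
                     for m in mos_grid])
```

Plotting the returned array over `mos_grid` and `dmos_grid` gives a rough surface in the spirit of the one described above: two models with the same global SRCC can produce visibly different surfaces if one of them is weaker on closely spaced pairs or at one end of the quality range.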
Related papers
- ReLE: A Scalable System and Structured Benchmark for Diagnosing Capability Anisotropy in Chinese LLMs [37.23311145049677]
We present ReLE, a scalable system designed to diagnose Capability Anisotropy. We evaluate 304 models across a Domain $\times$ Capability Symbolic matrix comprising 207,843 samples.
arXiv Detail & Related papers (2026-01-24T09:57:59Z)
- MS-ISSM: Objective Quality Assessment of Point Clouds Using Multi-scale Implicit Structural Similarity [65.85858856481131]
The unstructured and irregular nature of point clouds poses a significant challenge for objective quality assessment (PCQA). We propose the Multi-scale Implicit Structural Similarity Measurement (MS-ISSM).
arXiv Detail & Related papers (2026-01-03T14:58:52Z)
- Plug In, Grade Right: Psychology-Inspired AGIQA [60.23968344837525]
Existing AGIQA models estimate image quality by measuring and aggregating similarities between image embeddings and text embeddings. We propose an improved Graded Response Model (GRM) for AGIQA. Our Arithmetic GRM based Quality Grading (AGQG) module enjoys a plug-and-play advantage.
arXiv Detail & Related papers (2025-12-28T04:51:05Z)
- Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization [53.82400605816587]
Action Quality Assessment (AQA) quantifies human actions in videos, supporting applications in sports scoring, rehabilitation, and skill evaluation. A major challenge lies in the non-stationary nature of quality distributions in real-world scenarios. We introduce Continual AQA (CAQA), which equips AQA with Continual Learning capabilities to handle evolving distributions.
arXiv Detail & Related papers (2025-10-08T10:09:47Z)
- Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment [3.6178660238507843]
Evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available. We introduce the Cumulative Consensus Score (CCS), a label-free metric that enables continuous monitoring and comparison of detectors in real-world settings.
arXiv Detail & Related papers (2025-09-16T09:24:37Z)
- KG-EDAS: A Meta-Metric Framework for Evaluating Knowledge Graph Completion Models [0.0]
A major challenge in evaluating Knowledge Graphs (KGs) is comparing their performance across multiple datasets and metrics. We propose KG Evaluation based on Distance from Average Solution (EDAS) to integrate multi-metric, multi-dataset performance into a unified ranking. EDAS offers a global perspective that supports more informed model selection and promotes fairness in cross-dataset evaluation.
arXiv Detail & Related papers (2025-08-21T08:37:35Z)
- Evaluating Knowledge Graph Complexity via Semantic, Spectral, and Structural Metrics for Link Prediction [0.0]
We introduce and benchmark a set of structural and semantic KG complexity metrics. We find that CSG is highly sensitive to parametrisation and does not robustly scale with the number of classes. Our results demonstrate that CSG's purported stability and generalization predictive power fail to hold in link prediction settings.
arXiv Detail & Related papers (2025-08-21T06:27:20Z)
- GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment [40.33163764161929]
We construct a novel loss function and network to exploit Global-correlation and Mean-opinion Consistency.
We propose a novel GCC loss by defining a pairwise preference-based rank estimation to solve the non-differentiable problem of SROCC.
We also propose a mean-opinion network, which integrates diverse opinion features to alleviate the randomness of weight learning.
arXiv Detail & Related papers (2024-01-19T06:03:01Z)
- Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models [80.23791222509644]
Inconsistent AI models are considered brittle and untrustworthy by human users.
We find that state-of-the-art vision-language models suffer from a surprisingly high degree of inconsistent behavior across tasks.
We propose a rank correlation-based auxiliary training objective, computed over large automatically created cross-task contrast sets.
arXiv Detail & Related papers (2023-03-28T16:57:12Z)
- Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
- Feature Quantization Improves GAN Training [126.02828112121874]
Feature Quantization (FQ) for the discriminator embeds both true and fake data samples into a shared discrete space.
Our method can be easily plugged into existing GAN models, with little computational overhead in training.
arXiv Detail & Related papers (2020-04-05T04:06:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.