Related papers: On the Normalization of Confusion Matrices: Methods and Geometric Interpretations

URL: http://arxiv.org/abs/2509.04959v1
Date: Fri, 05 Sep 2025 09:36:51 GMT
Title: On the Normalization of Confusion Matrices: Methods and Geometric Interpretations
Authors: Johan Erbani, Pierre-Edouard Portier, Elod Egyed-Zsigmond, Sonia Ben Mokhtar, Diana Nurbakova
Abstract summary: We introduce bistochastic normalization using Iterative Proportional Fitting. Unlike standard normalizations, this method recovers the underlying structure of class similarity. We show a correspondence between confusion matrix normalizations and the model's internal class representations.
Score: 2.4097006540200434
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The confusion matrix is a standard tool for evaluating classifiers by providing insights into class-level errors. In heterogeneous settings, its values are shaped by two main factors: class similarity -- how easily the model confuses two classes -- and distribution bias, arising from skewed distributions in the training and test sets. However, confusion matrix values reflect a mix of both factors, making it difficult to disentangle their individual contributions. To address this, we introduce bistochastic normalization using Iterative Proportional Fitting, a generalization of row and column normalization. Unlike standard normalizations, this method recovers the underlying structure of class similarity. By disentangling error sources, it enables more accurate diagnosis of model behavior and supports more targeted improvements. We also show a correspondence between confusion matrix normalizations and the model's internal class representations. Both standard and bistochastic normalizations can be interpreted geometrically in this space, offering a deeper understanding of what normalization reveals about a classifier.
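
As a rough illustration of the idea in the abstract, the sketch below applies the textbook form of Iterative Proportional Fitting (alternating row and column rescaling, also known as Sinkhorn scaling) to a small confusion matrix. It is not taken from the paper: the function name `ipf_bistochastic`, its parameters `max_iter` and `tol`, and the example counts are all hypothetical, and convergence is only assumed here for matrices with strictly positive entries.

```python
import numpy as np


def ipf_bistochastic(cm, max_iter=1000, tol=1e-9):
    """Alternately rescale rows and columns until both sum to about 1.

    Hypothetical helper: a generic IPF loop, not the paper's exact procedure.
    """
    m = np.asarray(cm, dtype=float).copy()
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("expected a square confusion matrix")
    if (m < 0).any() or (m.sum(axis=0) == 0).any() or (m.sum(axis=1) == 0).any():
        raise ValueError("entries must be non-negative with no empty row or column")

    for _ in range(max_iter):
        m /= m.sum(axis=1, keepdims=True)  # row normalization step
        m /= m.sum(axis=0, keepdims=True)  # column normalization step
        # Columns now sum to 1, so only the row sums need checking.
        if np.abs(m.sum(axis=1) - 1.0).max() < tol:
            break
    return m


# Made-up counts with a skewed class distribution (rows: true class, columns: predicted class).
cm = np.array([[90, 8, 2],
               [30, 60, 10],
               [1, 4, 5]])

row_norm = cm / cm.sum(axis=1, keepdims=True)  # standard row normalization
bisto = ipf_bistochastic(cm)                   # bistochastic normalization

print(np.round(row_norm, 3))
print(np.round(bisto, 3))
```

Stopping after the first row step gives ordinary row normalization, and applying only the column step gives column normalization, which is one way to read the abstract's description of the method as a generalization of both.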

Related papers

Generative Classifiers Avoid Shortcut Solutions [84.23247217037134]
Discriminative approaches to classification often learn shortcuts that hold in-distribution but fail under minor distribution shift. We show that generative classifiers can avoid this issue by modeling all features, both core and spurious, instead of mainly spurious ones. We find that diffusion-based and autoregressive generative classifiers achieve state-of-the-art performance on five standard image and text distribution shift benchmarks.
arXiv Detail & Related papers (2025-12-31T18:31:46Z)
Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification [75.01389991485098]
We show that ridge regression has qualitatively different behavior depending on the scale of the cluster mean vector. In regimes where the scale is very large, the conditions that allow for benign overfitting turn out to be the same as those for the regression task.
arXiv Detail & Related papers (2025-03-11T01:45:42Z)
Obtaining Explainable Classification Models using Distributionally Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules. An inherent trade-off exists between rule set sparsity and its prediction accuracy. We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
arXiv Detail & Related papers (2023-11-03T15:45:34Z)
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks [117.93273337740442]
We show that gradient descent converges to a uniform margin classifier on the training data with an $\exp(-\Omega(\log^2 t))$ convergence rate. We also show that batch normalization has an implicit bias towards a patch-wise uniform margin.
arXiv Detail & Related papers (2023-06-20T16:58:00Z)
Malign Overfitting: Interpolation Can Provably Preclude Invariance [30.776243638012314]
We show that "benign overfitting" in which models generalize well despite interpolating might not favorably extend to settings in which robustness or fairness are desirable. We propose and analyze an algorithm that successfully learns a non-interpolating classifier that is provably invariant.
arXiv Detail & Related papers (2022-11-28T19:17:31Z)
Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints. The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution. We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data. Invariance measures consistency of model predictions on transformations of the data. From a dataset-centric view, we find a certain model's accuracy and invariance linearly correlated on different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
A Normative Model of Classifier Fusion [4.111899441919164]
We present a hierarchical Bayesian model of probabilistic classification fusion based on a new correlated Dirichlet distribution. The proposed model naturally accommodates the classic Independent Opinion Pool and other independent fusion algorithms as special cases. It is evaluated by uncertainty reduction and correctness of fusion on synthetic and real-world data sets.
arXiv Detail & Related papers (2021-06-03T11:52:13Z)
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers. We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model. Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.