Related papers: Decomposing Global AUC into Cluster-Level Contributions for Localized Model Diagnostics

URL: http://arxiv.org/abs/2508.07495v1
Date: Sun, 10 Aug 2025 21:58:47 GMT
Title: Decomposing Global AUC into Cluster-Level Contributions for Localized Model Diagnostics
Authors: Agus Sudjianto, Alice J. Liu
Abstract summary: Area Under the ROC Curve (AUC) is a widely used performance metric for binary classifiers. In high-stakes applications such as credit approval and fraud detection, these weaknesses can lead to financial risk or operational failures. We introduce a formal decomposition of global AUC into intra- and inter-cluster components.
Score: 1.104960878651584
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Area Under the ROC Curve (AUC) is a widely used performance metric for binary classifiers. However, as a global ranking statistic, the AUC aggregates model behavior over the entire dataset, masking localized weaknesses in specific subpopulations. In high-stakes applications such as credit approval and fraud detection, these weaknesses can lead to financial risk or operational failures. In this paper, we introduce a formal decomposition of global AUC into intra- and inter-cluster components. This allows practitioners to evaluate classifier performance within and across clusters of data, enabling granular diagnostics and subgroup analysis. We also compare the AUC with additive performance metrics such as the Brier score and log loss, which support decomposability and direct attribution. Our framework enhances model development and validation practice by providing additional insights to detect model weakness for model risk management.
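
The intra- and inter-cluster split described in the abstract can be illustrated with a minimal sketch. It relies on a standard identity rather than on the paper's exact formulation, which is not reproduced on this page: the AUC is the fraction of correctly ranked (positive, negative) pairs, so partitioning those pairs by whether both members share a cluster yields two terms that sum to the global AUC. The function names `pairwise_auc` and `decompose_auc`, the synthetic data, and the random cluster labels are all assumptions made for illustration.

```python
import numpy as np


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def decompose_auc(y, scores, clusters):
    """Split the global AUC into intra- and inter-cluster pair contributions.

    Hypothetical helper: each (positive cluster a, negative cluster b) block
    is weighted by its share of all positive-negative pairs.
    """
    y = np.asarray(y)
    scores = np.asarray(scores, dtype=float)
    clusters = np.asarray(clusters)
    n_pos = int(np.sum(y == 1))
    n_neg = int(np.sum(y == 0))
    intra, inter = 0.0, 0.0
    labels = np.unique(clusters)
    for a in labels:
        pos_a = scores[(clusters == a) & (y == 1)]
        for b in labels:
            neg_b = scores[(clusters == b) & (y == 0)]
            if len(pos_a) == 0 or len(neg_b) == 0:
                continue  # no pairs in this block
            weight = len(pos_a) * len(neg_b) / (n_pos * n_neg)
            contribution = weight * pairwise_auc(pos_a, neg_b)
            if a == b:
                intra += contribution
            else:
                inter += contribution
    return intra, inter


# Synthetic example (assumed data): scores are noisy copies of the label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
clusters = rng.integers(0, 3, size=200)
scores = rng.normal(loc=y, scale=1.0)

intra, inter = decompose_auc(y, scores, clusters)
global_auc = pairwise_auc(scores[y == 1], scores[y == 0])
print(f"intra={intra:.4f} inter={inter:.4f} global={global_auc:.4f}")
```

In this sketch the two terms are pair-weighted contributions, not standalone AUC values; dividing a block's contribution by its weight recovers the AUC for that block alone. In practice the cluster labels would come from a clustering or segmentation step rather than from random assignment.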

Related papers

CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
SubROC: AUC-Based Discovery of Exceptional Subgroup Performance for Binary Classifiers [1.533848041901807]
SubROC is a framework based on Model Mining for reliably and efficiently finding strengths and weaknesses of classification models. It incorporates common evaluation measures (ROC and PR AUC), efficient search space pruning for fast exhaustive subgroup search, control for class imbalance, adjustment for redundant patterns, and significance testing.
arXiv Detail & Related papers (2025-05-16T14:18:40Z)
Generative Classifier for Domain Generalization [84.92088101715116]
Domain generalization aims to improve the generalizability of computer vision models toward distribution shifts. We propose Generative-driven Domain Generalization (GCDG). GCDG consists of three key modules: Heterogeneity Learning (HLC), Spurious Correlation (SCB), and Diverse Component Balancing (DCB).
arXiv Detail & Related papers (2025-04-03T04:38:33Z)
Combating Financial Crimes with Unsupervised Learning Techniques: Clustering and Dimensionality Reduction for Anti-Money Laundering [0.0]
Anti-Money Laundering (AML) is a crucial task in ensuring the integrity of financial systems. Unsupervised learning, particularly clustering, is a promising solution for this task. In this paper, we investigate the effectiveness of combining agglomerative hierarchical clustering with four dimensionality reduction techniques.
arXiv Detail & Related papers (2024-02-14T17:31:29Z)
Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset. Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets. We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models. We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench. GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z)
Clustering Validation with The Area Under Precision-Recall Curves [0.0]
A Clustering Validation Index (CVI) allows for clustering validation in real application scenarios. We show that these are not only appropriate as CVIs, but should also be preferred in the presence of cluster imbalance. We perform a comprehensive evaluation of the proposed and state-of-the-art CVIs on real and simulated data sets.
arXiv Detail & Related papers (2023-04-04T01:49:57Z)
On Certifying and Improving Generalization to Unseen Domains [87.00662852876177]
Domain Generalization aims to learn models whose performance remains high on unseen domains encountered at test-time. It is challenging to evaluate DG algorithms comprehensively using a few benchmark datasets. We propose a universal certification framework that can efficiently certify the worst-case performance of any DG method.
arXiv Detail & Related papers (2022-06-24T16:29:43Z)
Multi-class Classification Based Anomaly Detection of Insider Activities [18.739091829480234]
We propose an approach that combines a generative model with supervised learning to perform multi-class classification using deep learning. The generative adversarial network (GAN) based insider detection model introduces a Conditional Generative Adversarial Network (CGAN) to enrich minority class samples. The comprehensive experiments performed on the benchmark dataset demonstrate the effectiveness of introducing GAN-derived synthetic data.
arXiv Detail & Related papers (2021-02-15T00:08:39Z)
Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) aims to learn classification models that make predictions for unlabeled data on a target domain. We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one. Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
The Area Under the ROC Curve as a Measure of Clustering Quality [0.0]
Area Under the Curve for Clustering (AUCC) is an internal/relative measure of clustering quality. AUCC is a linear transformation of the Gamma criterion from Baker and Hubert (1975).
arXiv Detail & Related papers (2020-09-04T21:34:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.