PAC-Bayes Analysis for Recalibration in Classification
- URL: http://arxiv.org/abs/2406.06227v2
- Date: Fri, 11 Jul 2025 01:22:08 GMT
- Title: PAC-Bayes Analysis for Recalibration in Classification
- Authors: Masahiro Fujisawa, Futoshi Futami
- Abstract summary: We conduct a generalization analysis of calibration error using the PAC-Bayes framework. On the basis of our theory, we propose a generalization-aware recalibration algorithm.
- Score: 4.005483185111992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary classification, creating a significant gap with practical applications such as multiclass classification. Additionally, many parametric recalibration algorithms lack theoretical guarantees for their generalization performance. To address these issues, we conduct a generalization analysis of calibration error using the probably approximately correct Bayes framework. This approach enables us to derive the first optimizable upper bound for generalization error in the calibration context. On the basis of our theory, we propose a generalization-aware recalibration algorithm. Numerical experiments show that our algorithm enhances the performance of Gaussian process-based recalibration across various benchmark datasets and models.
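As a point of reference for the uniform-width binning mentioned in the abstract, the following is a minimal sketch of a binned calibration-error estimate. It assumes the common top-label convention for multiclass problems (the confidence is the largest predicted probability, and a prediction counts as correct when the arg-max class matches the label); the function name `ece_uniform_width` and its parameters are illustrative and are not taken from the paper, whose exact estimator may differ.

```python
import numpy as np

def ece_uniform_width(confidences, correct, n_bins=10):
    """Binned calibration-error estimate with uniform-width bins on [0, 1].

    confidences: top-label predicted probabilities, one per sample.
    correct:     1 if the predicted label was right, else 0.
    n_bins:      number of equal-width bins (an illustrative default).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each sample to a bin; a confidence of exactly 1.0 goes to the last bin.
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between empirical accuracy and mean confidence inside the bin.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        # Weight the gap by the fraction of samples that fall in the bin.
        ece += mask.mean() * gap
    return ece

# Usage: two confident, correct predictions and two less confident ones, half correct.
print(ece_uniform_width([0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0]))
```

The estimate depends on `n_bins`, which is the source of the binning bias that the abstract refers to.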
Related papers
- Implicit Regularisation in Diffusion Models: An Algorithm-Dependent Generalisation Analysis [44.468416523840965]
We develop a theory of algorithm-dependent generalisation for high-dimensional diffusion models. We derive generalisation bounds in terms of score stability, and apply our framework to several fundamental learning settings.
arXiv Detail & Related papers (2025-07-04T18:07:06Z)
- h-calibration: Rethinking Classifier Recalibration with Probabilistic Error-Bounded Objective [12.903217487071172]
Deep neural networks have demonstrated remarkable performance across numerous learning tasks but often suffer from miscalibration. This has inspired many recent works on mitigating miscalibration, particularly through post-hoc recalibration methods. We propose a probabilistic learning framework for calibration called h-calibration, which theoretically constructs an equivalent learning formulation for canonical calibration with boundedness. Our method not only overcomes the ten identified limitations but also achieves markedly better performance than traditional methods.
arXiv Detail & Related papers (2025-06-22T09:56:44Z)
- Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We present a novel variational formulation of the calibration-refinement decomposition. We provide theoretical and empirical evidence that calibration and refinement errors are not minimized simultaneously during training.
arXiv Detail & Related papers (2025-01-31T15:03:54Z)
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
- Optimizing Estimators of Squared Calibration Errors in Classification [2.3020018305241337]
We propose a mean-squared error-based risk that enables the comparison and optimization of estimators of squared calibration errors. Our approach advocates for a training-validation-testing pipeline when estimating a calibration error.
arXiv Detail & Related papers (2024-10-09T15:58:06Z)
- Estimating Generalization Performance Along the Trajectory of Proximal SGD in Robust Regression [4.150180443030652]
We introduce estimators that precisely track the generalization error of the iterates along the trajectory of the iterative algorithm.
The results are illustrated through several examples, including Huber regression, pseudo-Huber regression, and their penalized variants with non-smooth regularizer.
arXiv Detail & Related papers (2024-10-03T16:13:42Z)
- A naive aggregation algorithm for improving generalization in a class of learning problems [0.0]
We present a naive aggregation algorithm for a typical learning problem in the expert-advice setting.
In particular, we consider a class of learning problems of point estimation for modeling high-dimensional nonlinear functions.
arXiv Detail & Related papers (2024-09-06T15:34:17Z)
- Orthogonal Causal Calibration [55.28164682911196]
We prove generic upper bounds on the calibration error of any causal parameter estimate $\theta$ with respect to any loss $\ell$.
We use our bound to analyze the convergence of two sample splitting algorithms for causal calibration.
arXiv Detail & Related papers (2024-06-04T03:35:25Z)
- Information-theoretic Generalization Analysis for Expected Calibration Error [4.005483185111992]
We present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning.
Our bounds reveal, for the first time, the optimal number of bins to minimize the estimation bias.
We extend our bias analysis to generalization error analysis based on the information-theoretic approach.
arXiv Detail & Related papers (2024-05-24T16:59:29Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z)
- Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z)
- Towards Data-Algorithm Dependent Generalization: a Case Study on Overparameterized Linear Regression [19.047997113063147]
We introduce a notion called data-algorithm compatibility, which considers the generalization behavior of the entire data-dependent training trajectory.
We perform a data-dependent trajectory analysis and derive a sufficient condition for compatibility in such a setting.
arXiv Detail & Related papers (2022-02-12T12:42:36Z)
- Heterogeneous Calibration: A post-hoc model-agnostic framework for improved generalization [8.815439276597818]
We introduce the notion of heterogeneous calibration that applies a post-hoc model-agnostic transformation to model outputs for improving AUC performance on binary classification tasks.
We refer to simple patterns as heterogeneous partitions of the feature space and show theoretically that perfectly calibrating each partition separately optimizes AUC.
While the theoretical optimality of this framework holds for any model, we focus on deep neural networks (DNNs) and test the simplest instantiation of this paradigm on a variety of open-source datasets.
arXiv Detail & Related papers (2022-02-10T05:08:50Z)
- Learning Prediction Intervals for Regression: Generalization and Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification.
We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes.
We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
arXiv Detail & Related papers (2021-02-26T17:55:30Z)
- Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.