Revisiting Reweighted Risk for Calibration: AURC, Focal Loss, and Inverse Focal Loss
- URL: http://arxiv.org/abs/2505.23463v2
- Date: Tue, 10 Jun 2025 15:31:48 GMT
- Title: Revisiting Reweighted Risk for Calibration: AURC, Focal Loss, and Inverse Focal Loss
- Authors: Han Zhou, Sebastian G. Gruber, Teodora Popordanoska, Matthew B. Blaschko
- Abstract summary: In this paper, we revisit a broad class of weighted risk functions commonly used in deep learning. We establish a principled connection between these reweighting schemes and calibration errors. We show that optimizing a regularized variant of the AURC naturally leads to improved calibration.
- Score: 17.970895970580823
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Several variants of reweighted risk functionals, such as focal loss, inverse focal loss, and the Area Under the Risk-Coverage Curve (AURC), have been proposed in the literature, and claims have been made about their calibration properties. However, focal loss and inverse focal loss propose vastly different weighting schemes. In this paper, we revisit a broad class of weighted risk functions commonly used in deep learning and establish a principled connection between these reweighting schemes and calibration errors. We show that minimizing calibration error is closely linked to the selective classification paradigm and demonstrate that optimizing a regularized variant of the AURC naturally leads to improved calibration. This regularized AURC shares a similar reweighting strategy with inverse focal loss, lending support to the idea that focal loss is less principled when calibration is a desired outcome. Direct AURC optimization offers greater flexibility through the choice of confidence score functions (CSFs). To enable gradient-based optimization, we introduce a differentiable formulation of the regularized AURC using the SoftRank technique. Empirical evaluations demonstrate that our AURC-based loss achieves competitive class-wise calibration performance across a range of datasets and model architectures.
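As a concrete illustration of the reweighting schemes contrasted in the abstract, the sketch below shows (i) focal loss, which scales the per-sample cross-entropy by (1 - p_t)^gamma, (ii) inverse focal loss, here taken to scale by (1 + p_t)^gamma so that confident samples are up-weighted, and (iii) a differentiable AURC-style surrogate in which per-sample losses are reweighted by a soft rank of a confidence score. This is a minimal PyTorch sketch, not the authors' implementation: the pairwise-sigmoid soft rank stands in for the SoftRank technique, the maximum softmax probability is used as the CSF, the empirical AURC weight of a sample ranked r out of n (sum_{k=r}^{n} 1/k) is approximated by log((n+1)/r), and the regularization term of the regularized AURC is omitted.

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, gamma=2.0):
    """Focal loss: scales cross-entropy by (1 - p_t)^gamma, down-weighting
    samples the model already classifies confidently."""
    log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-((1.0 - pt) ** gamma) * log_pt).mean()


def inverse_focal_loss(logits, targets, gamma=2.0):
    """Inverse focal loss: mirrors the focal weight with (1 + p_t)^gamma,
    up-weighting confident samples instead."""
    log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-((1.0 + pt) ** gamma) * log_pt).mean()


def soft_rank(scores, tau=0.1):
    """Pairwise-sigmoid approximation of descending ranks (rank 1 = most
    confident); a simple stand-in for the SoftRank technique named above."""
    diff = scores.unsqueeze(1) - scores.unsqueeze(0)  # diff[i, j] = s_i - s_j
    # rank_j ~= 0.5 + sum_i sigmoid((s_i - s_j) / tau): counts samples scored above j
    return 0.5 + torch.sigmoid(diff / tau).sum(dim=0)


def soft_aurc_loss(logits, targets, tau=0.1):
    """Differentiable AURC-style surrogate: per-sample cross-entropy losses are
    reweighted by the soft rank of a confidence score (here, max softmax
    probability). A sample ranked r out of n appears at every coverage level
    k >= r of the risk-coverage curve, so its empirical weight sum_{k=r}^{n} 1/k
    is approximated by log((n + 1) / r)."""
    n = logits.shape[0]
    losses = F.cross_entropy(logits, targets, reduction="none")
    conf = logits.softmax(dim=-1).max(dim=-1).values  # confidence score function (CSF)
    ranks = soft_rank(conf, tau)                      # soft descending ranks in [1, n]
    # Confident samples (small rank) get larger weight, the same direction of
    # reweighting as inverse focal loss, per the abstract's observation.
    weights = torch.log((n + 1.0) / ranks)
    return (weights * losses).mean()
```

In a training loop these are used like any other criterion, e.g. `loss = soft_aurc_loss(model(x), y)` followed by `loss.backward()`; note that the ranking, and hence the weighting, is computed within each mini-batch.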
Related papers
- SGIC: A Self-Guided Iterative Calibration Framework for RAG [45.17496149653415]
Large language models (LLMs) capitalize on their robust in-context reasoning. We present a new framework that employs uncertainty scores as a tool. We also introduce an innovative approach for constructing an iterative self-calibration training set.
arXiv Detail & Related papers (2025-06-19T09:45:13Z)
- Know What You Don't Know: Uncertainty Calibration of Process Reward Models [8.958124143194512]
Even state-of-the-art PRMs can be poorly calibrated and often overestimate success probabilities. We present a calibration approach, performed via quantile regression, that adjusts PRM outputs to better align with true success probabilities.
arXiv Detail & Related papers (2025-06-11T02:39:26Z)
- CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment [7.702016079410588]
We introduce CLUE (Calibration via Learning Uncertainty-Error Alignment), a novel approach that aligns predicted uncertainty with observed error during training. We show that CLUE achieves superior calibration quality and competitive predictive performance with respect to state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-28T19:23:47Z)
- Calibration Strategies for Robust Causal Estimation: Theoretical and Empirical Insights on Propensity Score-Based Estimators [0.6562256987706128]
The partitioning of data for estimation and calibration critically impacts the performance of propensity score-based estimators. We extend recent advances in calibration techniques for propensity score estimation, improving the robustness of propensity scores in challenging settings.
arXiv Detail & Related papers (2025-03-21T16:41:10Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences. To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model. Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Prior Constraints-based Reward Model Training for Aligning Large Language Models [58.33118716810208]
This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.
PCRM incorporates prior constraints, specifically, length ratio and cosine similarity between outputs of each comparison pair, during reward model training to regulate optimization magnitude and control score margins.
Experimental results demonstrate that PCRM significantly improves alignment performance by effectively constraining reward score scaling.
arXiv Detail & Related papers (2024-04-01T07:49:11Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions (a minimal sketch of such a trainable calibration penalty appears after this list).
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
arXiv Detail & Related papers (2023-07-06T08:14:54Z)
- Learning convex regularizers satisfying the variational source condition for inverse problems [4.2917182054051]
We propose adversarial convex regularization (ACR) to learn data-driven convex regularizers via adversarial training.
We leverage the variational source condition (SC) during training to enforce that the ground-truth images minimize the variational loss corresponding to the learned convex regularizer.
The resulting regularizer (ACR-SC) performs on par with the ACR, but unlike ACR, comes with a quantitative convergence rate estimate.
arXiv Detail & Related papers (2021-10-24T20:05:59Z)
- Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence [66.83161885378192]
Areas under the ROC curve (AUROC) and the precision-recall curve (AUPRC) are common metrics for evaluating classification performance on imbalanced problems.
We propose a stochastic method to optimize AUPRC for deep learning, with provable convergence.
arXiv Detail & Related papers (2021-04-18T06:22:21Z)
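Sketch referenced from the trainable kernel calibration metrics entry above: a differentiable, kernel-based calibration penalty added to the empirical risk. This is a hedged illustration in the style of MMCE (Kumar et al., 2018), not the exact metric proposed in that paper; the bandwidth `sigma` and weight `lam` are illustrative choices.

```python
import torch
import torch.nn.functional as F


def mmce_penalty(logits, targets, sigma=0.4):
    """Kernel-based, differentiable calibration penalty in the spirit of
    trainable calibration metrics. This follows the MMCE idea (Kumar et al.,
    2018) rather than the metric of the paper above: the penalty is small when
    confidence and correctness agree on average across the batch."""
    probs = logits.softmax(dim=-1)
    conf, preds = probs.max(dim=-1)            # confidence and predicted class
    correct = (preds == targets).float()       # 1 if the prediction is right
    gap = conf - correct                       # signed calibration residual
    # Laplacian kernel on the confidence scores (bandwidth sigma is illustrative)
    k = torch.exp(-(conf.unsqueeze(1) - conf.unsqueeze(0)).abs() / sigma)
    return (gap.unsqueeze(1) * gap.unsqueeze(0) * k).mean()


def calibrated_erm_loss(logits, targets, lam=1.0):
    """Empirical risk (cross-entropy) plus the differentiable calibration objective."""
    return F.cross_entropy(logits, targets) + lam * mmce_penalty(logits, targets)
```

Because both terms are differentiable in the logits, the combined objective can be minimized with standard stochastic gradient methods.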