Auditability and the Landscape of Distance to Multicalibration
- URL: http://arxiv.org/abs/2509.16930v1
- Date: Sun, 21 Sep 2025 05:26:48 GMT
- Authors: Nathan Derhake, Siddartha Devic, Dutch Hansen, Kuan Liu, Vatsal Sharan,
- Abstract summary: We consider the auditability of the distance to multicalibration of a predictor $f$. Our findings may have implications for the development of stronger multicalibration algorithms.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Calibration is a critical property for establishing the trustworthiness of predictors that provide uncertainty estimates. Multicalibration is a strengthening of calibration which requires that predictors be calibrated on a potentially overlapping collection of subsets of the domain. As multicalibration grows in popularity with practitioners, an essential question is: how do we measure how multicalibrated a predictor is? Błasiok et al. (2023) considered this question for standard calibration by introducing the distance to calibration framework (dCE) to understand how calibration metrics relate to each other and the ground truth. Building on the dCE framework, we consider the auditability of the distance to multicalibration of a predictor $f$. We begin by considering two natural generalizations of dCE to multiple subgroups: worst group dCE (wdMC), and distance to multicalibration (dMC). We argue that there are two essential properties of any multicalibration error metric: 1) the metric should capture how much $f$ would need to be modified in order to be perfectly multicalibrated; and 2) the metric should be auditable in an information theoretic sense. We show that wdMC and dMC each fail to satisfy one of these two properties, and that similar barriers arise when considering the auditability of general distance to multigroup fairness notions. We then propose two (equivalent) multicalibration metrics which do satisfy these requirements: 1) a continuized variant of dMC; and 2) a distance to intersection multicalibration, which leans on intersectional fairness desiderata. Along the way, we shed light on the loss-landscape of distance to multicalibration and the geometry of the set of perfectly multicalibrated predictors. Our findings may have implications for the development of stronger multicalibration algorithms as well as multigroup auditing more generally.
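To make the worst-group idea in the abstract concrete, the sketch below audits calibration separately on each subgroup and reports the worst case. This is a hypothetical illustration, not the paper's algorithm: it uses binned expected calibration error (ECE) per group as a rough plug-in proxy for the distance-to-calibration quantities (dCE, wdMC) the paper actually studies, and the function names and fixed-width binning are assumptions for illustration only.

```python
import numpy as np

def binned_ece(preds, labels, n_bins=10):
    """Binned expected calibration error: a standard (non-robust)
    proxy for the distance-to-calibration notions in the paper."""
    bins = np.clip((preds * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Gap between mean prediction and empirical label frequency in bin b,
            # weighted by the fraction of points falling in the bin.
            gap = abs(preds[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return ece

def worst_group_ece(preds, labels, groups, n_bins=10):
    """Worst-case calibration error over a collection of (possibly
    overlapping) subgroups, in the spirit of wdMC; each element of
    `groups` is a boolean mask over the data."""
    return max(binned_ece(preds[g], labels[g], n_bins) for g in groups)
```

Note that part of the paper's point is that such plug-in estimates over worst groups do not directly yield an auditable distance to multicalibration; this sketch only illustrates the shape of a subgroup audit, not a resolution of those barriers.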
Related papers
- Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning [53.9713678229744]
Multi-instance partial-label learning (MIPL) is a weakly supervised framework that addresses the challenges of inexact supervision in both instance and label spaces. Existing MIPL approaches often suffer from poor calibration, undermining reliability. We propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance.
arXiv Detail & Related papers (2025-12-19T16:58:31Z)
- Scalable Utility-Aware Multiclass Calibration [53.28176049547449]
Utility calibration is a general framework that measures the calibration error relative to a specific utility function. We demonstrate how this framework can unify and re-interpret several existing calibration metrics.
arXiv Detail & Related papers (2025-10-29T12:32:14Z)
- Adaptive Set-Mass Calibration with Conformal Prediction [60.47079469141295]
We develop a new calibration procedure that starts with conformal prediction to obtain a set of labels that gives the desired coverage. We then instantiate two simple post-hoc calibrators: a mass normalization and a temperature scaling-based rule, tuned to the conformal constraint.
arXiv Detail & Related papers (2025-05-21T12:18:15Z)
- When is Multicalibration Post-Processing Necessary? [12.628103786954487]
Multicalibration is a property of predictors which guarantees meaningful uncertainty estimates.
We conduct the first comprehensive study evaluating the usefulness of multicalibration post-processing.
We distill many independent observations which may be useful for practical and effective applications of multicalibration post-processing.
arXiv Detail & Related papers (2024-06-10T17:26:39Z)
- A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning [63.20009081099896]
We provide a unifying framework for the design and analysis of multicalibrated predictors.
We exploit connections to game dynamics to achieve state-of-the-art guarantees for a diverse set of multicalibration learning problems.
arXiv Detail & Related papers (2023-02-21T18:24:17Z)
- Fair admission risk prediction with proportional multicalibration [0.16249424686052708]
Multicalibration constrains calibration error among flexibly-defined subpopulations.
It is possible for a decision-maker to learn to trust or distrust model predictions for specific groups.
We propose proportional multicalibration, a criterion that constrains the percent calibration error among groups and within prediction bins.
arXiv Detail & Related papers (2022-09-29T08:15:29Z)
- Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z)
- Low-Degree Multicalibration [16.99099840073075]
Low-Degree Multicalibration defines a hierarchy of increasingly-powerful multi-group fairness notions.
We show that low-degree multicalibration can be significantly more efficient than full multicalibration.
Our work presents compelling evidence that low-degree multicalibration represents a sweet spot, pairing computational and sample efficiency with strong fairness and accuracy guarantees.
arXiv Detail & Related papers (2022-03-02T17:24:55Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that reduces the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Sample Complexity of Uniform Convergence for Multicalibration [43.10452387619829]
We address the multicalibration error and decouple it from the prediction error.
Our work gives sample complexity bounds for uniform convergence guarantees of multicalibration error.
arXiv Detail & Related papers (2020-05-04T18:01:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented above and is not responsible for any consequences of its use.