Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency
- URL: http://arxiv.org/abs/2602.07285v1
- Date: Sat, 07 Feb 2026 00:26:40 GMT
- Title: Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency
- Authors: Etam Benger, Katrina Ligett
- Abstract summary: Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. We present an exact solution for optimal binary classification under sufficiency, assuming finite sets of group-calibrated scores.
- Score: 2.0686600920324163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. While thresholding scores is Bayes-optimal in the unconstrained setting, using a single threshold generally violates statistical group fairness constraints. Under independence (statistical parity) and separation (equalized odds), such thresholding suffices when the scores already satisfy the corresponding criterion. However, this does not extend to sufficiency: even perfectly group-calibrated scores -- including true class probabilities -- violate predictive parity after thresholding. In this work, we present an exact solution for optimal binary (randomized) classification under sufficiency, assuming finite sets of group-calibrated scores. We provide a geometric characterization of the feasible pairs of positive predictive value (PPV) and false omission rate (FOR) achievable by such classifiers, and use it to derive a simple post-processing algorithm that attains the optimal classifier using only group-calibrated scores and group membership. Finally, since sufficiency and separation are generally incompatible, we identify the classifier that minimizes deviation from separation subject to sufficiency, and show that it can also be obtained by our algorithm, often achieving performance comparable to the optimum.
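The abstract's central observation, that even perfectly group-calibrated scores violate predictive parity after a single threshold, can be checked numerically. Below is a minimal sketch with hypothetical score distributions (the values are illustrative, not from the paper): for calibrated scores, each score value s equals P(Y=1 | score = s), so the PPV and FOR of a thresholding rule follow directly from the score distribution.

```python
# Toy illustration: two groups with perfectly group-calibrated scores.
# A common threshold at 0.5 yields different PPV/FOR per group, so
# predictive parity (sufficiency) fails after thresholding.

def ppv_for(scores, threshold):
    """PPV and FOR of thresholding calibrated scores.

    `scores` maps each score value s to its probability mass; because
    the scores are calibrated, s is also P(Y=1 | score = s).
    """
    pos = {s: p for s, p in scores.items() if s >= threshold}
    neg = {s: p for s, p in scores.items() if s < threshold}
    ppv = sum(s * p for s, p in pos.items()) / sum(pos.values())
    for_ = sum(s * p for s, p in neg.items()) / sum(neg.values())
    return ppv, for_

group_a = {0.3: 0.5, 0.7: 0.5}   # hypothetical calibrated scores, group A
group_b = {0.4: 0.5, 0.6: 0.5}   # hypothetical calibrated scores, group B

print(ppv_for(group_a, 0.5))  # (0.7, 0.3)
print(ppv_for(group_b, 0.5))  # (0.6, 0.4)
```

A single threshold of 0.5 gives group A a PPV of 0.7 but group B a PPV of 0.6, so sufficiency fails even though both score sets are perfectly calibrated, which is exactly the gap the paper's post-processing algorithm addresses.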
Related papers
- Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number $K$ of distinct groups using noisy and actively collected responses. We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering. We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z) - Principled Algorithms for Optimizing Generalized Metrics in Binary Classification [53.604375124674796]
We introduce principled algorithms for optimizing generalized metrics, supported by $H$-consistency and finite-sample generalization bounds. Our approach reformulates metric optimization as a generalized cost-sensitive learning problem. We develop new algorithms, METRO, with strong theoretical performance guarantees.
arXiv Detail & Related papers (2025-12-29T01:33:42Z) - Accuracy vs. Accuracy: Computational Tradeoffs Between Classification Rates and Utility [6.99674326582747]
We revisit the foundations of fairness and its interplay with utility and efficiency in settings where the training data contain richer labels. We propose algorithms that achieve stronger notions of evidence-based fairness than are possible in standard supervised learning.
arXiv Detail & Related papers (2025-05-22T10:26:30Z) - A Unified Post-Processing Framework for Group Fairness in Classification [10.615965454674901]
We present a post-processing algorithm for fair classification that covers group fairness criteria including statistical parity, equal opportunity, and equalized odds under a single framework. Our algorithm, called "LinearPost", achieves fairness post hoc by linearly transforming the predictions of the (unfair) base predictor with a "fairness risk" according to a weighted combination of the (predicted) group memberships.
arXiv Detail & Related papers (2024-05-07T05:58:44Z) - Mitigating Word Bias in Zero-shot Prompt-based Classifiers [55.60306377044225]
We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
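As a minimal sketch of how matching class priors can be realized in practice, the one-step reweighting below rescales per-example class probabilities so their induced priors move toward a target; the function and its form are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def match_priors(probs, target_priors):
    """Reweight per-example class probabilities so the induced class
    priors move toward `target_priors`.

    One-step illustrative reweighting: divide out the empirical priors,
    multiply in the targets, and renormalize each row.
    """
    probs = np.asarray(probs, dtype=float)
    current = probs.mean(axis=0)            # empirical class priors
    weights = target_priors / current       # per-class correction factors
    adjusted = probs * weights
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Hypothetical zero-shot outputs biased toward class 0:
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]
adjusted = match_priors(probs, np.array([0.5, 0.5]))
```

After adjustment each row is still a valid distribution, and the empirical prior of the over-predicted class shrinks toward the target.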
arXiv Detail & Related papers (2023-09-10T10:57:41Z) - Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking.
xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics.
We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z) - Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
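To give the metric's flavor, one simple instantiation averages top-k accuracy over k = 1..K; this is an assumed simplification for illustration, and the paper's formal AUTKC definition may weight the terms differently.

```python
import numpy as np

def autkc(scores, labels, K):
    """Toy partial area under the top-k curve: mean top-k accuracy
    for k = 1..K (illustrative simplification, not the paper's
    formal definition)."""
    ranks = []
    for s, y in zip(scores, labels):
        order = np.argsort(-np.asarray(s))              # classes by descending score
        ranks.append(int(np.where(order == y)[0][0]) + 1)  # 1-based rank of true class
    ranks = np.asarray(ranks)
    return float(np.mean([(ranks <= k).mean() for k in range(1, K + 1)]))

# Two examples over three classes; true classes ranked 1st and 2nd:
value = autkc([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]], [0, 2], K=2)
```

Here top-1 accuracy is 0.5 and top-2 accuracy is 1.0, so the averaged value is 0.75.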
arXiv Detail & Related papers (2022-09-03T11:09:13Z) - When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
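The alternating idea can be sketched as a Sinkhorn-style iteration over the matrix of predicted class distributions: columns are rescaled toward the class priors, then rows are renormalized to remain distributions. This is a hypothetical simplification for illustration, not CAN's exact algorithm.

```python
import numpy as np

def alternating_normalization(P, priors, n_iter=10):
    """Sinkhorn-style alternating normalization over a matrix of
    per-example class probabilities (illustrative sketch of the
    alternating idea, not the CAN paper's exact procedure)."""
    P = np.asarray(P, dtype=float).copy()
    n = P.shape[0]
    for _ in range(n_iter):
        col = P.sum(axis=0)
        P = P * (priors * n / col)             # push column sums toward priors
        P = P / P.sum(axis=1, keepdims=True)   # rows back to valid distributions
    return P

# Hypothetical predictions biased toward class 0, with uniform priors:
P0 = np.array([[0.8, 0.2], [0.7, 0.3], [0.9, 0.1]])
Q = alternating_normalization(P0, np.array([0.5, 0.5]))
```

Each row of the result still sums to 1, while the column sums move toward the stated priors, re-adjusting over-confident predictions.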
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Classification with abstention but without disparities [5.025654873456756]
We build a general purpose classification algorithm, which is able to abstain from prediction, while avoiding disparate impact.
We establish finite sample risk, fairness, and abstention guarantees for the proposed algorithm.
Our method empirically shows that moderate abstention rates make it possible to bypass the risk-fairness trade-off.
arXiv Detail & Related papers (2021-02-24T12:43:55Z) - Fairness with Overlapping Groups [15.154984899546333]
A standard goal is to ensure the equality of fairness metrics across multiple overlapping groups simultaneously.
We reconsider this standard fair classification problem using a probabilistic population analysis.
Our approach unifies a variety of existing group-fair classification methods and enables extensions to a wide range of non-decomposable multiclass performance metrics and fairness measures.
arXiv Detail & Related papers (2020-06-24T05:01:10Z) - To Split or Not to Split: The Impact of Disparate Treatment in Classification [8.325775867295814]
Disparate treatment occurs when a machine learning model yields different decisions for individuals based on a sensitive attribute.
We introduce the benefit-of-splitting for quantifying the performance improvement by splitting classifiers.
We prove an equivalent expression for the benefit-of-splitting which can be efficiently computed by solving small-scale convex programs.
arXiv Detail & Related papers (2020-02-12T04:05:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.