OxEnsemble: Fair Ensembles for Low-Data Classification
- URL: http://arxiv.org/abs/2512.09665v1
- Date: Wed, 10 Dec 2025 14:08:44 GMT
- Title: OxEnsemble: Fair Ensembles for Low-Data Classification
- Authors: Jonathan Rystrøm, Zihao Fu, Chris Russell,
- Abstract summary: We propose a novel approach for efficiently training ensembles and enforcing fairness in low-data regimes.<n>By construction, emphOxEnsemble is both data-efficient, carefully reusing held-out data to enforce fairness reliably, and compute-efficient.
- Score: 9.81842989622959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of fair classification in settings where data is scarce and unbalanced across demographic groups. Such low-data regimes are common in domains like medical imaging, where false negatives can have fatal consequences. We propose a novel approach \emph{OxEnsemble} for efficiently training ensembles and enforcing fairness in these low-data regimes. Unlike other approaches, we aggregate predictions across ensemble members, each trained to satisfy fairness constraints. By construction, \emph{OxEnsemble} is both data-efficient, carefully reusing held-out data to enforce fairness reliably, and compute-efficient, requiring little more compute than used to fine-tune or evaluate an existing model. We validate this approach with new theoretical guarantees. Experimentally, our approach yields more consistent outcomes and stronger fairness-accuracy trade-offs than existing methods across multiple challenging medical imaging classification datasets.
Related papers
- Demographic-Agnostic Fairness without Harm [15.171262544838337]
Machine learning algorithms are increasingly used in social domains to make predictions about humans.<n>There is a growing concern that these algorithms may exhibit biases against certain social groups.<n>We propose a novel textitdemographic-agnostic fairness without harm optimization algorithm.
arXiv Detail & Related papers (2025-09-28T21:23:32Z) - Fairness for the People, by the People: Minority Collective Action [50.29077265863936]
Machine learning models often preserve biases present in training data, leading to unfair treatment of certain minority groups.<n>We propose a coordinated minority group strategically relabels its own data to enhance fairness, without altering the firm's training process.<n>Our findings show that a subgroup of the minority can substantially reduce unfairness with a small impact on the overall prediction error.
arXiv Detail & Related papers (2025-08-21T09:09:39Z) - Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise.<n>We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z) - Achievable Fairness on Your Data With Utility Guarantees [16.78730663293352]
In machine learning fairness, training models that minimize disparity across different sensitive groups often leads to diminished accuracy.
We present a computationally efficient approach to approximate the fairness-accuracy trade-off curve tailored to individual datasets.
We introduce a novel methodology for quantifying uncertainty in our estimates, thereby providing practitioners with a robust framework for auditing model fairness.
arXiv Detail & Related papers (2024-02-27T00:59:32Z) - Fairness Without Harm: An Influence-Guided Active Sampling Approach [32.173195437797766]
We aim to train models that mitigate group fairness disparity without causing harm to model accuracy.
The current data acquisition methods, such as fair active learning approaches, typically require annotating sensitive attributes.
We propose a tractable active data sampling algorithm that does not rely on training group annotations.
arXiv Detail & Related papers (2024-02-20T07:57:38Z) - Fair Active Learning in Low-Data Regimes [22.349886628823125]
In machine learning applications, ensuring fairness is essential to avoid perpetuating social inequities.
In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments.
We introduce an innovative active learning framework that combines an exploration procedure inspired by posterior sampling with a fair classification subroutine.
We demonstrate that this framework performs effectively in very data-scarce regimes, maximizing accuracy while satisfying fairness constraints with high probability.
arXiv Detail & Related papers (2023-12-13T23:14:55Z) - Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z) - Chasing Fairness Under Distribution Shift: A Model Weight Perturbation
Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR)
arXiv Detail & Related papers (2023-03-06T17:19:23Z) - On Comparing Fair Classifiers under Data Bias [42.43344286660331]
We study the effect of varying data biases on the accuracy and fairness of fair classifiers.
Our experiments show how to integrate a measure of data bias risk in the existing fairness dashboards for real-world deployments.
arXiv Detail & Related papers (2023-02-12T13:04:46Z) - Learning from Similarity-Confidence Data [94.94650350944377]
We investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data.
We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate.
arXiv Detail & Related papers (2021-02-13T07:31:16Z) - Fair Densities via Boosting the Sufficient Statistics of Exponential
Families [72.34223801798422]
We introduce a boosting algorithm to pre-process data for fairness.
Our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee.
Empirical results are present to display the quality of result on real-world data.
arXiv Detail & Related papers (2020-12-01T00:49:17Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - Fast Fair Regression via Efficient Approximations of Mutual Information [0.0]
This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models.
It uses such approximations as regularisers to enforce fairness within a regularised risk minimisation framework.
Experiments in real-world datasets indicate that in spite of its superior computational efficiency our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.
arXiv Detail & Related papers (2020-02-14T08:50:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.