On the Within-Group Fairness of Screening Classifiers
- URL: http://arxiv.org/abs/2302.00025v2
- Date: Mon, 7 Aug 2023 12:06:43 GMT
- Title: On the Within-Group Fairness of Screening Classifiers
- Authors: Nastaran Okati, Stratis Tsirtsis and Manuel Gomez Rodriguez
- Abstract summary: We argue that screening policies that use calibrated classifiers may suffer from an understudied type of within-group unfairness.
We show that within-group monotonicity can be achieved at a small cost in terms of prediction granularity and shortlist size.
- Score: 16.404065044314976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Screening classifiers are increasingly used to identify qualified candidates
in a variety of selection processes. In this context, it has been recently
shown that, if a classifier is calibrated, one can identify the smallest set of
candidates which contains, in expectation, a desired number of qualified
candidates using a threshold decision rule. This lends support to focusing on
calibration as the only requirement for screening classifiers. In this paper,
we argue that screening policies that use calibrated classifiers may suffer
from an understudied type of within-group unfairness -- they may unfairly treat
qualified members within demographic groups of interest. Further, we argue that
this type of unfairness can be avoided if classifiers satisfy within-group
monotonicity, a natural monotonicity property within each of the groups. Then,
we introduce an efficient post-processing algorithm based on dynamic
programming to minimally modify a given calibrated classifier so that its
probability estimates satisfy within-group monotonicity. We validate our
algorithm using US Census survey data and show that within-group monotonicity
can often be achieved at a small cost in terms of prediction granularity and
shortlist size.
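The threshold decision rule described in the abstract admits a compact sketch: with calibrated probability estimates, greedily admit the highest-scored candidates until their scores sum to the desired expected number of qualified candidates. The function below is an illustrative sketch of that idea, not the authors' implementation; the function name and example scores are hypothetical.

```python
import numpy as np

def calibrated_shortlist(scores, k):
    """Return indices of the smallest shortlist whose calibrated scores sum
    to at least k, i.e. that contains k qualified candidates in expectation.
    Assumes `scores` are calibrated probabilities of being qualified."""
    order = np.argsort(scores)[::-1]      # highest scores first
    total, shortlist = 0.0, []
    for i in order:
        shortlist.append(int(i))
        total += scores[i]
        if total >= k:                    # expected qualified count reached
            break
    return shortlist

# hypothetical calibrated scores for six candidates
scores = np.array([0.9, 0.1, 0.8, 0.4, 0.6, 0.2])
print(calibrated_shortlist(scores, k=2.0))  # picks 0, 2, 4 (0.9+0.8+0.6 = 2.3)
```

The within-group unfairness the paper studies arises when, inside a demographic group, a candidate with a lower true qualification probability can outrank a more qualified member under such a rule; within-group monotonicity rules that out.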
Related papers
- Mitigating Word Bias in Zero-shot Prompt-based Classifiers [55.60306377044225]
We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
arXiv Detail & Related papers (2023-09-10T10:57:41Z)
- Class-Conditional Conformal Prediction with Many Classes [60.8189977620604]
We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores.
We find that clustered conformal typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
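As a rough illustration of the idea (not the paper's exact algorithm; all names and data below are hypothetical), clustered conformal prediction pools the calibration scores of classes assigned to the same cluster and computes one conformal quantile per cluster instead of one per class:

```python
import numpy as np

def cluster_quantiles(cal_scores, cal_labels, cluster_of, alpha=0.5):
    """One conformal quantile per cluster of classes.
    cal_scores[i]: nonconformity score of calibration example i at its true class.
    cluster_of[y]: cluster id assigned to class y."""
    qhat = {}
    for c in sorted(set(cluster_of.values())):
        s = np.array([cal_scores[i] for i, y in enumerate(cal_labels)
                      if cluster_of[y] == c])
        n = len(s)
        # finite-sample-corrected (1 - alpha) quantile, clipped at 1
        level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        qhat[c] = np.quantile(s, level, method="higher")
    return qhat

def prediction_set(scores_per_class, cluster_of, qhat):
    """Include every class whose score is within its cluster's threshold."""
    return [y for y, s in scores_per_class.items() if s <= qhat[cluster_of[y]]]
```

Sharing one threshold per cluster trades a little class-conditional granularity for far more calibration data per threshold, which is what improves coverage when there are many classes.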
arXiv Detail & Related papers (2023-06-15T17:59:02Z)
- Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Fairness and Unfairness in Binary and Multiclass Classification: Quantifying, Calculating, and Bounding [22.449347663780767]
We propose a new interpretable measure of unfairness that enables a quantitative analysis of classifier fairness.
We show how this measure can be calculated when the classifier's conditional confusion matrices are known.
We report experiments on data sets representing diverse applications.
arXiv Detail & Related papers (2022-06-07T12:26:28Z)
- Improving Screening Processes via Calibrated Subset Selection [35.952153033163576]
We develop a distribution-free screening algorithm called Calibrated Subset Selection (CSS).
CSS finds near-optimal shortlists of candidates that contain a desired number of qualified candidates in expectation.
Experiments on US Census survey data validate our theoretical results and show that the shortlists provided by our algorithm are superior to those provided by several competitive baselines.
arXiv Detail & Related papers (2022-02-02T17:15:44Z)
- When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
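As a rough, simplified sketch of alternating normalization (a Sinkhorn-style iteration, not necessarily the authors' exact procedure; all names here are hypothetical): stack the predicted distributions into a matrix, then alternately rescale columns toward the class prior and renormalize rows back into distributions.

```python
import numpy as np

def alternating_normalization(probs, prior, n_iter=5):
    """Sinkhorn-style re-adjustment. probs is (n, k) with rows summing to 1;
    prior is a length-k class-prior vector summing to 1."""
    p = np.asarray(probs, dtype=float).copy()
    n = p.shape[0]
    for _ in range(n_iter):
        p *= (n * prior) / p.sum(axis=0)   # pull column totals toward n * prior
        p /= p.sum(axis=1, keepdims=True)  # rows back to valid distributions
    return p

adjusted = alternating_normalization(np.array([[0.6, 0.4], [0.7, 0.3]]),
                                     prior=np.array([0.5, 0.5]))
```

Each row remains a probability distribution after every iteration, while the column totals drift toward the prior, which is how confident neighbors can re-adjust an uncertain example's predicted distribution.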
arXiv Detail & Related papers (2021-09-28T02:55:42Z)
- Fairness with Overlapping Groups [15.154984899546333]
A standard goal is to ensure the equality of fairness metrics across multiple overlapping groups simultaneously.
We reconsider this standard fair classification problem using a probabilistic population analysis.
Our approach unifies a variety of existing group-fair classification methods and enables extensions to a wide range of non-decomposable multiclass performance metrics and fairness measures.
arXiv Detail & Related papers (2020-06-24T05:01:10Z)
- Quantifying the Uncertainty of Precision Estimates for Rule based Text Classifiers [0.0]
Rule-based classifiers that use the presence and absence of key sub-strings to make classification decisions have a natural mechanism for quantifying the uncertainty of their precision.
For a binary classifier, the key insight is to treat partitions of the sub-string set induced by the documents as Bernoulli random variables.
The utility of this approach is demonstrated with a benchmark problem.
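The Bernoulli treatment sketched above can be illustrated as follows (an illustrative sketch with hypothetical counts, not the paper's benchmark): each rule-induced partition of matched documents gets a Beta posterior over its precision, and sampling from those posteriors propagates the uncertainty to the overall precision estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical partitions of matched documents: (matched docs, true positives)
partitions = [(40, 36), (25, 15), (10, 9)]

# Beta(1, 1) prior on each partition's precision, updated by its counts
draws = np.stack([rng.beta(1 + tp, 1 + n - tp, size=20000)
                  for n, tp in partitions])

# overall precision is the match-count-weighted mix of partition precisions
weights = np.array([n for n, _ in partitions], dtype=float)
weights /= weights.sum()
precision = np.average(draws, axis=0, weights=weights)

lo, hi = np.percentile(precision, [2.5, 97.5])
print(f"precision ~ {precision.mean():.2f}, 95% interval [{lo:.2f}, {hi:.2f}]")
```

Small partitions contribute wide posteriors, so the interval honestly reflects how little evidence some rules have seen.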
arXiv Detail & Related papers (2020-05-19T03:51:47Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
- Better Multi-class Probability Estimates for Small Data Sets [0.0]
We show that the Data Generation and Grouping algorithm can be used to solve multi-class problems.
Our experiments show that calibration error can be decreased using the proposed approach and the additional computational cost is acceptable.
arXiv Detail & Related papers (2020-01-30T10:21:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.