On Learning and Enforcing Latent Assessment Models using Binary Feedback from Human Auditors Regarding Black-Box Classifiers
- URL: http://arxiv.org/abs/2202.08250v1
- Date: Wed, 16 Feb 2022 18:54:32 GMT
- Title: On Learning and Enforcing Latent Assessment Models using Binary Feedback from Human Auditors Regarding Black-Box Classifiers
- Authors: Mukund Telukunta, Venkata Sriram Siddhardh Nadendla
- Abstract summary: We propose a novel model called latent assessment model (LAM) to characterize binary feedback provided by human auditors.
We prove that individual and group fairness notions are guaranteed as long as the auditor's intrinsic judgments inherently satisfy the fairness notion.
We also demonstrate this relationship between LAM and traditional fairness notions on three well-known datasets.
- Score: 1.116812194101501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithmic fairness literature presents numerous mathematical notions and
metrics, and also points to tradeoffs that arise when trying to satisfy some or
all of them simultaneously. Furthermore, the contextual nature of fairness
notions makes it difficult to automate bias evaluation in diverse algorithmic
systems. Therefore, in this paper, we propose a novel model called latent
assessment model (LAM) to characterize binary feedback provided by human
auditors, by assuming that the auditor compares the classifier's output to his
or her own intrinsic judgment for each input. We prove that individual and
group fairness notions are guaranteed as long as the auditor's intrinsic
judgments inherently satisfy the fairness notion at hand, and are relatively
similar to the classifier's evaluations. We also demonstrate this relationship
between LAM and traditional fairness notions on three well-known datasets,
namely the COMPAS, German Credit and Adult Census Income datasets. Furthermore, we
derive the minimum number of feedback samples needed to obtain PAC
learning guarantees to estimate LAM for black-box classifiers. These guarantees
are also validated via training standard machine learning algorithms on real
binary feedback elicited from 400 human auditors regarding COMPAS.
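To make the LAM idea above concrete, the following minimal sketch (not the paper's implementation) treats an auditor's binary feedback as agreement between the black-box classifier's output and the auditor's hidden intrinsic judgment, and then fits a standard supervised learner to that feedback. The synthetic data, the agreement rule, and the choice of logistic regression are all illustrative assumptions.

```python
# Illustrative sketch only: synthetic inputs, a synthetic black-box
# classifier f(x), and a hidden auditor judgment g(x) stand in for the
# paper's setting; binary feedback is modeled as agreement f(x) == g(x).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))

# Black-box classifier outputs and the auditor's (unobserved) intrinsic judgments.
f_x = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
g_x = (X[:, 0] + 0.4 * X[:, 1] + 0.1 * X[:, 2] > 0).astype(int)

# LAM-style binary feedback: 1 when the classifier's output matches the
# auditor's intrinsic judgment for that input, 0 otherwise.
feedback = (f_x == g_x).astype(int)

# Estimate the latent assessment from observed (x, f(x), feedback) samples
# with a standard learner, mirroring the abstract's validation setup.
Z = np.column_stack([X, f_x])
assessment_model = LogisticRegression(max_iter=1000).fit(Z, feedback)
print("training accuracy of the estimated assessment model:",
      round(assessment_model.score(Z, feedback), 3))
```

In the paper itself the feedback comes from human auditors (e.g., the 400 COMPAS auditors mentioned above) rather than a synthetic judgment function, and the derived sample-complexity bounds determine how much feedback such an estimate requires.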
Related papers
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction (2024-04-04)
  Inspired by the principles of subjective question correction, we propose a new evaluation method, SQC-Score.
  Results on three information extraction tasks show that SQC-Score is preferred by human annotators over the baseline metrics.
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision (2023-10-18)
  We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
  We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
  Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications such as zero-shot classification, image retrieval and image captioning.
- FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets (2023-07-20)
  We introduce FLASK, a fine-grained evaluation protocol for both human-based and model-based evaluation.
  We experimentally observe that fine-grained evaluation is crucial for attaining a holistic view of model performance.
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision (2023-03-15)
  This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
  Our model jointly optimizes two fairness criteria: group fairness and counterfactual fairness.
- Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness (2023-03-01)
  We run a study with machine learning practitioners to understand the strategies they use to evaluate models.
  We identify fairness assessment strategies that draw on personal experience or on forming groups of identity tokens to test model fairness.
- Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators (2022-10-13)
  Real-world data for classification is often labeled by multiple annotators.
  We introduce CROWDLAB, a straightforward approach to estimate consensus labels and their quality from such data.
  Our proposed method provides estimates superior to those of many alternative algorithms.
- Measuring Fairness of Text Classifiers via Prediction Sensitivity (2022-03-16)
  ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features (see the illustrative sketch after this list).
  We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
- Non-Comparative Fairness for Human-Auditing and Its Relation to Traditional Fairness Notions (2021-06-29)
  This paper proposes a new fairness notion based on the principle of non-comparative justice.
  We show that any machine learning system (MLS) can be deemed fair from the perspective of comparative fairness if it is non-comparatively fair with respect to a fair auditor.
  We also show that the converse holds true in the context of individual fairness.
- Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing (2021-06-11)
  We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers.
  Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
  While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
- Fairness by Explicability and Adversarial SHAP Learning (2020-03-11)
  We propose a new definition of fairness that emphasises the role of an external auditor and model explicability.
  We develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model.
  We demonstrate our approaches using gradient and adaptive boosting on a synthetic dataset, the UCI Adult (Census) dataset and a real-world credit scoring dataset.
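As referenced in the prediction-sensitivity entry above, here is a minimal sketch of a perturbation-based sensitivity metric. It is an illustration under assumed names and data: the `accumulated_sensitivity` function, the synthetic dataset, and the fixed perturbation size `eps` are not from the cited paper, which defines ACCUMULATED PREDICTION SENSITIVITY more precisely.

```python
# Illustrative sketch only: average absolute change in the predicted
# probability under small per-feature perturbations, as a rough proxy
# for a prediction-sensitivity style fairness metric.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 3] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X, y)

def accumulated_sensitivity(model, X, eps=1e-2):
    """Mean absolute change in P(y=1 | x) when each feature is nudged by eps."""
    base = model.predict_proba(X)[:, 1]
    total = 0.0
    for j in range(X.shape[1]):
        X_pert = X.copy()
        X_pert[:, j] += eps
        total += np.abs(model.predict_proba(X_pert)[:, 1] - base).mean()
    return total / X.shape[1]

print("accumulated prediction sensitivity (illustrative):",
      round(accumulated_sensitivity(clf, X), 4))
```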
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.