Estimating Structural Disparities for Face Models
- URL: http://arxiv.org/abs/2204.06562v1
- Date: Wed, 13 Apr 2022 05:30:53 GMT
- Title: Estimating Structural Disparities for Face Models
- Authors: Shervin Ardeshir, Cristina Segalin, Nathan Kallus
- Abstract summary: In machine learning, disparity metrics are often defined by measuring the difference in the performance or outcome of a model across different sub-populations.
We explore performing such analysis on computer vision models trained on human faces, and on tasks such as face attribute prediction and affect estimation.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In machine learning, disparity metrics are often defined by measuring the difference in the performance or outcome of a model across different sub-populations (groups) of data points. Thus, the inputs to disparity quantification consist of a model's predictions $\hat{y}$, the ground-truth labels for the predictions $y$, and group labels $g$ for the data points. Performance of the model for each group is calculated by comparing $\hat{y}$ and $y$ for the data points within a specific group, and as a result, the disparity of performance across the different groups can be calculated. In many real-world scenarios, however, group labels ($g$) may not be available at scale during training and validation, or collecting them may not be feasible or desirable, as they are often sensitive information. As a result, evaluating disparity metrics across categorical groups is not feasible. On the other hand, in many scenarios noisy groupings may be obtainable using some form of proxy, which would allow measuring disparity metrics across sub-populations. Here we explore performing such analysis on computer vision models trained on human faces, on tasks such as face attribute prediction and affect estimation. Our experiments indicate that embeddings from an off-the-shelf face recognition model can meaningfully serve as a proxy for such estimation.
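To make the quantities above concrete, here is a minimal sketch, not the authors' implementation, of how a performance disparity can be computed from predictions $\hat{y}$, labels $y$, and group labels $g$, and how noisy proxy groups might be obtained by clustering off-the-shelf face-recognition embeddings when $g$ is unavailable. Accuracy as the per-group metric, the max-min gap as the disparity measure, and k-means as the clustering step are illustrative assumptions.

```python
# Illustrative sketch: group-wise performance disparity, with proxy groups.
# Assumptions (not from the paper): accuracy as the per-group metric, k-means
# over off-the-shelf face-recognition embeddings as the noisy proxy grouping.
import numpy as np
from sklearn.cluster import KMeans


def groupwise_accuracy(y_true, y_pred, groups):
    """Accuracy of the model computed separately for each group label."""
    return {
        g: float(np.mean(y_true[groups == g] == y_pred[groups == g]))
        for g in np.unique(groups)
    }


def disparity(y_true, y_pred, groups):
    """One common choice of disparity: gap between best- and worst-performing group."""
    accs = groupwise_accuracy(y_true, y_pred, groups)
    return max(accs.values()) - min(accs.values())


def proxy_groups_from_embeddings(face_embeddings, n_groups=4, seed=0):
    """Noisy proxy group labels obtained by clustering face-recognition embeddings."""
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed)
    return km.fit_predict(face_embeddings)


# Usage with toy data (512-d embeddings are a typical face-recognition output size).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)                    # ground-truth attribute labels
y_hat = np.where(rng.random(1000) < 0.9, y, 1 - y)   # a roughly 90%-accurate model
emb = rng.normal(size=(1000, 512))                   # stand-in for face embeddings

g_proxy = proxy_groups_from_embeddings(emb)
print("proxy-group disparity:", disparity(y, y_hat, g_proxy))
```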
Related papers
- Accurately Classifying Out-Of-Distribution Data in Facial Recognition (arXiv 2024-04-05)
Real-life scenarios typically feature unseen data which is different from data in the training distribution.
This issue is most prevalent in social justice problems where data from under-represented groups may appear in the test data without representing an equal proportion of the training data.
We are interested in the following question: can a neural network's performance on out-of-distribution facial images improve when it is trained simultaneously on multiple datasets of in-distribution data?
- A structured regression approach for evaluating model performance across intersectional subgroups (arXiv 2024-01-26)
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z) - De-biasing "bias" measurement [20.049916973204102]
We show that metrics used to measure group-wise model performance disparities are themselves statistically biased estimators of the underlying quantities they purport to represent.
We propose the "double-corrected" variance estimator, which provides unbiased estimates and uncertainty quantification of the variance of model performance across groups.
arXiv Detail & Related papers (2022-05-11T20:51:57Z) - Equivariance Allows Handling Multiple Nuisance Variables When Analyzing
Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z) - Mandoline: Model Evaluation under Distribution Shift [8.007644303175395]
Machine learning models are often deployed in different settings than they were trained and validated on.
We develop Mandoline, a new evaluation framework that mitigates these issues.
Users write simple "slicing functions": noisy, potentially correlated binary functions intended to capture possible axes of distribution shift (a hypothetical illustration of such functions appears after this list).
arXiv Detail & Related papers (2021-07-01T17:57:57Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z) - Discriminative, Generative and Self-Supervised Approaches for
Target-Agnostic Learning [8.666667951130892]
Generative and self-supervised learning models are shown to perform well at the task.
Our derived theorem for the pseudo-likelihood theory also shows that they are related for inferring a joint distribution model.
arXiv Detail & Related papers (2020-11-12T15:03:40Z) - Interpretable Assessment of Fairness During Model Evaluation [1.2183405753834562]
We introduce a novel hierarchical clustering algorithm to detect heterogeneity among users in given sets of sub-populations.
We demonstrate the performance of the algorithm on real data from LinkedIn.
arXiv Detail & Related papers (2020-10-26T02:31:17Z) - LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region.
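As a concrete illustration of the "slicing functions" mentioned in the Mandoline entry above, the following is a hypothetical sketch; the function names, metadata fields, and thresholds are invented for illustration and are not part of the Mandoline API.

```python
# Hypothetical Mandoline-style slicing functions: noisy binary indicators over data points.
# The fields "mean_brightness" and "estimated_age" are invented for illustration.
def slice_low_light(example):
    """Flag images that appear under-exposed, a possible axis of distribution shift."""
    return int(example["mean_brightness"] < 0.2)


def slice_older_subject(example):
    """Flag images whose (noisily) estimated subject age exceeds 60."""
    return int(example["estimated_age"] > 60)


# Each data point is mapped to a binary slice-membership vector.
example = {"mean_brightness": 0.15, "estimated_age": 72}
memberships = [fn(example) for fn in (slice_low_light, slice_older_subject)]
print(memberships)  # -> [1, 1]
```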