Bias in Evaluation Processes: An Optimization-Based Model
- URL: http://arxiv.org/abs/2310.17489v1
- Date: Thu, 26 Oct 2023 15:45:01 GMT
- Title: Bias in Evaluation Processes: An Optimization-Based Model
- Authors: L. Elisa Celis and Amit Kumar and Anay Mehrotra and Nisheeth K.
Vishnoi
- Abstract summary: We model an evaluation process as a transformation of a distribution of the true utility of an individual for a task to an observed distribution.
We characterize the distributions that arise from our model and study the effect of the parameters on the observed distribution.
We empirically validate our model by fitting real-world datasets and use it to study the effect of interventions in a downstream selection task.
- Score: 31.790546767744917
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biases with respect to socially-salient attributes of individuals have been
well documented in evaluation processes used in settings such as admissions and
hiring. We view such an evaluation process as a transformation of a
distribution of the true utility of an individual for a task to an observed
distribution and model it as a solution to a loss minimization problem subject
to an information constraint. Our model has two parameters that have been
identified as factors leading to biases: the resource-information trade-off
parameter in the information constraint and the risk-averseness parameter in
the loss function. We characterize the distributions that arise from our model
and study the effect of the parameters on the observed distribution. The
outputs of our model enrich the class of distributions that can be used to
capture variation across groups in the observed evaluations. We empirically
validate our model by fitting real-world datasets and use it to study the
effect of interventions in a downstream selection task. These results
contribute to an understanding of the emergence of bias in evaluation processes
and provide tools to guide the deployment of interventions to mitigate biases.
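The "loss minimization problem subject to an information constraint" admits, in the KL-regularized case, a standard closed form: minimizing E_q[loss(x)] + lam * KL(q || p) over distributions q gives the exponential tilt q(x) ∝ p(x) exp(-loss(x)/lam). The Python sketch below plays this mechanism out on a discrete grid. It is a minimal illustration under assumed specifics, not the paper's exact model: the asymmetric quadratic loss, the grid, and the parameter names lam (a stand-in for the resource-information trade-off) and alpha (a stand-in for risk-averseness) are choices made here for demonstration.

```python
import numpy as np

# Minimal sketch, not the paper's exact formulation: an evaluation
# process mapping a true-utility distribution p(x) to an observed
# distribution q(x) by solving
#     min_q  E_q[loss(x)] + lam * KL(q || p),
# whose closed-form solution is the exponential tilt
#     q(x) ∝ p(x) * exp(-loss(x) / lam).
# lam (resource-information trade-off stand-in) and alpha
# (risk-averseness stand-in) are illustrative parameter names.

def observed_distribution(p, loss, lam):
    """Exponentially tilt p by the loss; larger lam means a weaker tilt."""
    logits = np.log(p) - loss / lam
    logits -= logits.max()  # for numerical stability before exponentiating
    q = np.exp(logits)
    return q / q.sum()

# True-utility distribution: a standard Gaussian discretized on a grid.
x = np.linspace(-4.0, 4.0, 401)
p = np.exp(-0.5 * x**2)
p /= p.sum()

# An asymmetric loss that penalizes scores above a target v more heavily
# (alpha > 1), a crude stand-in for risk-averse evaluation.
v, alpha = 0.0, 2.0
loss = np.where(x > v, alpha * (x - v) ** 2, (x - v) ** 2)

q = observed_distribution(p, loss, lam=0.5)
print(f"mean true utility:   {x @ p:+.3f}")
print(f"mean observed score: {x @ q:+.3f}")  # pulled below the true mean
```

Raising alpha or lowering lam for one group but not another leaves the true-utility distribution unchanged while skewing the observed scores, the kind of group-level variation in observed evaluations that the model is designed to capture.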
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Variational Inference of Parameters in Opinion Dynamics Models [9.51311391391997]
This work uses variational inference to estimate the parameters of an opinion dynamics ABM.
We transform the inference process into an optimization problem suitable for automatic differentiation.
Our approach estimates both macroscopic parameters (bounded confidence intervals and backfire thresholds) and microscopic ones (200 categorical, agent-level roles) more accurately than simulation-based and MCMC methods.
arXiv Detail & Related papers (2024-03-08T14:45:18Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD learns a more robust base model in both settings: task-specific biased models with prior knowledge and a self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- An Information-theoretic Approach to Distribution Shifts [9.475039534437332]
Safely deploying machine learning models to the real world is often a challenging process.
Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere.
Neural networks that are fit to a subset of the population might carry some selection bias into their decision process.
arXiv Detail & Related papers (2021-06-07T16:44:21Z)
- Model Selection's Disparate Impact in Real-World Deep Learning Applications [3.924854655504237]
Algorithmic fairness has emphasized the role of biased data in automated decision outcomes.
We contend that one source of such bias, human preferences in model selection, remains under-explored in terms of its role in disparate impact across demographic groups.
arXiv Detail & Related papers (2021-04-01T16:37:01Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
arXiv Detail & Related papers (2020-05-01T11:22:55Z)
- Identification Methods With Arbitrary Interventional Distributions as Inputs [8.185725740857595]
Causal inference quantifies cause-effect relationships by estimating counterfactual parameters from data.
We use Single World Intervention Graphs and a nested factorization of models associated with mixed graphs to give a very simple view of existing identification theory for experimental data.
arXiv Detail & Related papers (2020-04-02T17:27:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.