Threshold-Independent Fair Matching through Score Calibration
- URL: http://arxiv.org/abs/2405.20051v1
- Date: Thu, 30 May 2024 13:37:53 GMT
- Title: Threshold-Independent Fair Matching through Score Calibration
- Authors: Mohammad Hossein Moslemi, Mostafa Milani
- Abstract summary: We introduce a new approach in Entity Matching (EM) using recent metrics for evaluating biases in score-based binary classification.
This approach enables the application of various bias metrics like equalized odds, equal opportunity, and demographic parity without depending on threshold settings.
This paper contributes to the field of fairness in data cleaning, especially within EM, by promoting a method for generating matching scores that reduce biases across different thresholds.
- Score: 1.5530839016602822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entity Matching (EM) is a critical task in numerous fields, such as healthcare, finance, and public administration, as it identifies records that refer to the same entity within or across different databases. EM faces considerable challenges, particularly with false positives and negatives. These are typically addressed by generating matching scores and applying thresholds to balance false positives and negatives in various contexts. However, adjusting these thresholds can affect the fairness of the outcomes, a critical factor that remains largely overlooked in current fair EM research. The existing body of research on fair EM tends to concentrate on static thresholds, neglecting their critical impact on fairness. To address this, we introduce a new approach in EM using recent metrics for evaluating biases in score-based binary classification, particularly through the lens of distributional parity. This approach enables the application of various bias metrics like equalized odds, equal opportunity, and demographic parity without depending on threshold settings. Our experiments with leading matching methods reveal potential biases, and by applying a calibration technique for EM scores using Wasserstein barycenters, we not only mitigate these biases but also preserve accuracy across real-world datasets. This paper contributes to the field of fairness in data cleaning, especially within EM, a central data-cleaning task, by promoting a method for generating matching scores that reduce biases across different thresholds.
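To make the distributional-parity idea concrete, the following is a minimal sketch, not the authors' implementation: the function names, the quantile grid, and the equal group weights are all illustrative assumptions. It measures a threshold-independent demographic-parity gap for two groups as the area between the group-conditional score CDFs (the demographic-parity difference integrated over all thresholds), and it post-processes scores by mapping each group onto the one-dimensional Wasserstein barycenter of the group score distributions, whose quantile function is the average of the group quantile functions.

```python
import numpy as np

def dp_gap_all_thresholds(scores, groups, grid=np.linspace(0.0, 1.0, 201)):
    """Demographic-parity difference integrated over all thresholds for two
    groups; equals the area between the group score CDFs (the Wasserstein-1
    distance for scores on [0, 1])."""
    a, b = (scores[groups == g] for g in np.unique(groups)[:2])
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / a.size
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / b.size
    return np.trapz(np.abs(cdf_a - cdf_b), grid)

def calibrate_to_barycenter(scores, groups, qs=np.linspace(0.0, 1.0, 101)):
    """Map each group's scores onto the equal-weight 1D Wasserstein barycenter
    of the group-conditional score distributions: s -> F_bary^{-1}(F_g(s))."""
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    gids = np.unique(groups)
    group_q = {g: np.quantile(scores[groups == g], qs) for g in gids}  # group inverse CDFs
    bary_q = np.mean([group_q[g] for g in gids], axis=0)               # barycenter inverse CDF
    out = np.empty_like(scores)
    for g in gids:
        m = groups == g
        ranks = np.searchsorted(np.sort(scores[m]), scores[m], side="right") / m.sum()
        out[m] = np.interp(ranks, qs, bary_q)                          # push through barycenter
    return out

# Toy check with two groups whose raw matcher scores are shifted against each other.
rng = np.random.default_rng(0)
raw = np.concatenate([rng.beta(2, 5, 500), rng.beta(5, 2, 500)])
grp = np.array(["A"] * 500 + ["B"] * 500)
print(dp_gap_all_thresholds(raw, grp))                                 # sizable gap before calibration
print(dp_gap_all_thresholds(calibrate_to_barycenter(raw, grp), grp))   # near zero afterwards
```

Because the calibration aligns the entire group-conditional score distributions rather than the decisions at a single cut-off, any threshold chosen downstream yields approximately equal acceptance rates across groups, which is the sense in which the approach is threshold-independent.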
Related papers
- Mitigating Matching Biases Through Score Calibration [1.5530839016602822]
Biased outcomes in record matching can result in unequal error rates across demographic groups, raising ethical and legal concerns.
In this paper, we adapt fairness metrics traditionally applied in regression models to evaluate cumulative bias across all thresholds in record matching.
We propose a novel post-processing calibration method, leveraging optimal transport theory and Wasserstein barycenters, to balance matching scores across demographic groups.
arXiv Detail & Related papers (2024-11-03T21:01:40Z) - Through the Fairness Lens: Experimental Analysis and Evaluation of Entity Matching [17.857838691801884]
Algorithmic fairness has become a timely topic to address machine bias and its societal impacts.
Despite extensive research on these two topics, little attention has been paid to the fairness of entity matching.
We generate two social datasets for the purpose of auditing EM through the lens of fairness.
arXiv Detail & Related papers (2023-07-06T02:21:08Z) - When mitigating bias is unfair: multiplicity and arbitrariness in algorithmic group fairness [8.367620276482056]
We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions.
Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods.
These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.
arXiv Detail & Related papers (2023-02-14T16:53:52Z) - Practical Approaches for Fair Learning with Multitype and Multivariate Sensitive Attributes [70.6326967720747]
It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences.
We introduce FairCOCCO, a fairness measure built on cross-covariance operators on reproducing kernel Hilbert Spaces.
We empirically demonstrate consistent improvements against state-of-the-art techniques in balancing predictive power and fairness on real-world datasets.
arXiv Detail & Related papers (2022-11-11T11:28:46Z) - Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Debiasing Neural Retrieval via In-batch Balancing Regularization [25.941718123899356]
We develop a differentiable normed Pairwise Ranking Fairness (nPRF) and leverage the T-statistics on top of nPRF to improve fairness.
Our method with nPRF achieves significantly less bias with minimal degradation in ranking performance compared with the baseline.
arXiv Detail & Related papers (2022-05-18T22:57:15Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Domain-Incremental Continual Learning for Mitigating Bias in Facial Expression and Action Unit Recognition [5.478764356647437]
We propose the novel usage of Continual Learning (CL) as a potent bias mitigation method to enhance the fairness of FER systems.
We compare different non-CL-based and CL-based methods for their classification accuracy and fairness scores on expression recognition and Action Unit (AU) detection tasks.
Our experimental results show that CL-based methods, on average, outperform other popular bias mitigation techniques on both accuracy and fairness metrics.
arXiv Detail & Related papers (2021-03-15T18:22:17Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.