Interpretable bias mitigation for textual data: Reducing gender bias in
patient notes while maintaining classification performance
- URL: http://arxiv.org/abs/2103.05841v1
- Date: Wed, 10 Mar 2021 03:09:30 GMT
- Title: Interpretable bias mitigation for textual data: Reducing gender bias in
patient notes while maintaining classification performance
- Authors: Joshua R. Minot, Nicholas Cheney, Marc Maier, Danne C. Elbers,
Christopher M. Danforth, and Peter Sheridan Dodds
- Abstract summary: We identify and remove gendered language from two clinical-note datasets.
We show minimal degradation in health condition classification tasks for low to medium levels of bias removal via data augmentation.
This work outlines an interpretable approach for using data augmentation to identify and reduce the potential for bias in natural language processing pipelines.
- Score: 0.11545092788508224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical systems in general, and patient treatment decisions and outcomes in
particular, are affected by bias based on gender and other demographic
elements. As language models are increasingly applied to medicine, there is a
growing interest in building algorithmic fairness into processes impacting
patient care. Much of the work addressing this question has focused on biases
encoded in language models -- statistical estimates of the relationships
between concepts derived from distant reading of corpora. Building on this
work, we investigate how word choices made by healthcare practitioners and
language models interact with regards to bias. We identify and remove gendered
language from two clinical-note datasets and describe a new debiasing procedure
using BERT-based gender classifiers. We show minimal degradation in health
condition classification tasks for low- to medium-levels of bias removal via
data augmentation. Finally, we compare the bias semantically encoded in the
language models with the bias empirically observed in health records. This work
outlines an interpretable approach for using data augmentation to identify and
reduce the potential for bias in natural language processing pipelines.
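As a concrete illustration of the data-augmentation idea, the Python sketch below neutralizes gendered surface forms in a clinical note at a tunable removal level. The term lexicon, placeholder tokens, and the `remove_gendered_language` helper are illustrative assumptions, not the authors' pipeline, which identifies gendered language with BERT-based gender classifiers rather than a fixed word list.

```python
import random
import re

# Illustrative gendered-term lexicon (an assumption for this sketch); the paper
# derives gendered language from the notes themselves via BERT-based classifiers.
GENDERED_TERMS = {
    r"\b(he|she)\b": "[PT]",
    r"\b(him|his|her|hers)\b": "[PT]",
    r"\b(mr|mrs|ms)\.": "[TITLE]",
    r"\b(male|female|man|woman)\b": "[SEX]",
}


def remove_gendered_language(note: str, level: float = 1.0) -> str:
    """Neutralize a fraction `level` of gendered tokens in a clinical note.

    `level` loosely mirrors the paper's low-to-medium levels of bias removal:
    0.0 leaves the note untouched, 1.0 replaces every matched term.
    """
    out = note
    for pattern, placeholder in GENDERED_TERMS.items():
        out = re.sub(
            pattern,
            lambda m: placeholder if random.random() < level else m.group(0),
            out,
            flags=re.IGNORECASE,
        )
    return out


# Data augmentation: keep the original note and add a partially de-gendered copy.
note = "Mrs. Smith reports chest pain; she denies shortness of breath."
augmented = [note, remove_gendered_language(note, level=0.5)]
```

In the paper's setup, the augmented notes are then fed to the health condition classifier to check that performance degrades only minimally as the removal level rises.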
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents the AmbGIMT benchmark (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Locating and Mitigating Gender Bias in Large Language Models [40.78150878350479]
Large language models (LLMs) are pre-trained on extensive corpora to learn facts and human cognition; these corpora contain human preferences.
This process can inadvertently lead to these models acquiring biases and prevalent stereotypes in society.
We propose the LSDM (Least Square Debias Method), a knowledge-editing based method for mitigating gender bias in occupational pronouns.
arXiv Detail & Related papers (2024-03-21T13:57:43Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- Naturalistic Causal Probing for Morpho-Syntax [76.83735391276547]
We suggest a naturalistic strategy for input-level intervention on real-world data in Spanish.
Using our approach, we isolate morpho-syntactic features from confounders in sentences.
We apply this methodology to analyze causal effects of gender and number on contextualized representations extracted from pre-trained models.
arXiv Detail & Related papers (2022-05-14T11:47:58Z)
- Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving [3.114945725130788]
We propose a novel methodology that leverages a causal inference framework to effectively remove gender bias.
Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks.
arXiv Detail & Related papers (2021-12-09T19:57:22Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- Impact of Gender Debiased Word Embeddings in Language Modeling [0.0]
Gender, race, and social biases have been detected as clear examples of unfairness in applications of Natural Language Processing.
Recent studies have shown that the human-generated data used in training is an apparent source of these biases.
Current algorithms have also been proven to amplify biases from data.
arXiv Detail & Related papers (2021-05-03T14:45:10Z)
- Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings [16.136832979324467]
We pretrain deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset.
We identify dangerous latent relationships that are captured by the contextual word embeddings.
We evaluate performance gaps across different definitions of fairness on over 50 downstream clinical prediction tasks.
arXiv Detail & Related papers (2020-03-11T23:21:14Z)
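The last entry above measures fairness as performance gaps on downstream clinical prediction tasks. One simple way to operationalize such a gap is the difference in recall (true-positive rate) between demographic groups; the sketch below assumes binary labels and a recorded gender field, and the `recall_gap` helper is an illustrative metric, not that paper's exact evaluation.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True-positive rate over the positive cases."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    if not positives:
        return float("nan")
    return sum(p for _, p in positives) / len(positives)


def recall_gap(y_true, y_pred, groups, group_a="F", group_b="M") -> float:
    """Recall difference between two demographic groups, one simple way to
    quantify a fairness gap on a downstream prediction task."""
    def subset(group):
        idx = [i for i, g in enumerate(groups) if g == group]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]

    return recall(*subset(group_a)) - recall(*subset(group_b))


# Toy usage: gold labels, model predictions, and recorded gender per note.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
groups = ["F", "F", "F", "M", "M", "M"]
print(recall_gap(y_true, y_pred, groups))  # recall(F) - recall(M) = 0.5 - 1.0 = -0.5
```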