All Should Be Equal in the Eyes of Language Models: Counterfactually
Aware Fair Text Generation
- URL: http://arxiv.org/abs/2311.05451v1
- Date: Thu, 9 Nov 2023 15:39:40 GMT
- Title: All Should Be Equal in the Eyes of Language Models: Counterfactually
Aware Fair Text Generation
- Authors: Pragyan Banerjee, Abhinav Java, Surgan Jandial, Simra Shahid, Shaz
Furniturewala, Balaji Krishnamurthy, Sumit Bhatia
- Abstract summary: We propose a framework that dynamically compares the model understanding of diverse demographics to generate more equitable sentences.
CAFIE produces fairer text and strikes the best balance between fairness and language modeling capability.
- Score: 16.016546693767403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fairness in Language Models (LMs) remains a longstanding challenge, given the
inherent biases in training data that can be perpetuated by models and affect
the downstream tasks. Recent methods employ expensive retraining or attempt
debiasing during inference by constraining model outputs to contrast with a
reference set of biased templates or exemplars. Regardless, they do not address
the primary goal of fairness: maintaining equitability across different
demographic groups. In this work, we posit that generating unbiased LM output
for one demographic under a context follows from being aware of the
outputs for other demographics under the same context. To this end, we propose
Counterfactually Aware Fair InferencE (CAFIE), a framework that dynamically
compares the model understanding of diverse demographics to generate more
equitable sentences. We conduct an extensive empirical evaluation using base
LMs of varying sizes across three diverse datasets and find that CAFIE
outperforms strong baselines. CAFIE produces fairer text and strikes the best
balance between fairness and language modeling capability.
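The idea of comparing a model's outputs across demographics under the same context can be illustrated with a toy sketch. This is not CAFIE's actual procedure, which the abstract does not spell out. The prompts, the probability values, and the helper names `mean_distribution` and `js_divergence` are all hypothetical, and plain averaging stands in for whatever combination rule the paper uses.

```python
import math

def kl(p, q):
    # KL divergence between two distributions over the same token set.
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

def mean_distribution(dists):
    # Average the next-token probabilities across counterfactual contexts.
    tokens = dists[0].keys()
    return {t: sum(d[t] for d in dists) / len(dists) for t in tokens}

def js_divergence(dists):
    # Jensen-Shannon divergence: 0 when all contexts yield identical outputs.
    m = mean_distribution(dists)
    return sum(kl(d, m) for d in dists) / len(dists)

# Hypothetical next-token probabilities for two counterfactual prompts
# that differ only in the demographic term (values invented for illustration).
next_token_probs = {
    "The man worked as a": {"doctor": 0.6, "nurse": 0.1, "teacher": 0.3},
    "The woman worked as a": {"doctor": 0.2, "nurse": 0.5, "teacher": 0.3},
}

dists = list(next_token_probs.values())
disparity = js_divergence(dists)      # how differently the contexts are treated
equalized = mean_distribution(dists)  # one shared distribution for both contexts
```

Under these assumptions, a larger `disparity` means the model treats the two contexts more differently, and sampling from `equalized` for both prompts removes that gap at the token level.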
Related papers
- Detecting Bias in Large Language Models: Fine-tuned KcBERT [0.0]
We define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments.
Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics.
arXiv Detail & Related papers (2024-03-16T02:27:19Z)
- TIDE: Textual Identity Detection for Evaluating and Augmenting
Classification and Language Models [0.0]
Machine learning models can perpetuate unintended biases from unfair and imbalanced datasets.
We present a dataset coupled with an approach to improve text fairness in classifiers and language models.
We leverage TIDAL to develop an identity annotation and augmentation tool that can be used to improve the availability of identity context.
arXiv Detail & Related papers (2023-09-07T21:44:42Z)
- Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consecutively, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
- DeAR: Debiasing Vision-Language Models with Additive Residuals [5.672132510411465]
Large pre-trained vision-language models (VLMs) provide rich, adaptable image and text representations.
These models suffer from societal biases owing to the skewed distribution of various identity groups in the training data.
We present DeAR, a novel debiasing method that learns additive residual image representations to offset the original representations.
arXiv Detail & Related papers (2023-03-18T14:57:43Z)
- DualFair: Fair Representation Learning at Both Group and Individual
Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- Toward Fairness in Text Generation via Mutual Information Minimization
based on Importance Sampling [23.317845744611375]
We propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity.
In this way, the mentioning of a demographic group is encouraged to be independent from how it is described in the generated text.
We also propose a distillation mechanism that preserves the language modeling ability of the PLMs after debiasing.
arXiv Detail & Related papers (2023-02-25T18:29:02Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Bridging the Data Gap between Training and Inference for Unsupervised
Neural Machine Translation [49.916963624249355]
A UNMT model is trained on the pseudo parallel data with translated source, and translates natural source sentences in inference.
The source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses the pseudo parallel data {natural source, translated target} to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups in a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)