All Should Be Equal in the Eyes of Language Models: Counterfactually
Aware Fair Text Generation
- URL: http://arxiv.org/abs/2311.05451v1
- Date: Thu, 9 Nov 2023 15:39:40 GMT
- Title: All Should Be Equal in the Eyes of Language Models: Counterfactually
Aware Fair Text Generation
- Authors: Pragyan Banerjee, Abhinav Java, Surgan Jandial, Simra Shahid, Shaz
Furniturewala, Balaji Krishnamurthy, Sumit Bhatia
- Abstract summary: We propose a framework that dynamically compares the model understanding of diverse demographics to generate more equitable sentences.
CAFIE produces fairer text and strikes the best balance between fairness and language modeling capability.
- Score: 16.016546693767403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fairness in Language Models (LMs) remains a longstanding challenge, given the
inherent biases in training data that can be perpetuated by models and affect
the downstream tasks. Recent methods employ expensive retraining or attempt
debiasing during inference by constraining model outputs to contrast from a
reference set of biased templates or exemplars. Regardless, they dont address
the primary goal of fairness to maintain equitability across different
demographic groups. In this work, we posit that inferencing LMs to generate
unbiased output for one demographic under a context ensues from being aware of
outputs for other demographics under the same context. To this end, we propose
Counterfactually Aware Fair InferencE (CAFIE), a framework that dynamically
compares the model understanding of diverse demographics to generate more
equitable sentences. We conduct an extensive empirical evaluation using base
LMs of varying sizes and across three diverse datasets and found that CAFIE
outperforms strong baselines. CAFIE produces fairer text and strikes the best
balance between fairness and language modeling capability
Related papers
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings.
We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z) - Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs)
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z) - Detecting Bias in Large Language Models: Fine-tuned KcBERT [0.0]
We define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments.
Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics.
arXiv Detail & Related papers (2024-03-16T02:27:19Z) - TIDE: Textual Identity Detection for Evaluating and Augmenting
Classification and Language Models [0.0]
Machine learning models can perpetuate unintended biases from unfair and imbalanced datasets.
We present a dataset coupled with an approach to improve text fairness in classifiers and language models.
We leverage TIDAL to develop an identity annotation and augmentation tool that can be used to improve the availability of identity context.
arXiv Detail & Related papers (2023-09-07T21:44:42Z) - Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consecutively, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z) - DeAR: Debiasing Vision-Language Models with Additive Residuals [5.672132510411465]
Large pre-trained vision-language models (VLMs) provide rich, adaptable image and text representations.
These models suffer from societal biases owing to the skewed distribution of various identity groups in the training data.
We present DeAR, a novel debiasing method that learns additive residual image representations to offset the original representations.
arXiv Detail & Related papers (2023-03-18T14:57:43Z) - Toward Fairness in Text Generation via Mutual Information Minimization
based on Importance Sampling [23.317845744611375]
We propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity.
In this way, the mentioning of a demographic group is encouraged to be independent from how it is described in the generated text.
We also propose a distillation mechanism that preserves the language modeling ability of the PLMs after debiasing.
arXiv Detail & Related papers (2023-02-25T18:29:02Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z) - Bridging the Data Gap between Training and Inference for Unsupervised
Neural Machine Translation [49.916963624249355]
A UNMT model is trained on the pseudo parallel data with translated source, and natural source sentences in inference.
The source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses the pseudo parallel data natural source, translated target to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.