Toward Fairness in Text Generation via Mutual Information Minimization
based on Importance Sampling
- URL: http://arxiv.org/abs/2302.13136v1
- Date: Sat, 25 Feb 2023 18:29:02 GMT
- Title: Toward Fairness in Text Generation via Mutual Information Minimization
based on Importance Sampling
- Authors: Rui Wang, Pengyu Cheng, Ricardo Henao
- Abstract summary: We propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity.
In this way, the mention of a demographic group is encouraged to be independent of how it is described in the generated text.
We also propose a distillation mechanism that preserves the language modeling ability of the PLMs after debiasing.
- Score: 23.317845744611375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained language models (PLMs), such as GPT2, have achieved remarkable
empirical performance in text generation tasks. However, pretrained on
large-scale natural language corpora, the generated text from PLMs may exhibit
social bias against disadvantaged demographic groups. To improve the fairness
of PLMs in text generation, we propose to minimize the mutual information
between the semantics in the generated text sentences and their demographic
polarity, i.e., the demographic group to which the sentence is referring. In
this way, the mention of a demographic group (e.g., male or female) is
encouraged to be independent of how it is described in the generated text,
thus effectively alleviating the social bias. Moreover, we propose to
efficiently estimate the upper bound of the above mutual information via
importance sampling, leveraging a natural language corpus. We also propose a
distillation mechanism that preserves the language modeling ability of the PLMs
after debiasing. Empirical results on real-world benchmarks demonstrate that
the proposed method yields superior performance in terms of both fairness and
language modeling ability.
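To make the approach concrete, below is a minimal PyTorch sketch of how the three ingredients described above might fit together: a variational upper bound on the mutual information I(S; A) between sentence semantics S and demographic polarity A (a CLUB-style bound is used here as a stand-in for the paper's importance-sampling estimator, with corpus sentences reweighted by importance weights), plus a distillation term that keeps the debiased model close to the frozen original PLM. All names and the exact form of the bound are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MIPolarityUpperBound(nn.Module):
    """CLUB-style upper bound on I(S; A):
    E_{p(s,a)}[log q(a|s)] - E_{p(s)p(a)}[log q(a|s)],
    where q(a|s) is a small classifier predicting demographic polarity
    from a sentence embedding. Illustrative stand-in for the paper's
    importance-sampling estimator."""

    def __init__(self, embed_dim: int, num_groups: int = 2):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, num_groups),
        )

    def forward(self, sent_emb, polarity, weights=None):
        log_q = F.log_softmax(self.classifier(sent_emb), dim=-1)
        # Positive term: log q(a|s) on truly paired samples.
        pos = log_q.gather(-1, polarity.unsqueeze(-1)).squeeze(-1)
        # Negative term: log q(a|s) with polarities shuffled across the
        # batch, approximating the product of marginals p(s)p(a).
        perm = torch.randperm(polarity.size(0), device=polarity.device)
        neg = log_q.gather(-1, polarity[perm].unsqueeze(-1)).squeeze(-1)
        bound = pos - neg
        if weights is not None:
            # Self-normalized importance weights reweight corpus
            # sentences toward the model's generation distribution.
            bound = bound * (weights / weights.sum()) * weights.numel()
        return bound.mean()

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over next-token distributions, encouraging
    the debiased model to retain the frozen PLM's language modeling."""
    t = temperature
    teacher = F.softmax(teacher_logits / t, dim=-1)
    student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * (t * t)

# Hypothetical training step: the total objective trades off language
# modeling, the MI upper bound, and distillation (lambda_mi and
# lambda_kd are assumed hyperparameters):
# loss = lm_loss + lambda_mi * mi_bound(emb, polarity, weights) \
#        + lambda_kd * distillation_loss(student_logits, teacher_logits)
```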
Related papers
- LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models [19.18522268167047]
Large language models (LLMs) have achieved impressive performance on various natural language generation tasks.
However, they suffer from generating negative and harmful content that is biased against certain demographic groups.
We propose LIDAO, a framework to debias (large) language models with provably better fluency.
arXiv Detail & Related papers (2024-06-01T20:12:54Z)
- All Should Be Equal in the Eyes of Language Models: Counterfactually Aware Fair Text Generation [16.016546693767403]
We propose a framework that dynamically compares the model understanding of diverse demographics to generate more equitable sentences.
The resulting framework, CAFIE, produces fairer text and strikes the best balance between fairness and language modeling capability.
arXiv Detail & Related papers (2023-11-09T15:39:40Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation [56.520470678876656]
Bias inherent in user-written text can associate different levels of linguistic quality with users' protected attributes.
We introduce a general framework to achieve measure-specific counterfactual fairness in explanation generation.
arXiv Detail & Related papers (2022-10-14T02:29:10Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)