Societal Biases in Language Generation: Progress and Challenges
- URL: http://arxiv.org/abs/2105.04054v1
- Date: Mon, 10 May 2021 00:17:33 GMT
- Title: Societal Biases in Language Generation: Progress and Challenges
- Authors: Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng
- Abstract summary: Language generation presents unique challenges in terms of direct user interaction and the structure of decoding techniques.
We present a survey on societal biases in language generation, focusing on how techniques contribute to biases and on progress towards bias analysis and mitigation.
Motivated by a lack of studies on biases from decoding techniques, we also conduct experiments to quantify the effects of these techniques.
- Score: 43.06301135908934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Technology for language generation has advanced rapidly, spurred by
advancements in pre-training large models on massive amounts of data and the
need for intelligent agents to communicate in a natural manner. While
techniques can effectively generate fluent text, they can also produce
undesirable societal biases that can have a disproportionately negative impact
on marginalized populations. Language generation presents unique challenges in
terms of direct user interaction and the structure of decoding techniques. To
better understand these challenges, we present a survey on societal biases in
language generation, focusing on how techniques contribute to biases and on
progress towards bias analysis and mitigation. Motivated by a lack of studies
on biases from decoding techniques, we also conduct experiments to quantify the
effects of these techniques. By further discussing general trends and open
challenges, we call to attention promising directions for research and the
importance of fairness and inclusivity considerations for language generation
applications.
Related papers
- Detection of Machine-Generated Text: Literature Survey [0.0]
This literature survey aims to compile and synthesize accomplishments and developments in the field of machine-generated text.
It also gives an overview of machine-generated text trends and explores the larger societal implications.
arXiv Detail & Related papers (2024-01-02T01:44:15Z)
- Developing Linguistic Patterns to Mitigate Inherent Human Bias in Offensive Language Detection [1.6574413179773761]
We propose a linguistic data augmentation approach to reduce bias in labeling processes.
This approach has the potential to improve offensive language classification tasks across multiple languages.
arXiv Detail & Related papers (2023-12-04T10:20:36Z)
- Factuality Challenges in the Era of Large Language Models [113.3282633305118]
Large Language Models (LLMs) generate false, erroneous, or misleading content.
LLMs can be exploited for malicious applications.
This poses a significant challenge to society in terms of the potential deception of users.
arXiv Detail & Related papers (2023-10-08T14:55:02Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- On the application of Large Language Models for language teaching and assessment technology [18.735612275207853]
We look at the potential for incorporating large language models in AI-driven language teaching and assessment systems.
We find that larger language models offer improvements over previous models in text generation.
For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results.
arXiv Detail & Related papers (2023-07-17T11:12:56Z)
- Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models [11.323961700172175]
This article investigates the challenges and risks associated with biases in large-scale language models like ChatGPT.
We discuss the origins of biases, stemming from, among others, the nature of training data, model specifications, algorithmic constraints, product design, and policy decisions.
We review the current approaches to identify, quantify, and mitigate biases in language models, emphasizing the need for a multi-disciplinary, collaborative effort to develop more equitable, transparent, and responsible AI systems.
arXiv Detail & Related papers (2023-04-07T17:14:00Z)
- Why is constrained neural language generation particularly challenging? [13.62873478165553]
We present an extensive survey on the emerging topic of constrained neural language generation.
We distinguish between conditions and constraints, present constrained text generation tasks, and review existing methods and evaluation metrics for constrained text generation.
Our aim is to highlight recent progress and trends in this emerging field, informing on the most promising directions and limitations towards advancing the state-of-the-art of constrained neural language generation research.
arXiv Detail & Related papers (2022-06-11T02:07:33Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming challenges for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Survey of Hallucination in Natural Language Generation [69.9926849848132]
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies.
Deep learning based generation is prone to hallucinate unintended text, which degrades the system performance.
This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
arXiv Detail & Related papers (2022-02-08T03:55:01Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.