A Framework for the Computational Linguistic Analysis of Dehumanization
- URL: http://arxiv.org/abs/2003.03014v2
- Date: Wed, 17 Jun 2020 20:00:19 GMT
- Authors: Julia Mendelsohn, Yulia Tsvetkov, Dan Jurafsky
- Score: 52.735780962665814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dehumanization is a pernicious psychological process that often leads to
extreme intergroup bias, hate speech, and violence aimed at targeted social
groups. Despite these serious consequences and the wealth of available data,
dehumanization has not yet been computationally studied on a large scale.
Drawing upon social psychology research, we create a computational linguistic
framework for analyzing dehumanizing language by identifying linguistic
correlates of salient components of dehumanization. We then apply this
framework to analyze discussions of LGBTQ people in the New York Times from
1986 to 2015. Overall, we find increasingly humanizing descriptions of LGBTQ
people over time. However, we find that the label homosexual has become
much more strongly associated with dehumanizing attitudes than other labels,
such as gay. Our proposed techniques highlight processes of linguistic
variation and change in discourses surrounding marginalized groups.
Furthermore, the ability to analyze dehumanizing language at a large scale has
implications for automatically detecting and understanding media bias as well
as abusive language online.
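The framework identifies linguistic correlates of dehumanization partly through word-embedding associations: how close a group label's vector sits to a lexicon of concept words (e.g., vermin or disease metaphors). The sketch below illustrates that idea with toy vectors standing in for trained embeddings; the vectors, word list, and function names are illustrative assumptions, not the paper's actual lexicons or data.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def concept_association(label_vec, concept_vecs):
    # Association of a group label with a concept, measured as
    # similarity to the centroid of the concept's word vectors.
    centroid = np.mean(concept_vecs, axis=0)
    return cosine(label_vec, centroid)

# Toy 4-d "embeddings" standing in for vectors trained on a news corpus.
embeddings = {
    "gay":        np.array([0.9, 0.1, 0.2, 0.0]),
    "homosexual": np.array([0.2, 0.8, 0.7, 0.1]),
    "disease":    np.array([0.1, 0.9, 0.6, 0.2]),  # hypothetical concept word
    "deviant":    np.array([0.3, 0.7, 0.8, 0.1]),  # hypothetical concept word
}

concept = [embeddings["disease"], embeddings["deviant"]]
for label in ("gay", "homosexual"):
    print(f"{label}: {concept_association(embeddings[label], concept):.3f}")
```

A higher score indicates a stronger association between the label and the concept lexicon; with embeddings trained separately on each period's articles, comparing scores across periods would trace the kind of diachronic shift in label connotation that the paper reports.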
Related papers
- QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities [4.82206141686275]
We assess the potential harm caused by sentence completions generated by English large language models concerning LGBTQIA+ individuals.
The analysis indicates that large language models tend to exhibit discriminatory behaviour more frequently towards individuals within the LGBTQIA+ community.
arXiv Detail & Related papers (2024-06-18T08:40:29Z)
- Beyond Hate Speech: NLP's Challenges and Opportunities in Uncovering Dehumanizing Language [11.946719280041789]
This paper evaluates the performance of cutting-edge NLP models, including GPT-4, GPT-3.5, and LLAMA-2, in identifying dehumanizing language.
Our findings reveal that while these models demonstrate potential, achieving 70% accuracy in distinguishing dehumanizing language from broader hate speech, they also display biases.
arXiv Detail & Related papers (2024-02-21T13:57:36Z)
- A Dataset for the Detection of Dehumanizing Language [3.2803526084968895]
We present two data sets of dehumanizing text: a large, automatically collected corpus and a smaller, manually annotated data set.
Our methods give us a broad and varied body of dehumanization data to work with, enabling further exploratory analysis and automatic classification of dehumanization patterns.
arXiv Detail & Related papers (2024-02-13T19:58:24Z)
- Developing Linguistic Patterns to Mitigate Inherent Human Bias in Offensive Language Detection [1.6574413179773761]
We propose a linguistic data augmentation approach to reduce bias in labeling processes.
This approach has the potential to improve offensive language classification tasks across multiple languages.
arXiv Detail & Related papers (2023-12-04T10:20:36Z)
- Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work lays the foundation for furthering the field of dialectal NLP by documenting evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
- Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition [48.29355616574199]
We analyze the transferability of emotion recognition across three different languages: English, Mandarin Chinese, and Cantonese.
This study concludes that different language and age groups require specific speech features, making cross-lingual inference an unsuitable method.
arXiv Detail & Related papers (2023-06-26T08:48:08Z)
- "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within open language generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Revealing Persona Biases in Dialogue Systems [64.96908171646808]
We present the first large-scale study on persona biases in dialogue systems.
We conduct analyses on personas of different social classes, sexual orientations, races, and genders.
In our studies of the Blender and DialoGPT dialogue systems, we show that the choice of personas can affect the degree of harms in generated responses.
arXiv Detail & Related papers (2021-04-18T05:44:41Z)
- Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia [34.183132688084534]
Specific lexical choices in narrative text both reflect the writer's attitudes towards people in the narrative and influence the audience's reactions.
We show how word connotations differ across languages and cultures, highlighting the difficulty of generalizing existing English datasets.
We then demonstrate the usefulness of our method by analyzing Wikipedia biography pages of members of the LGBT community across three languages.
arXiv Detail & Related papers (2020-10-21T08:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.