From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
- URL: http://arxiv.org/abs/2305.17174v1
- Date: Fri, 26 May 2023 18:00:57 GMT
- Title: From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
- Authors: Julia Mendelsohn, Ronan Le Bras, Yejin Choi, Maarten Sap
- Abstract summary: We present the first large-scale computational investigation of dogwhistles.
We develop a typology of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles, and analyze their usage in historical U.S. politicians' speeches.
We show that harmful content containing dogwhistles avoids toxicity detection, highlighting online risks of such coded language.
- Score: 73.25963871034858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dogwhistles are coded expressions that simultaneously convey one meaning to a
broad audience and a second one, often hateful or provocative, to a narrow
in-group; they are deployed to evade both political repercussions and
algorithmic content moderation. For example, in the sentence 'we need to end
the cosmopolitan experiment,' the word 'cosmopolitan' likely means 'worldly' to
many, but secretly means 'Jewish' to a select few. We present the first
large-scale computational investigation of dogwhistles. We develop a typology
of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles
with rich contextual information and examples, and analyze their usage in
historical U.S. politicians' speeches. We then assess whether a large language
model (GPT-3) can identify dogwhistles and their meanings, and find that
GPT-3's performance varies widely across types of dogwhistles and targeted
groups. Finally, we show that harmful content containing dogwhistles avoids
toxicity detection, highlighting online risks of such coded language. This work
sheds light on the theoretical and applied importance of dogwhistles in both
NLP and computational social science, and provides resources for future
research in modeling dogwhistles and mitigating their online harms.
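To make the toxicity-evasion finding concrete, here is a minimal sketch of the kind of comparison the abstract describes, assuming the open-source Detoxify classifier as a stand-in for the toxicity detectors the paper evaluates (its actual models and data differ); the coded sentence reuses the paper's own 'cosmopolitan' example, while the overt variant is illustrative.

```python
# Hedged sketch: compare toxicity scores for an overt statement vs. a
# dogwhistle paraphrase. Detoxify is an assumed stand-in for the toxicity
# detectors studied in the paper, not the authors' actual setup.
from detoxify import Detoxify

detector = Detoxify("original")  # pretrained on the Jigsaw toxicity data

overt = "We need to end the Jewish experiment."        # explicit target (illustrative)
coded = "We need to end the cosmopolitan experiment."  # dogwhistle, from the paper's example

for label, text in [("overt", overt), ("coded", coded)]:
    score = detector.predict(text)["toxicity"]
    print(f"{label:>5}: toxicity={score:.3f}")

# Expected pattern (not guaranteed): the coded variant scores far lower,
# illustrating how dogwhistles can slip past toxicity detection.
```

In practice one would run such a comparison over a whole corpus of dogwhistle-bearing posts and inspect the score distributions rather than single sentences.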
Related papers
- One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks [55.35278531907263]
We present the first study of Large Language Models' fairness and robustness to dialectal variation in canonical reasoning tasks.
We hire AAVE speakers to rewrite seven popular benchmarks, such as HumanEval and GSM8K.
We find that, compared to Standardized English, almost all of the widely used models we evaluate show significant brittleness and unfairness on queries in AAVE.
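As a rough illustration of this paired-dialect evaluation, the following hedged sketch scores one model on matched Standardized English and AAVE versions of a benchmark item; `query_model`, the example pair, and the scoring loop are hypothetical placeholders, not the paper's data or harness.

```python
# Hedged sketch of a paired dialect-fairness evaluation: the same benchmark
# item in Standardized English (SE) and an AAVE rewrite, scored with the same
# model. Everything here is an illustrative stand-in for the paper's setup.
def query_model(prompt: str) -> str:
    # Placeholder: replace with a real LLM call (e.g., an API client).
    return "5"

paired_items = [
    {
        "se": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "aave": "Tom got 3 apples and he done bought 2 more. How many apples he got?",
        "answer": "5",
    },
]

def accuracy(items, dialect):
    correct = sum(query_model(it[dialect]).strip() == it["answer"] for it in items)
    return correct / len(items)

gap = accuracy(paired_items, "se") - accuracy(paired_items, "aave")
print(f"SE-vs-AAVE accuracy gap: {gap:+.2%}")  # a positive gap indicates unfairness to AAVE
```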
arXiv Detail & Related papers (2024-10-14T18:44:23Z)
- Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights [50.89022445197919]
We propose a speech-specific risk taxonomy covering 8 risk categories under hostility (malicious sarcasm and threats), malicious imitation (age, gender, ethnicity), and stereotypical biases (age, gender, ethnicity).
Based on this taxonomy, we create a small-scale dataset for evaluating current LMMs' capability to detect these categories of risk.
arXiv Detail & Related papers (2024-06-25T10:08:45Z)
- Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles [47.61526125774749]
A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination.
We present an approach for word-sense disambiguation of dog whistles from standard speech using Large Language Models (LLMs).
We leverage this technique to create a dataset of 16,550 high-confidence coded examples of dog whistles used in formal and informal communication.
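A minimal sketch of how such LLM-based disambiguation might look, assuming the OpenAI chat API as the backend; the helper `is_coded_use`, the model choice, and the prompt wording are this sketch's assumptions, not the paper's method.

```python
# Hedged sketch of LLM-based word-sense disambiguation for a potential
# dog whistle: ask the model whether the term is used in its coded sense
# in the given context. The OpenAI client is one possible backend.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def is_coded_use(term: str, sentence: str) -> bool:
    prompt = (
        f"The term '{term}' can be a dog whistle with a covert meaning.\n"
        f"Sentence: {sentence}\n"
        "Is the term used in its coded (dog-whistle) sense here? Answer Yes or No."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not the paper's
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")

print(is_coded_use("cosmopolitan", "We need to end the cosmopolitan experiment."))
```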
arXiv Detail & Related papers (2024-06-10T23:09:19Z)
- Verbreitungsmechanismen schädigender Sprache im Netz: Anatomie zweier Shitstorms (Dissemination Mechanisms of Harmful Language Online: Anatomy of Two Shitstorms) [0.9898607871253772]
We focus on two exemplary cross-media shitstorms directed against well-known individuals from the business world.
Both share the same kind of trigger: a controversial statement by the person who becomes the target of the shitstorm.
We examine the spread of each outrage wave across two media in parallel and test the applicability of computational linguistic methods for analyzing its time course.
arXiv Detail & Related papers (2023-12-12T12:00:04Z)
- What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations [62.91799637259657]
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond?
We study this research question by probing contextualized embeddings and exploring whether this bias is encoded in the models' latent representations.
We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors.
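To unpack the probe: under a Bradley-Terry model, P(a preferred over b) = sigmoid(theta . (h_a - h_b)), which is exactly logistic regression on the difference of the two words' hidden vectors. Below is a self-contained sketch with random vectors standing in for real LLM hidden states (an assumption; the paper's feature extraction and training details differ).

```python
# Hedged sketch of a logistic Bradley-Terry probe: predict which word of a
# pair the LLM "prefers" from the words' hidden vectors, via logistic
# regression on the difference vector. Random features stand in for real
# LLM hidden states in this demo.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n_pairs = 64, 500

# Stand-ins: h_a, h_b would be the two words' hidden vectors from the LLM.
h_a = rng.normal(size=(n_pairs, d))
h_b = rng.normal(size=(n_pairs, d))
true_theta = rng.normal(size=d)  # synthetic "preference direction" for the demo

diff = h_a - h_b
labels = (diff @ true_theta + rng.normal(scale=0.5, size=n_pairs)) > 0

# No intercept: Bradley-Terry scores depend only on the score difference.
probe = LogisticRegression(fit_intercept=False).fit(diff, labels)
print("probe accuracy:", probe.score(diff, labels))
```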
arXiv Detail & Related papers (2023-11-30T18:53:13Z)
- Diagnosing and Debiasing Corpus-Based Political Bias and Insults in GPT2 [0.0]
Training large language models (LLMs) on extensive, unfiltered corpora sourced from the internet is a common and advantageous practice.
Recent research shows that generative pretrained transformer (GPT) language models can recognize their own biases and detect toxicity in generated content.
This study investigates the efficacy of the diagnosing-debiasing approach in mitigating two additional types of biases: insults and political bias.
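For flavor, here is a hedged sketch of the self-diagnosis step this line of work builds on: query GPT-2 about a piece of text and compare its next-token probabilities for "Yes" versus "No". The prompt template is illustrative, not the study's exact wording.

```python
# Hedged sketch of self-diagnosis: ask GPT-2 whether a text contains a given
# attribute (e.g., an insult) and read off the model's relative probability
# of answering "Yes" at the next token position.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def self_diagnose(text: str, attribute: str = "an insult") -> float:
    prompt = f'"{text}"\nQuestion: Does the above text contain {attribute}?\nAnswer:'
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # next-token distribution
    probs = torch.softmax(logits, dim=-1)
    yes = probs[tok.encode(" Yes")[0]].item()
    no = probs[tok.encode(" No")[0]].item()
    return yes / (yes + no)  # normalized P(Yes)

print(self_diagnose("You are a worthless idiot."))
```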
arXiv Detail & Related papers (2023-11-17T01:20:08Z)
- A Group-Specific Approach to NLP for Hate Speech Detection [2.538209532048867]
We propose a group-specific approach to NLP for online hate speech detection.
We analyze historical data about discrimination against a protected group to better predict spikes in hate speech against that group.
We demonstrate this approach through a case study on NLP for detection of antisemitic hate speech.
arXiv Detail & Related papers (2023-04-21T19:08:49Z)
- Hate versus Politics: Detection of Hate against Policy makers in Italian tweets [0.6289422225292998]
This paper addresses the classification of hate speech against policy makers in Italian tweets.
We collected and annotated 1264 tweets, examined cases of disagreement between annotators, and performed in-domain and cross-domain hate speech classification.
We achieved a ROC AUC of 0.83 and analyzed the most predictive attributes, finding that language features differ between the anti-policymakers and anti-immigration domains.
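As a generic stand-in for this kind of pipeline, a hedged sketch of a tweet classifier evaluated with ROC AUC; the TF-IDF features, logistic regression model, and toy data are this sketch's assumptions, while the paper's actual features, models, and 1264 annotated tweets differ.

```python
# Hedged sketch of a hate-speech classifier evaluated with ROC AUC, in the
# spirit of the paper's setup but not its implementation. Tiny toy examples
# stand in for the annotated Italian tweets.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts = ["toy hateful tweet about a minister", "toy neutral tweet about policy",
         "toy hateful tweet about parliament", "toy neutral tweet about budget"] * 25
labels = [1, 0, 1, 0] * 25

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.3, random_state=0)
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]  # probability of the "hate" class
print("ROC AUC:", roc_auc_score(y_te, scores))
```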
arXiv Detail & Related papers (2021-07-12T12:24:45Z)
- Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge [49.288196234823005]
Cant is important for understanding advertising, comedies and dog-whistle politics.
We propose a large and diverse Chinese dataset for creating and understanding cant.
arXiv Detail & Related papers (2021-04-06T17:55:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.