Analyzing Islamophobic Discourse Using Semi-Coded Terms and LLMs
- URL: http://arxiv.org/abs/2503.18273v1
- Date: Mon, 24 Mar 2025 01:41:24 GMT
- Title: Analyzing Islamophobic Discourse Using Semi-Coded Terms and LLMs
- Authors: Raza Ul Mustafa, Roi Dupart, Gabrielle Smith, Noman Ashraf, Nathalie Japkowicz
- Abstract summary: This paper performs a large-scale analysis of specialized, semi-coded Islamophobic terms such as (muzrat, pislam, mudslime, mohammedan, muzzies) circulating on extremist social platforms. Using the Google Perspective API, we also find that Islamophobic text is more toxic than other kinds of hate speech.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Islamophobia has evolved into a global phenomenon, attracting followers across the globe and particularly in Western societies. Understanding its global spread and online dissemination is therefore crucial. This paper performs a large-scale analysis of specialized, semi-coded Islamophobic terms such as (muzrat, pislam, mudslime, mohammedan, muzzies) circulating on extremist social platforms, e.g., 4chan, Gab, and Telegram. First, we use large language models (LLMs) to test their ability to understand these terms. Second, using the Google Perspective API, we find that Islamophobic text is more toxic than other kinds of hate speech. Finally, we use a BERT-based topic modeling approach to extract the topics that structure Islamophobic discourse on these platforms. Our findings indicate that LLMs understand these Out-Of-Vocabulary (OOV) slurs; however, measures are still required to control such discourse. Our topic modeling also indicates that Islamophobic text appears across various political, conspiratorial, and far-right movements and is particularly directed against Muslim immigrants. Taken together, this is the first study of Islamophobic semi-coded terms, shedding light on Islamophobia as a global phenomenon.
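As a rough sketch of the measurement pipeline the abstract describes (not the authors' released code), the snippet below scores a text's toxicity with the Google Perspective API and fits a BERTopic model over a corpus. The API key, the example corpus, and the `score_toxicity` helper are illustrative assumptions.

```python
# Sketch: toxicity scoring via the Google Perspective API, then topic modeling
# with BERTopic. API_KEY and the example corpus are placeholder assumptions.
from googleapiclient import discovery  # pip install google-api-python-client
from bertopic import BERTopic          # pip install bertopic

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # hypothetical; request one from Google

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

def score_toxicity(text: str) -> float:
    """Return Perspective's TOXICITY summary score in [0, 1]."""
    body = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = client.comments().analyze(body=body).execute()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Placeholder corpus; a real run needs thousands of collected posts.
docs = ["...post 1...", "...post 2..."]
toxicity = [score_toxicity(d) for d in docs]

# Unsupervised topic extraction over the same corpus.
topic_model = BERTopic(language="english")
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info().head())
```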
Related papers
- Tackling Social Bias against the Poor: A Dataset and Taxonomy on Aporophobia
Aporophobia -- the societal bias against people living in poverty -- constitutes a major obstacle to designing, approving and implementing poverty-mitigation policies.
This work presents an initial step towards operationalizing the concept of aporophobia to identify and track harmful beliefs and discriminative actions against poor people on social media.
arXiv Detail & Related papers (2025-04-17T16:53:14Z)
- What Large Language Models Do Not Talk About: An Empirical Study of Moderation and Censorship Practices
This work investigates the extent to which Large Language Models refuse to answer or omit information when prompted on political topics.
Our analysis covers 14 state-of-the-art models from Western countries, China, and Russia, prompted in all six official United Nations (UN) languages.
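A minimal sketch of this kind of refusal measurement, assuming a generic chat-completion helper `ask(model, prompt)` and a keyword-based refusal heuristic; both are illustrative assumptions, not the paper's protocol.

```python
# Sketch: estimate per-model refusal rates on political prompts.
# `ask` is a hypothetical wrapper around whatever chat API each model exposes.
REFUSAL_MARKERS = ["i cannot", "i can't", "i'm unable", "as an ai"]

def looks_like_refusal(answer: str) -> bool:
    a = answer.lower()
    return any(marker in a for marker in REFUSAL_MARKERS)

def refusal_rate(model: str, prompts: list[str], ask) -> float:
    refusals = sum(looks_like_refusal(ask(model, p)) for p in prompts)
    return refusals / len(prompts)
```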
arXiv Detail & Related papers (2025-04-04T09:09:06Z)
- HP-BERT: A framework for longitudinal study of Hinduphobia on social media via LLMs
We present an abuse detection and sentiment analysis framework that offers a longitudinal analysis of Hinduphobia on X (Twitter) during and after the COVID-19 pandemic.
This framework assesses the prevalence and intensity of Hinduphobic discourse, capturing elements such as derogatory jokes and racist remarks.
Our study encompasses approximately 27.4 million tweets from six countries: Australia, Brazil, India, Indonesia, Japan, and the United Kingdom.
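A hedged sketch of the kind of BERT-based abuse classification such a framework builds on, using an off-the-shelf toxicity checkpoint from Hugging Face; the checkpoint name is a stand-in, not the released HP-BERT model.

```python
# Sketch: classify tweets for abusive content with a fine-tuned BERT model.
# "unitary/toxic-bert" is a stand-in checkpoint, not the HP-BERT release.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

tweets = ["example tweet one", "example tweet two"]  # placeholder data
for tweet, result in zip(tweets, classifier(tweets)):
    print(f"{result['label']:>8}  {result['score']:.3f}  {tweet}")
```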
arXiv Detail & Related papers (2025-01-07T23:22:05Z)
- Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
This paper addresses the need for democratizing large language models (LLMs) in the Arab world.
One practical objective for an Arabic LLM is to utilize an Arabic-specific tokenizer vocabulary that could speed up decoding.
Inspired by vocabulary learning during Second Language (Arabic) Acquisition in humans, the released AraLLaMA employs progressive vocabulary expansion.
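A minimal sketch of one step of vocabulary expansion with Hugging Face Transformers; the base model and the token list are illustrative assumptions, and the actual AraLLaMA recipe adds vocabulary in stages during continued pretraining.

```python
# Sketch: grow a tokenizer's vocabulary with new Arabic tokens and resize
# the model's embedding matrix to match. The token list is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in base model; AraLLaMA builds on LLaMA-2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

new_arabic_tokens = ["السلام", "مرحبا"]  # one expansion stage (assumed)
num_added = tokenizer.add_tokens(new_arabic_tokens)
model.resize_token_embeddings(len(tokenizer))
# New embedding rows are randomly initialized; continued pretraining adapts them.
print(f"added {num_added} tokens; vocab size is now {len(tokenizer)}")
```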
arXiv Detail & Related papers (2024-12-16T19:29:06Z)
- MIMIC: Multimodal Islamophobic Meme Identification and Classification
Anti-Muslim hate speech has emerged within memes, characterized by context-dependent and rhetorical messages.
This work presents a novel dataset and proposes a classifier based on the Vision-and-Language Transformer (ViLT) specifically tailored to identify anti-Muslim hate within memes.
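A hedged sketch of a ViLT-based meme classifier of this general shape; the linear classification head, image file, and caption are illustrative assumptions, not the MIMIC model.

```python
# Sketch: a binary classifier on top of ViLT's joint image-text representation.
# The head is an illustrative addition; "meme.png" is a placeholder file.
import torch
from PIL import Image
from transformers import ViltModel, ViltProcessor

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-mlm")
vilt = ViltModel.from_pretrained("dandelin/vilt-b32-mlm")
head = torch.nn.Linear(vilt.config.hidden_size, 2)  # hate vs. not-hate

image = Image.open("meme.png").convert("RGB")
text = "overlaid meme caption"
inputs = processor(image, text, return_tensors="pt")

with torch.no_grad():
    pooled = vilt(**inputs).pooler_output  # joint multimodal embedding
logits = head(pooled)
print(logits.softmax(-1))
```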
arXiv Detail & Related papers (2024-12-01T05:44:01Z)
- Large Language Models Reflect the Ideology of their Creators
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Arabic Dataset for LLM Safeguard Evaluation
This study explores the safety of large language models (LLMs) in Arabic with its linguistic and cultural complexities.
We present an Arab-region-specific safety evaluation dataset consisting of 5,799 questions, including direct attacks, indirect attacks, and harmless requests with sensitive words.
arXiv Detail & Related papers (2024-10-22T14:12:43Z)
- Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models
Unlike gender, which says little about our values, religion as a socio-cultural system prescribes a set of beliefs and values for its followers.
Major religions in the US and European countries are represented with more nuance.
Eastern religions like Hinduism and Buddhism are strongly stereotyped.
arXiv Detail & Related papers (2024-07-09T14:45:15Z)
- Monitoring the evolution of antisemitic discourse on extremist social media using BERT
Racism and intolerance on social media contribute to a toxic online environment which may spill offline to foster hatred.
Tracking antisemitic themes and their associated terminology over time in online discussions could help monitor the sentiments of their participants.
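A hedged sketch of tracking themes over time with dynamic topic modeling; BERTopic is one reasonable stand-in for the paper's BERT-based pipeline, and the posts and timestamps are placeholders.

```python
# Sketch: dynamic topic modeling to track discourse themes over time.
# A real run needs thousands of posts; these placeholders show the API only.
from bertopic import BERTopic

docs = ["post text ...", "another post ..."]  # forum posts (placeholder)
timestamps = ["2020-01", "2021-06"]           # one per post (placeholder)

topic_model = BERTopic()
topics, _ = topic_model.fit_transform(docs)
topics_over_time = topic_model.topics_over_time(docs, timestamps)
print(topics_over_time.head())  # topic frequency per time bin
```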
arXiv Detail & Related papers (2024-02-06T20:34:49Z)
- Explainable Identification of Hate Speech towards Islam using Graph Neural Networks
This study introduces a novel paradigm using Graph Neural Networks (GNNs) to identify and explain hate speech towards Islam.
Our model leverages GNNs to understand the context and patterns of hate speech by connecting texts via pretrained NLP-generated word embeddings.
This highlights the potential of GNNs in combating online hate speech and fostering a safer, more inclusive online environment.
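A minimal sketch of the GNN setup this entry describes, assuming nodes carry pretrained text embeddings and edges link similar texts; the graph, features, and two-layer GCN are illustrative assumptions, not the paper's architecture.

```python
# Sketch: a two-layer GCN over a text graph where nodes carry pretrained
# embeddings and edges connect similar texts. Graph construction is assumed.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

num_nodes, emb_dim = 4, 16
x = torch.randn(num_nodes, emb_dim)  # stand-in for pretrained text embeddings
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])  # similarity edges (assumed)
data = Data(x=x, edge_index=edge_index)

class HateGCN(torch.nn.Module):
    def __init__(self, in_dim: int, hidden: int = 32, classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)  # per-node hate/not-hate logits

model = HateGCN(emb_dim)
logits = model(data.x, data.edge_index)
print(logits.softmax(-1))
```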
arXiv Detail & Related papers (2023-11-02T04:01:04Z)
- From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
We present the first large-scale computational investigation of dogwhistles.
We develop a typology of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles, and analyze their usage in historical U.S. politicians' speeches.
We show that harmful content containing dogwhistles avoids toxicity detection, highlighting online risks of such coded language.
arXiv Detail & Related papers (2023-05-26T18:00:57Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
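A minimal sketch of simulating word camouflage with leetspeak-style substitutions; the substitution map and `camouflage` helper are illustrative assumptions, not the paper's released tooling.

```python
# Sketch: simulate simple word camouflage (leetspeak-style substitutions)
# of the kind used to evade keyword-based moderation. Map is illustrative.
import random

LEET = {"a": ["4", "@"], "e": ["3"], "i": ["1", "!"], "o": ["0"], "s": ["5", "$"]}

def camouflage(word: str, rate: float = 0.7) -> str:
    """Randomly replace characters to mimic moderation-evasion spellings."""
    out = []
    for ch in word:
        subs = LEET.get(ch.lower())
        out.append(random.choice(subs) if subs and random.random() < rate else ch)
    return "".join(out)

print(camouflage("moderation"))  # e.g. "m0d3r4t10n"
```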
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Understanding and Detecting Hateful Content using Contrastive Learning
This work contributes to research efforts to detect and understand hateful content on the Web.
We devise a methodology to identify a set of Antisemitic and Islamophobic hateful textual phrases.
We then use OpenAI's CLIP to identify images that are highly similar to our Antisemitic/Islamophobic textual phrases.
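A hedged sketch of the CLIP retrieval step described above, using the Hugging Face CLIP implementation; the phrases and image path are placeholders, and this mirrors only the text-image similarity step, not the full methodology.

```python
# Sketch: rank an image against a set of textual phrases by CLIP similarity.
# Phrases and "candidate.jpg" are placeholder assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

phrases = ["placeholder hateful phrase 1", "placeholder hateful phrase 2"]
image = Image.open("candidate.jpg").convert("RGB")

inputs = processor(text=phrases, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Similarity logits between the image and each phrase, as a distribution.
print(outputs.logits_per_image.softmax(dim=-1))
```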
arXiv Detail & Related papers (2022-01-21T18:22:29Z)
- "Stop Asian Hate!": Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic
The COVID-19 pandemic has fueled a surge in anti-Asian xenophobia and prejudice.
We create and annotate a corpus of Twitter tweets using two experimental approaches to explore anti-Asian abusive and hate speech.
arXiv Detail & Related papers (2021-12-04T06:55:19Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.