PACO: Provocation Involving Action, Culture, and Oppression
- URL: http://arxiv.org/abs/2303.12808v1
- Date: Sun, 19 Mar 2023 04:39:36 GMT
- Title: PACO: Provocation Involving Action, Culture, and Oppression
- Authors: Vaibhav Garg, Ganning Xu, and Munindar P. Singh
- Abstract summary: In India, people identify with a particular group based on certain attributes such as religion.
The same religious groups are often provoked against each other.
Previous studies show the role of provocation in increasing tensions between India's two prominent religious groups: Hindus and Muslims.
- Score: 13.70482307997736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In India, people identify with a particular group based on certain attributes
such as religion. The same religious groups are often provoked against each
other. Previous studies show the role of provocation in increasing tensions
between India's two prominent religious groups: Hindus and Muslims. With the
advent of the Internet, such provocation also surfaced on social media
platforms such as WhatsApp.
By leveraging an existing dataset of Indian WhatsApp posts, we identified
three categories of provoking sentences against Indian Muslims. Further, we
labeled 7,000 sentences for three provocation categories and called this
dataset PACO. We leveraged PACO to train a model that can identify provoking
sentences in a WhatsApp post. Our best model, a fine-tuned RoBERTa, achieved an
average AUC of 0.851 under five-fold cross-validation.
Automatically identifying provoking sentences could stop such text from
reaching the masses, and could prevent discrimination or violence against the
targeted religious group.
Further, we studied provocative speech through a pragmatic lens by identifying
the dialog acts and impoliteness super-strategies used against the religious
group.
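The evaluation protocol described above (average AUC over five-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a TF-IDF plus logistic-regression baseline stands in for their fine-tuned RoBERTa, and the toy sentences are invented purely for demonstration.

```python
# Minimal sketch of average AUC over five-fold cross-validation.
# A TF-IDF + logistic-regression baseline stands in for the authors'
# fine-tuned RoBERTa; the toy sentences below are invented.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline

def average_auc(texts, labels, n_splits=5, seed=0):
    """Mean ROC-AUC over stratified k-fold cross-validation."""
    labels = np.asarray(labels)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in skf.split(texts, labels):
        # Fresh model per fold: vectorize text, then fit a linear classifier.
        model = make_pipeline(TfidfVectorizer(),
                              LogisticRegression(max_iter=1000))
        model.fit([texts[i] for i in train_idx], labels[train_idx])
        # Probability of the positive ("provoking") class for the held-out fold.
        scores = model.predict_proba([texts[i] for i in test_idx])[:, 1]
        aucs.append(roc_auc_score(labels[test_idx], scores))
    return float(np.mean(aucs))

# Toy data: 1 = provoking, 0 = not provoking (invented examples).
texts = [
    "they are ruining our country", "drive them out of the town",
    "never trust those people", "they deserve what is coming",
    "our group must strike back now",
    "the weather is lovely today", "let us meet for lunch",
    "the match starts at five", "please share the class notes",
    "happy birthday to my friend",
] * 2
labels = ([1] * 5 + [0] * 5) * 2

print(average_auc(texts, labels))
```

Stratified folds keep the class balance of the 7,000-sentence dataset roughly constant across splits, which matters when the provoking class is a minority.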
Related papers
- SADAS: A Dialogue Assistant System Towards Remediating Norm Violations
in Bilingual Socio-Cultural Conversations [56.31816995795216]
Socially-Aware Dialogue Assistant System (SADAS) is designed to ensure that conversations unfold with respect and understanding.
Our system's novel architecture includes: (1) identifying the categories of norms present in the dialogue, (2) detecting potential norm violations, (3) evaluating the severity of these violations, and (4) implementing targeted remedies to rectify the breaches.
arXiv Detail & Related papers (2024-01-29T08:54:21Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus of 4.5 × 10^6 authored texts derived from public sources.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
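The support-base setup above can be illustrated with a nearest-centroid attribution sketch. This is a hypothetical toy, not STAR itself: random vectors stand in for real transformer style embeddings, and all sizes and names are invented for illustration.

```python
# Hypothetical sketch of nearest-centroid authorship attribution with
# style embeddings, loosely mirroring a support base of 8 documents per
# author. Random vectors stand in for real transformer embeddings.
import numpy as np

rng = np.random.default_rng(0)
n_authors, support_docs, dim = 5, 8, 16

# One latent "style" vector per author; each document embedding is a
# noisy copy of its author's style vector.
styles = rng.normal(size=(n_authors, dim))
support = styles[:, None, :] + 0.1 * rng.normal(size=(n_authors, support_docs, dim))
queries = styles + 0.1 * rng.normal(size=(n_authors, dim))

# Average the support documents into one centroid per author, then
# attribute each query document to the author whose centroid has the
# highest cosine similarity.
centroids = support.mean(axis=1)
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
queries = queries / np.linalg.norm(queries, axis=1, keepdims=True)
predicted = (queries @ centroids.T).argmax(axis=1)

accuracy = (predicted == np.arange(n_authors)).mean()
print(accuracy)
```

Averaging the support documents denoises the per-document embeddings, which is why a handful of documents per author suffices in this kind of setup.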
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages [0.6526824510982802]
This work delves into the domain of hate speech detection, placing specific emphasis on three low-resource Indian languages: Bengali, Assamese, and Gujarati.
The challenge is framed as a text classification task, aimed at discerning whether a tweet contains offensive or non-offensive content.
We fine-tuned pre-trained BERT and SBERT models to evaluate their effectiveness in identifying hate speech.
arXiv Detail & Related papers (2023-10-03T17:53:09Z)
- SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration [75.62448812759968]
SQuARe is a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses.
The dataset was constructed leveraging HyperCLOVA in a human-in-the-loop manner based on real news headlines.
arXiv Detail & Related papers (2023-05-28T11:51:20Z)
- Religion and Spirituality on Social Media in the Aftermath of the Global Pandemic [59.930429668324294]
We analyse the sudden change in religious activities in two ways: through a questionnaire we created and delivered, and through analysis of Twitter data.
Importantly, we also analyse the temporal variation in this process over a three-month period (July–September 2020).
arXiv Detail & Related papers (2022-12-11T18:41:02Z)
- AlexU-AIC at Arabic Hate Speech 2022: Contrast to Classify [2.9220076568786326]
We present our submission to the Arabic Hate Speech 2022 Shared Task Workshop (OSACT5 2022) using the associated Arabic Twitter dataset.
For offensive Tweets, sub-task B focuses on detecting whether the tweet is hate speech or not.
For hate speech Tweets, sub-task C focuses on detecting the fine-grained type of hate speech among six different classes.
arXiv Detail & Related papers (2022-07-18T12:33:51Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study the output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- "Stop Asian Hate!": Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic [2.5227595609842206]
The COVID-19 pandemic has fueled a surge in anti-Asian xenophobia and prejudice.
We create and annotate a corpus of tweets using two experimental approaches to explore anti-Asian abusive and hate speech.
arXiv Detail & Related papers (2021-12-04T06:55:19Z)
- "Short is the Road that Leads from Fear to Hate": Fear Speech in Indian WhatsApp Groups [8.682669903229165]
We perform the first large-scale study of fear speech across thousands of public WhatsApp groups discussing politics in India.
We build models to classify fear speech and observe that current state-of-the-art NLP models do not perform well at this task.
arXiv Detail & Related papers (2021-02-07T18:14:16Z)
- Persistent Anti-Muslim Bias in Large Language Models [13.984800635696566]
GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias.
We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation.
For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to "money" in 5% of test cases.
arXiv Detail & Related papers (2021-01-14T18:41:55Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.