Automated Sentiment and Hate Speech Analysis of Facebook Data by
Employing Multilingual Transformer Models
- URL: http://arxiv.org/abs/2301.13668v1
- Date: Tue, 31 Jan 2023 14:37:04 GMT
- Title: Automated Sentiment and Hate Speech Analysis of Facebook Data by
Employing Multilingual Transformer Models
- Authors: Ritumbra Manuvie and Saikat Chatterjee
- Abstract summary: We analyse the statistical distribution of hateful and negative sentiment contents within a representative Facebook dataset.
We employ state-of-the-art, open-source XLM-T multilingual transformer-based language models to perform sentiment and hate speech analysis.
- Score: 15.823923425516078
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, there has been a heightened consensus within academia and in
the public discourse that Social Media Platforms (SMPs) amplify the spread of
hateful and negative sentiment content. Researchers have identified how hateful
content, political propaganda, and targeted messaging contributed to real-world
harms including insurrections against democratically elected governments,
genocide, and breakdown of social cohesion due to heightened negative discourse
towards certain communities in parts of the world. To counter these issues,
SMPs have created semi-automated systems that can help identify toxic speech.
In this paper we analyse the statistical distribution of hateful and negative
sentiment contents within a representative Facebook dataset (n = 604,703)
scraped from 648 public Facebook pages which identify themselves as
proponents (and followers) of far-right Hindutva actors. These pages were
identified manually using keyword searches on Facebook and on CrowdTangle and
classified as far-right Hindutva pages based on page names, page descriptions,
and discourses shared on these pages. We employ state-of-the-art, open-source
XLM-T multilingual transformer-based language models to perform sentiment and
hate speech analysis of the textual contents shared on these pages over a
period of 5.5 years. The results show the statistical distributions of the
predicted sentiment and hate speech labels, top actors, and top page
categories. We further discuss the benchmark performances and limitations of
these pre-trained language models.
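
The aggregation step described in the abstract (label distributions and top pages) can be sketched as follows. This is a minimal illustration, not the authors' code: the page names and predicted labels are hypothetical, the helper names `label_distribution` and `top_pages` are made up for this example, and no XLM-T model is actually run here.

```python
from collections import Counter

# Hypothetical (page_name, predicted_label) pairs. In practice the labels
# would come from a classifier such as XLM-T, which is not run here.
predictions = [
    ("page_a", "negative"),
    ("page_a", "negative"),
    ("page_b", "neutral"),
    ("page_b", "negative"),
    ("page_c", "positive"),
    ("page_a", "neutral"),
]


def label_distribution(preds):
    """Return the share of each predicted label as a dict of fractions."""
    counts = Counter(label for _, label in preds)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def top_pages(preds, label, k=2):
    """Return the k pages with the most posts predicted as `label`."""
    counts = Counter(page for page, lab in preds if lab == label)
    return counts.most_common(k)


distribution = label_distribution(predictions)
top_negative = top_pages(predictions, "negative")
print(distribution)
print(top_negative)
```

The same two helpers would apply to hate speech labels by passing a different list of predictions; only the counting is shown, so any conclusions still depend on the quality of the underlying classifier.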
Related papers
- Analysis and Detection of Multilingual Hate Speech Using Transformer
Based Deep Learning [7.332311991395427]
As the prevalence of hate speech increases online, the demand for automated detection as an NLP task is increasing.
In this work, the proposed method uses a transformer-based model to detect hate speech on social media such as Twitter, Facebook, WhatsApp, and Instagram.
The gold-standard datasets were collected from the researchers Zeerak Talat, Sara Tonelli, Melanie Siegel, and Rezaul Karim.
The success rate of the proposed model for hate speech detection is higher than the existing baseline and state-of-the-art models, with accuracy on the Bengali dataset of 89%, in English: 91%, in German
arXiv Detail & Related papers (2024-01-19T20:40:23Z)
- Lexical Squad@Multimodal Hate Speech Event Detection 2023: Multimodal
Hate Speech Detection using Fused Ensemble Approach [0.23020018305241333]
We present our novel ensemble learning approach for detecting hate speech, by classifying text-embedded images into two labels, namely "Hate Speech" and "No Hate Speech".
Our proposed ensemble model yielded promising results, with 75.21 and 74.96 as accuracy and F-1 score, respectively.
arXiv Detail & Related papers (2023-09-23T12:06:05Z)
- Topological Data Mapping of Online Hate Speech, Misinformation, and
General Mental Health: A Large Language Model Based Study [6.803493330690884]
Recent advances in machine learning and large language models have made such an analysis possible.
In this study, we collected thousands of posts from carefully selected communities on the social media site Reddit.
We performed various machine-learning classifications based on embeddings in order to understand the role of hate speech/misinformation in various communities.
arXiv Detail & Related papers (2023-09-22T15:10:36Z)
- The Face of Populism: Examining Differences in Facial Emotional
Expressions of Political Leaders Using Machine Learning [57.70351255180495]
We apply a deep-learning-based computer-vision algorithm to a sample of 220 YouTube videos depicting political leaders from 15 different countries.
We observe statistically significant differences in the average score of expressed negative emotions between groups of leaders with varying degrees of populist rhetoric.
arXiv Detail & Related papers (2023-04-19T18:32:49Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a
Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Countering Malicious Content Moderation Evasion in Online Social
Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter.
This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
- Nipping in the Bud: Detection, Diffusion and Mitigation of Hate Speech
on Social Media [21.47216483704825]
This article presents methodological challenges that hinder building automated hate mitigation systems.
We discuss a series of our proposed solutions to limit the spread of hate speech on social media.
arXiv Detail & Related papers (2022-01-04T03:44:46Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias
Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media
during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Echo Chambers on Social Media: A comparative analysis [64.2256216637683]
We introduce an operational definition of echo chambers and perform a massive comparative analysis on 1B pieces of content produced by 1M users on four social media platforms.
We infer the leaning of users about controversial topics and reconstruct their interaction networks by analyzing different features.
We find support for the hypothesis that platforms implementing news feed algorithms like Facebook may elicit the emergence of echo-chambers.
arXiv Detail & Related papers (2020-04-20T20:00:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.