HP-BERT: A framework for longitudinal study of Hinduphobia on social media via LLMs
- URL: http://arxiv.org/abs/2501.05482v1
- Date: Tue, 07 Jan 2025 23:22:05 GMT
- Title: HP-BERT: A framework for longitudinal study of Hinduphobia on social media via LLMs
- Authors: Ashutosh Singh, Rohitash Chandra
- Abstract summary: We present an abuse detection and sentiment analysis framework that offers a longitudinal analysis of Hinduphobia on X (Twitter) during and after the COVID-19 pandemic.
This framework assesses the prevalence and intensity of Hinduphobic discourse, capturing elements such as derogatory jokes and racist remarks.
Our study encompasses approximately 27.4 million tweets from six countries, including Australia, Brazil, India, Indonesia, Japan, and the United Kingdom.
- Score: 1.9376226959814953
- Abstract: During the COVID-19 pandemic, community tensions intensified, fuelling Hinduphobic sentiments and discrimination against individuals of Hindu descent within India and worldwide. Large language models (LLMs) have become prominent in natural language processing (NLP) tasks and social media analysis, enabling longitudinal studies of platforms like X (formerly Twitter) for specific issues during COVID-19. We present an abuse detection and sentiment analysis framework that offers a longitudinal analysis of Hinduphobia on X (Twitter) during and after the COVID-19 pandemic. This framework assesses the prevalence and intensity of Hinduphobic discourse, capturing elements such as derogatory jokes and racist remarks through sentiment analysis and abuse detection from pre-trained and fine-tuned LLMs. Additionally, we curate and publish a "Hinduphobic COVID-19 X (Twitter) Dataset" of 8,000 tweets annotated for Hinduphobic abuse detection, which is used to fine-tune a BERT model, resulting in the development of the Hinduphobic BERT (HP-BERT) model. We then further fine-tune HP-BERT using the SenWave dataset for multi-label sentiment analysis. Our study encompasses approximately 27.4 million tweets from six countries, including Australia, Brazil, India, Indonesia, Japan, and the United Kingdom. Our findings reveal a strong correlation between spikes in COVID-19 cases and surges in Hinduphobic rhetoric, highlighting how political narratives, misinformation, and targeted jokes contributed to communal polarisation. These insights provide valuable guidance for developing strategies to mitigate communal tensions in future crises, both locally and globally. We advocate implementing automated monitoring and removal of such content on social media to curb divisive discourse.
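The abstract reports a correlation between COVID-19 case spikes and surges in Hinduphobic rhetoric, but does not say how it was computed. The following is a minimal sketch of one way such a comparison could be set up, assuming per-tweet abuse scores from a classifier, a 0.5 decision threshold, monthly aggregation, and Pearson correlation. All of these choices, the function names, and the toy data are illustrative assumptions, not the authors' method or results.

```python
from collections import Counter
from math import sqrt


def monthly_flagged_counts(tweets, threshold=0.5):
    """Count, per month, the tweets whose abuse score meets the threshold.

    `tweets` is an iterable of (month, score) pairs; the score is assumed
    to be a classifier probability in [0, 1] (hypothetical input format).
    """
    counts = Counter()
    for month, score in tweets:
        if score >= threshold:
            counts[month] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy data, invented purely for illustration (not taken from the paper).
tweets = [
    ("2020-03", 0.90), ("2020-03", 0.20),
    ("2020-04", 0.80), ("2020-04", 0.70), ("2020-04", 0.10),
    ("2020-05", 0.60), ("2020-05", 0.95), ("2020-05", 0.55), ("2020-05", 0.30),
]
cases = {"2020-03": 100, "2020-04": 200, "2020-05": 300}

months = sorted(cases)
flagged = monthly_flagged_counts(tweets)
r = pearson([cases[m] for m in months], [flagged[m] for m in months])
print(f"Pearson r between monthly cases and flagged tweets: {r:.3f}")
```

A rank-based measure such as Spearman correlation may be a reasonable alternative when case counts and tweet volumes grow at very different scales; the abstract does not indicate which measure was used.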
Related papers
- A longitudinal sentiment analysis of Sinophobia during COVID-19 using large language models [3.3741245091336083]
The COVID-19 pandemic has exacerbated xenophobia, particularly Sinophobia, leading to widespread discrimination against individuals of Chinese descent.
We present a sentiment analysis framework utilising LLMs for longitudinal sentiment analysis of the Sinophobic sentiments expressed in X (Twitter) during the COVID-19 pandemic.
The results show a significant correlation between the spikes in Sinophobic tweets, Sinophobic sentiments and surges in COVID-19 cases, revealing that the evolution of the pandemic influenced public sentiment and the prevalence of Sinophobic discourse.
arXiv Detail & Related papers (2024-08-29T23:39:11Z)
- Exploring a Hybrid Deep Learning Framework to Automatically Discover Topic and Sentiment in COVID-19 Tweets [2.3940819037450987]
COVID-19 has created a major public health problem worldwide and other problems such as economic crisis, unemployment, mental distress, etc.
The pandemic is deadly worldwide and affects many people not only through infection but also through problems, stress, wonder, fear, resentment, and hatred.
Twitter is a highly influential social media platform and a significant source of health-related information, news, opinion and public sentiment.
arXiv Detail & Related papers (2023-12-02T16:58:17Z)
- Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis [44.17106903728264]
Most hate speech datasets neglect the cultural diversity within a single language.
To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset.
Only 56.2% of the posts in CREHate achieve consensus among all countries, with the highest pairwise label difference rate of 26%.
arXiv Detail & Related papers (2023-08-31T13:14:47Z)
- What goes on inside rumour and non-rumour tweets and their reactions: A Psycholinguistic Analyses [58.75684238003408]
Psycholinguistic analyses of social media text are vital for drawing meaningful conclusions to mitigate misinformation.
This research contributes by performing an in-depth psycholinguistic analysis of rumours related to various kinds of events.
arXiv Detail & Related papers (2021-11-09T07:45:11Z)
- When a crisis strikes: Emotion analysis and detection during COVID-19 [96.03869351276478]
We present CovidEmo, 1K tweets labeled with emotions.
We examine how well large pre-trained language models generalize across domains and crises.
arXiv Detail & Related papers (2021-07-23T04:07:14Z)
- COVID-19 sentiment analysis via deep learning during the rise of novel cases [0.5156484100374059]
We use deep learning based language models via long short-term memory (LSTM) recurrent neural networks for sentiment analysis on Twitter.
We find that the majority of the tweets have been positive with high levels of optimism during the rise of the COVID-19 cases in India.
We find that the optimistic and joking tweets mostly dominated the monthly tweets and there was a much lower number of negative sentiments expressed.
arXiv Detail & Related papers (2021-04-05T04:31:19Z)
- Country Image in COVID-19 Pandemic: A Case Study of China [79.17323278601869]
Country image has a profound influence on international relations and economic development.
In the worldwide outbreak of COVID-19, countries and their people display different reactions.
In this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset.
arXiv Detail & Related papers (2020-09-12T15:54:51Z)
- Analyzing COVID-19 on Online Social Media: Trends, Sentiments and Emotions [44.92240076313168]
We analyze the affective trajectories of the American people and the Chinese people based on Twitter and Weibo posts between January 20th, 2020 and May 11th 2020.
By contrasting two very different countries, China and the United States, we reveal sharp differences in people's views on COVID-19 in different cultures.
Our study provides a computational approach to unveiling public emotions and concerns on the pandemic in real-time, which would potentially help policy-makers better understand people's needs and thus make optimal policy.
arXiv Detail & Related papers (2020-05-29T09:24:38Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- #Coronavirus or #Chinesevirus?!: Understanding the negative sentiment reflected in Tweets with racist hashtags across the development of COVID-19 [1.0878040851638]
We focus on the analysis of negative sentiment reflected in tweets marked with racist hashtags.
We propose a stage-based approach to capture how the negative sentiment changes along with the three development stages of COVID-19.
arXiv Detail & Related papers (2020-05-17T11:15:50Z)
- Psychometric Analysis and Coupling of Emotions Between State Bulletins and Twitter in India during COVID-19 Infodemic [7.428097999824421]
The COVID-19 infodemic has been spreading faster than the pandemic itself.
Since social media is the largest source of information, managing the infodemic requires mitigating misinformation.
Twitter alone has seen a sharp 45% increase in the usage of its curated events page.
arXiv Detail & Related papers (2020-05-12T01:51:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.