HP-BERT: A framework for longitudinal study of Hinduphobia on social media via language models
- URL: http://arxiv.org/abs/2501.05482v2
- Date: Sun, 05 Oct 2025 10:40:38 GMT
- Title: HP-BERT: A framework for longitudinal study of Hinduphobia on social media via language models
- Authors: Ashutosh Singh, Rohitash Chandra
- Abstract summary: We present a computational framework for analyzing anti-Hindu sentiment (Hinduphobia) during the COVID-19 period. We develop the Hinduphobic BERT (HP-BERT) model using this dataset and achieve 94.72% accuracy. This study provides evidence of social media-based religious discrimination during a COVID-19 crisis.
- Score: 6.261384274136677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: During the COVID-19 pandemic, community tensions intensified, contributing to discriminatory sentiments against various religious groups, including Hindu communities. Recent advances in language models have shown promise for social media analysis with potential for longitudinal studies of social media platforms, such as X (Twitter). We present a computational framework for analyzing anti-Hindu sentiment (Hinduphobia) during the COVID-19 period, introducing an abuse detection and sentiment analysis approach for longitudinal analysis on X. We curate and release a "Hinduphobic COVID-19 XDataset" containing 8,000 annotated and manually verified tweets. We then develop the Hinduphobic BERT (HP-BERT) model using this dataset and achieve 94.72% accuracy, outperforming baseline Transformer-based language models. The model incorporates multi-label sentiment analysis capabilities through additional fine-tuning. Our analysis encompasses approximately 27.4 million tweets from six countries, including Australia, Brazil, India, Indonesia, Japan, and the United Kingdom. Statistical analysis reveals moderate correlations (r = 0.312-0.428) between COVID-19 case increases and Hinduphobic content volume, highlighting how pandemic-related stress may contribute to discriminatory discourse. This study provides evidence of social media-based religious discrimination during a COVID-19 crisis.
Related papers
- The Shifting Landscape of Vaccine Discourse: Insights From a Decade of Pre- to Post-COVID-19 Vaccine Posts on Social Media [61.575555311964706]
We analyze how English speakers talk about vaccines on social media to understand the evolving narrative around vaccines in social media posts. We first introduce a novel dataset comprising 18.7 million curated posts on vaccine discourse from 2013 to 2022. Our analysis shows that the COVID-19 pandemic led to complex shifts in X users' sentiment and discourse around vaccines.
arXiv Detail & Related papers (2025-11-20T22:28:59Z) - SenWave: A Fine-Grained Multi-Language Sentiment Analysis Dataset Sourced from COVID-19 Tweets [42.98177831933239]
SenWave is a novel fine-grained multi-language sentiment analysis dataset specifically designed for analyzing COVID-19 tweets. The dataset comprises 10,000 annotated tweets each in English and Arabic, along with 30,000 translated tweets in Spanish, French, and Italian, derived from English tweets. Our study provides an in-depth analysis of the evolving emotional landscape across languages, countries, and topics, revealing significant insights over time.
arXiv Detail & Related papers (2025-10-09T13:38:05Z) - A longitudinal sentiment analysis of Sinophobia during COVID-19 using large language models [3.3741245091336083]
The COVID-19 pandemic has exacerbated xenophobia, particularly Sinophobia, leading to widespread discrimination against individuals of Chinese descent.
We present a sentiment analysis framework utilising LLMs for longitudinal sentiment analysis of the Sinophobic sentiments expressed in X (Twitter) during the COVID-19 pandemic.
The results show a significant correlation between the spikes in Sinophobic tweets, Sinophobic sentiments and surges in COVID-19 cases, revealing that the evolution of the pandemic influenced public sentiment and the prevalence of Sinophobic discourse.
arXiv Detail & Related papers (2024-08-29T23:39:11Z) - Exploring a Hybrid Deep Learning Framework to Automatically Discover Topic and Sentiment in COVID-19 Tweets [2.3940819037450987]
COVID-19 has created a major public health problem worldwide, along with other problems such as economic crisis, unemployment, and mental distress.
The pandemic has been deadly worldwide and has affected many people not only through infection but also through problems such as stress, wonder, fear, resentment, and hatred.
Twitter is a highly influential social media platform and a significant source of health-related information, news, opinion and public sentiment.
arXiv Detail & Related papers (2023-12-02T16:58:17Z) - Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis [44.17106903728264]
Most hate speech datasets neglect the cultural diversity within a single language.
To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset.
Only 56.2% of the posts in CREHate achieve consensus among all countries, with the highest pairwise label difference rate of 26%.
arXiv Detail & Related papers (2023-08-31T13:14:47Z) - Sensing the Pulse of the Pandemic: Geovisualizing the Demographic Disparities of Public Sentiment toward COVID-19 through Social Media [9.906180010952406]
Social media use varies across demographics, with younger users being more prevalent compared to older populations.
This study explores solutions and the demographic biases in social media analysis through a case study estimating public sentiment about COVID-19 using Twitter data.
arXiv Detail & Related papers (2023-03-17T02:59:46Z) - What goes on inside rumour and non-rumour tweets and their reactions: A Psycholinguistic Analyses [58.75684238003408]
Psycholinguistic analyses of social media text are vital for drawing meaningful conclusions to mitigate misinformation.
This research contributes by performing an in-depth psycholinguistic analysis of rumours related to various kinds of events.
arXiv Detail & Related papers (2021-11-09T07:45:11Z) - AraCOVID19-SSD: Arabic COVID-19 Sentiment and Sarcasm Detection Dataset [0.0]
This paper builds and releases AraCOVID19-SSD, a manually annotated Arabic COVID-19 sarcasm and sentiment detection dataset containing 5,162 tweets.
Many of these users employ sarcasm to convey their intended meaning in a humorous and indirect way, making it hard for computer-based applications to automatically understand and identify their goal and the level of harm that they can inflict.
arXiv Detail & Related papers (2021-10-05T11:24:24Z) - American Twitter Users Revealed Social Determinants-related Oral Health Disparities amid the COVID-19 Pandemic [72.44305630014534]
We collected oral health-related tweets during the COVID-19 pandemic from 9,104 Twitter users across 26 states.
Women and younger adults (19-29) are more likely to talk about oral health problems.
People from counties at a higher risk of COVID-19 talk more about tooth decay/gum bleeding and chipped tooth/tooth break.
arXiv Detail & Related papers (2021-09-16T01:10:06Z) - When a crisis strikes: Emotion analysis and detection during COVID-19 [96.03869351276478]
We present CovidEmo, 1K tweets labeled with emotions.
We examine how well large pre-trained language models generalize across domains and crises.
arXiv Detail & Related papers (2021-07-23T04:07:14Z) - COVID-19 sentiment analysis via deep learning during the rise of novel cases [0.5156484100374059]
We use deep learning based language models via long short-term memory (LSTM) recurrent neural networks for sentiment analysis on Twitter.
We find that the majority of the tweets have been positive with high levels of optimism during the rise of the COVID-19 cases in India.
We find that the optimistic and joking tweets mostly dominated the monthly tweets and there was a much lower number of negative sentiments expressed.
arXiv Detail & Related papers (2021-04-05T04:31:19Z) - COVID-19 Tweets Analysis through Transformer Language Models [0.0]
In this study, we perform an in-depth, fine-grained sentiment analysis of tweets in COVID-19.
A trained transformer model is able to correctly predict, with high accuracy, the tone of a tweet.
We then leverage this model for predicting tones for 200,000 tweets on COVID-19.
arXiv Detail & Related papers (2021-02-27T12:06:33Z) - Country Image in COVID-19 Pandemic: A Case Study of China [79.17323278601869]
Country image has a profound influence on international relations and economic development.
In the worldwide outbreak of COVID-19, countries and their people display different reactions.
In this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset.
arXiv Detail & Related papers (2020-09-12T15:54:51Z) - On Analyzing Antisocial Behaviors Amid COVID-19 Pandemic [5.900114841365645]
Despite the gravity of the issue, very few studies have examined online antisocial behaviors amid the COVID-19 pandemic.
In this paper, we fill the research gap by collecting and annotating a large dataset of over 40 million COVID-19 related tweets.
We also conduct an empirical analysis of our annotated dataset and find that new abusive lexicons were introduced amid the COVID-19 pandemic.
arXiv Detail & Related papers (2020-07-21T11:11:35Z) - TICO-19: the Translation Initiative for Covid-19 [112.5601530395345]
The Translation Initiative for COvid-19 (TICO-19) has made test and development data available to AI and MT researchers in 35 different languages.
The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set.
arXiv Detail & Related papers (2020-07-03T16:26:17Z) - Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z) - Analyzing COVID-19 on Online Social Media: Trends, Sentiments and Emotions [44.92240076313168]
We analyze the affective trajectories of the American people and the Chinese people based on Twitter and Weibo posts between January 20th, 2020 and May 11th, 2020.
By contrasting two very different countries, China and the United States, we reveal sharp differences in people's views on COVID-19 in different cultures.
Our study provides a computational approach to unveiling public emotions and concerns about the pandemic in real time, which could help policy-makers better understand people's needs and thus make optimal policy decisions.
arXiv Detail & Related papers (2020-05-29T09:24:38Z) - Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z) - #Coronavirus or #Chinesevirus?!: Understanding the negative sentiment reflected in Tweets with racist hashtags across the development of COVID-19 [1.0878040851638]
We focus on the analysis of negative sentiment reflected in tweets marked with racist hashtags.
We propose a stage-based approach to capture how the negative sentiment changes along with the three development stages of COVID-19.
arXiv Detail & Related papers (2020-05-17T11:15:50Z) - Psychometric Analysis and Coupling of Emotions Between State Bulletins and Twitter in India during COVID-19 Infodemic [7.428097999824421]
The COVID-19 infodemic has been spreading faster than the pandemic itself.
Since social media is the largest source of information, managing the infodemic requires mitigating misinformation.
Twitter alone has seen a sharp 45% increase in the usage of its curated events page.
arXiv Detail & Related papers (2020-05-12T01:51:07Z) - Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society [37.9389191670008]
Fighting the COVID-19 infodemic has been declared one of the most important focus areas of the World Health Organization.
We release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis.
arXiv Detail & Related papers (2020-04-30T18:04:20Z)