Topological Data Mapping of Online Hate Speech, Misinformation, and
General Mental Health: A Large Language Model Based Study
- URL: http://arxiv.org/abs/2309.13098v1
- Date: Fri, 22 Sep 2023 15:10:36 GMT
- Authors: Andrew Alexander, Hongbin Wang
- Abstract summary: Recent advances in machine learning and large language models have made large-scale analysis of social media posts possible.
In this study, we collected thousands of posts from carefully selected communities on the social media site Reddit.
We performed various machine-learning classifications based on embeddings in order to understand the role of hate speech/misinformation in various communities.
- Score: 6.803493330690884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of social media has led to increased concern over its potential
to propagate hate speech and misinformation, which, in addition to contributing
to prejudice and discrimination, have been suspected of playing a role in
increasing social violence and crime in the United States. While literature
has shown the existence of an association between posting hate speech and
misinformation online and certain personality traits of posters, the general
relationship and relevance of online hate speech/misinformation in the context
of overall psychological wellbeing of posters remain elusive. One difficulty
lies in the lack of data analytics tools capable of analyzing the massive
volume of social media posts to uncover the underlying hidden links. Recent
progress in machine learning and large language models such as ChatGPT has
made such an analysis possible. In this study, we
collected thousands of posts from carefully selected communities on the social
media site Reddit. We then used OpenAI's GPT-3 to derive embeddings of these
posts, which are high-dimensional real-valued vectors that presumably
represent the hidden semantics of posts. We then performed various
machine-learning classifications based on these embeddings in order to
understand the role of hate speech/misinformation in various communities.
Finally, a topological data analysis (TDA) was applied to the embeddings to
obtain a visual map connecting online hate speech, misinformation, various
psychiatric disorders, and general mental health.
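The abstract does not say which TDA technique produced the visual map. As one plausible illustration only, the sketch below implements a minimal version of the Mapper construction, a common way to turn a point cloud into a graph: cover the range of a filter function with overlapping intervals, cluster the points inside each interval, and link clusters that share points. The toy 2-D points stand in for high-dimensional post embeddings, and the function names, the choice of filter (the first coordinate), and all parameter values are assumptions made for this example, not the authors' pipeline.

```python
import math
from itertools import combinations


def single_linkage_clusters(points, members, eps):
    """Connected components of the graph joining members closer than eps."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        stack, cluster = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            cluster.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters)


def mapper_graph(points, filter_values, n_intervals, overlap, eps):
    """Return (nodes, edges) of a Mapper graph.

    nodes: list of clusters, each a sorted list of point indices.
    edges: set of (u, v) node-id pairs whose clusters share a point.
    """
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals or 1.0  # guard against a constant filter
    pad = overlap * length                   # extension on each side
    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage_clusters(points, members, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if set(nodes[u]) & set(nodes[v])}
    return nodes, edges


# Hypothetical toy data: a chain of six points and a distant pair.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0),
          (5.0, 0.0), (20.0, 0.0), (21.0, 0.0)]
filter_values = [p[0] for p in points]  # assumed filter: first coordinate

nodes, edges = mapper_graph(points, filter_values,
                            n_intervals=7, overlap=0.5, eps=1.5)
print(nodes)
print(sorted(edges))
```

On this toy input the chain becomes three linked nodes and the distant pair becomes an isolated node, which is the kind of connected-versus-separate structure a map over real embeddings would be read for. Real analyses typically use a dedicated TDA library, a more informative filter, and a stronger clustering step.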
Related papers
- MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection [2.433983268807517]
Hate speech poses significant social, psychological, and occasionally physical threats to targeted individuals and communities.
Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training.
We scrutinized over 60 datasets, selectively integrating those pertinent into MetaHate.
Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models.
arXiv Detail & Related papers (2024-01-12T11:54:53Z)
- Depression detection in social media posts using affective and social norm features [84.12658971655253]
We propose a deep architecture for depression detection from social media posts.
We incorporate profanity and morality features of posts and words in our architecture using a late fusion scheme.
The inclusion of the proposed features yields state-of-the-art results in both settings.
arXiv Detail & Related papers (2023-03-24T21:26:27Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter.
This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
- Adherence to Misinformation on Social Media Through Socio-Cognitive and Group-Based Processes [79.79659145328856]
We argue that when misinformation proliferates, this happens because the social media environment enables adherence to misinformation.
We make the case that polarization and misinformation adherence are closely tied.
arXiv Detail & Related papers (2022-06-30T12:34:24Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Analysis of Online Toxicity Detection Using Machine Learning Approaches [6.548580592686076]
Social media and the internet have become an integral part of how people spread and consume information.
Almost half of the population is using social media to express their views and opinions.
Online hate speech is one of the drawbacks of social media nowadays, which needs to be controlled.
arXiv Detail & Related papers (2021-04-23T04:29:13Z)
- DeepHate: Hate Speech Detection via Multi-Faceted Text Representations [8.192671048046687]
DeepHate is a novel deep learning model that combines multi-faceted text representations such as word embeddings, sentiments, and topical information.
We conduct extensive experiments and evaluate DeepHate on three large publicly available real-world datasets.
arXiv Detail & Related papers (2021-03-14T16:11:30Z)
- Investigating Deep Learning Approaches for Hate Speech Detection in Social Media [20.974715256618754]
The misuse of freedom of expression has led to the increase of various cyber crimes and anti-social activities.
Hate speech is one such issue that needs to be addressed very seriously, as it could otherwise threaten the integrity of the social fabric.
In this paper, we propose deep learning approaches utilizing various embeddings for detecting various types of hate speech in social media.
arXiv Detail & Related papers (2020-05-29T17:28:46Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Echo Chambers on Social Media: A comparative analysis [64.2256216637683]
We introduce an operational definition of echo chambers and perform a massive comparative analysis on 1B pieces of content produced by 1M users on four social media platforms.
We infer the leaning of users about controversial topics and reconstruct their interaction networks by analyzing different features.
We find support for the hypothesis that platforms implementing news feed algorithms like Facebook may elicit the emergence of echo-chambers.
arXiv Detail & Related papers (2020-04-20T20:00:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.