Related papers: Predicting Hateful Discussions on Reddit using Graph Transformer Networks and Communal Context

Predicting Hateful Discussions on Reddit using Graph Transformer Networks and Communal Context

URL: http://arxiv.org/abs/2301.04248v1
Date: Tue, 10 Jan 2023 23:47:13 GMT
Title: Predicting Hateful Discussions on Reddit using Graph Transformer Networks and Communal Context
Authors: Liam Hebert, Lukasz Golab, Robin Cohen
Abstract summary: We propose a system to predict harmful discussions on social media platforms. Our solution uses contextual deep language models and integrates state-of-the-art Graph Transformer Networks. We evaluate our approach on 333,487 Reddit discussions from various communities.
Score: 9.4337569682766
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a system to predict harmful discussions on social media platforms. Our solution uses contextual deep language models and proposes the novel idea of integrating state-of-the-art Graph Transformer Networks to analyze all conversations that follow an initial post. This framework also supports adapting to future comments as the conversation unfolds. In addition, we study whether a community-specific analysis of hate speech leads to more effective detection of hateful discussions. We evaluate our approach on 333,487 Reddit discussions from various communities. We find that community-specific modeling improves performance two-fold and that models which capture wider-discussion context improve accuracy by 28\% (35\% for the most hateful content) compared to limited context models.

Related papers

Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights [10.188075925271471]
Abusive language in social media conversations depends on the conversational context, characterized by the content and topology of preceding comments. Traditional Abusive Language Detection models often overlook this context, which can lead to unreliable performance metrics. Recent Natural Language Processing (NLP) methods that integrate conversational context often depend on limited and simplified representations, and report inconsistent results. We propose a novel approach that utilize graph neural networks (GNNs) to model social media conversations as graphs, where nodes represent comments, and edges capture reply structures.
arXiv Detail & Related papers (2025-04-02T17:03:37Z)
Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics. Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
Lexical Squad@Multimodal Hate Speech Event Detection 2023: Multimodal Hate Speech Detection using Fused Ensemble Approach [0.23020018305241333]
We present our novel ensemble learning approach for detecting hate speech, by classifying text-embedded images into two labels, namely "Hate Speech" and "No Hate Speech" Our proposed ensemble model yielded promising results with 75.21 and 74.96 as accuracy and F-1 score (respectively)
arXiv Detail & Related papers (2023-09-23T12:06:05Z)
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media [6.3756400508728515]
We present the Multi-Modal Discussion Transformer (mDT), a novel methodfor detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the contextual relationships in the discussion surrounding a comment and grounding the interwoven fusion layers that combine text and image embeddings instead of processing modalities separately.
arXiv Detail & Related papers (2023-07-18T14:57:12Z)
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations. We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content [8.393770595114763]
We offer a detailed qualitative analysis of this solution for hate speech detection in social networks. A key insight is that the focus on reasoning about the concept of context positions us well to be able to support multi-modal analysis of online posts. We conclude with a reflection on how the problem we are addressing relates especially well to the theme of dynamic change.
arXiv Detail & Related papers (2023-01-25T23:32:32Z)
Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter. This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages. We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language. We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document. We also simultaneously cluster users, removing the need for post-hoc cluster estimation. Our method performs as well as -- or better -- than traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads. We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z)
LIFI: Towards Linguistically Informed Frame Interpolation [66.05105400951567]
We try to solve this problem by using several deep learning video generation algorithms to generate missing frames. We release several datasets to test computer vision video generation models of their speech understanding.
arXiv Detail & Related papers (2020-10-30T05:02:23Z)
Detecting Online Hate Speech: Approaches Using Weak Supervision and Network Embedding Models [2.3322477552758234]
We propose a weak supervision deep learning model that quantitatively uncover hateful users and (ii) present a novel qualitative analysis to uncover indirect hateful conversations. We evaluate our model on 19.2M posts and show that our weak supervision model outperforms the baseline models in identifying indirect hateful interactions. We also analyze a multilayer network, constructed from two types of user interactions in Gab(quote and reply) and interaction scores from the weak supervision model as edge weights, to predict hateful users.
arXiv Detail & Related papers (2020-07-24T18:13:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.