IsamasRed: A Public Dataset Tracking Reddit Discussions on Israel-Hamas Conflict
- URL: http://arxiv.org/abs/2401.08202v2
- Date: Tue, 16 Apr 2024 20:38:37 GMT
- Title: IsamasRed: A Public Dataset Tracking Reddit Discussions on Israel-Hamas Conflict
- Authors: Kai Chen, Zihao He, Keith Burghardt, Jingxin Zhang, Kristina Lerman
- Abstract summary: We present a meticulously compiled dataset, IsamasRed, spanning from August 2023 to November 2023.
Our initial analysis of the dataset, examining topics, controversy, emotional and moral language trends over time, highlights the emotionally charged and complex nature of the discourse.
This dataset aims to enrich the understanding of online discussions, shedding light on the complex interplay between ideology, sentiment, and community engagement in digital spaces.
- Score: 13.92311040225417
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The conflict between Israel and Palestinians significantly escalated after the October 7, 2023 Hamas attack, capturing global attention. To understand the public discourse on this conflict, we present a meticulously compiled dataset, IsamasRed, comprising nearly 400,000 conversations and over 8 million comments from Reddit, spanning from August 2023 to November 2023. We introduce an innovative keyword extraction framework leveraging a large language model to effectively identify pertinent keywords, ensuring a comprehensive data collection. Our initial analysis of the dataset, examining topics, controversy, emotional and moral language trends over time, highlights the emotionally charged and complex nature of the discourse. This dataset aims to enrich the understanding of online discussions, shedding light on the complex interplay between ideology, sentiment, and community engagement in digital spaces.
Related papers
- Mapping Controversies Using Artificial Intelligence: An Analysis of the Hamas-Israel Conflict on YouTube [2.5357699888548724]
This article analyzes the Hamas-Israel controversy through 253,925 Spanish-language YouTube comments posted between October 2023 and January 2024.
Adopting an interdisciplinary approach, the study combines the analysis of controversies from Science and Technology Studies with advanced computational methodologies.
Results show a predominance of pro-Palestinian comments, although pro-Israeli and anti-Palestinian comments received more "likes."
arXiv Detail & Related papers (2025-04-16T15:27:57Z) - Talking Point based Ideological Discourse Analysis in News Events [62.18747509565779]
We propose a framework motivated by the theory of ideological discourse analysis to analyze news articles related to real-world events.
Our framework represents the news articles using a relational structure - talking points, which captures the interaction between entities, their roles, and media frames along with a topic of discussion.
We evaluate our framework's ability to generate these perspectives through automated tasks - ideology and partisan classification tasks, supplemented by human validation.
arXiv Detail & Related papers (2025-04-10T02:52:34Z) - Social media polarization during conflict: Insights from an ideological stance dataset on Israel-Palestine Reddit comments [0.0]
This study analyzed 9,969 Reddit comments related to the Israel-Palestine conflict, collected between October 2023 and August 2024.
Various approaches, including machine learning, pre-trained language models, neural networks, and prompt engineering strategies were employed to classify these stances.
arXiv Detail & Related papers (2025-02-01T12:26:11Z) - Israel-Hamas war through Telegram, Reddit and Twitter [9.020777839880571]
The study will cover an analysis of the related discussions in relation to the different participants in the conflict and the sentiment represented in those discussions.
We apply a volume analysis across the three datasets, entity extraction and then proceed to BERT topic analysis.
Our findings hint at polarized narratives as the hallmark of how political factions and outsiders mold public opinion.
arXiv Detail & Related papers (2025-01-30T08:20:26Z) - Multi-Platform Aggregated Dataset of Online Communities (MADOC) [64.45797970830233]
MADOC aggregates and standardizes data from Bluesky, Koo, Reddit, and Voat (2012-2024), containing 18.9 million posts, 236 million comments, and 23.1 million unique users.
The dataset enables comparative studies of toxic behavior evolution across platforms through standardized interaction records and sentiment analysis.
arXiv Detail & Related papers (2025-01-22T14:02:11Z) - Quantifying Extreme Opinions on Reddit Amidst the 2023 Israeli-Palestinian Conflict [3.2430260063115224]
This study investigates the dynamics of extreme opinions on social media during the 2023 Israeli-Palestinian conflict.
A lexicon-based, unsupervised methodology was developed to measure "extreme opinions."
The analysis identifies significant peaks in extremism scores that correspond to pivotal real-life events.
arXiv Detail & Related papers (2024-12-14T17:52:28Z) - The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition [64.5207572897806]
The Multimodal Sentiment Analysis Challenge (MuSe) 2024 addresses two contemporary multimodal affect and sentiment analysis problems.
In the Social Perception Sub-Challenge (MuSe-Perception), participants will predict 16 different social attributes of individuals.
The Cross-Cultural Humor Detection Sub-Challenge (MuSe-Humor) dataset expands upon the Passau Spontaneous Football Coach Humor dataset.
arXiv Detail & Related papers (2024-06-11T22:26:20Z) - Coordinated Activity Modulates the Behavior and Emotions of Organic Users: A Case Study on Tweets about the Gaza Conflict [9.58546889761175]
This research delves into the interaction dynamics between coordinated (malicious) entities and organic (regular) users on Twitter amidst the Gaza conflict.
Through the analysis of approximately 3.5 million tweets from over 1.3 million users, our study uncovers that coordinated users significantly impact the information landscape.
Results highlight the critical need for vigilance and a nuanced understanding of information manipulation on social media platforms.
arXiv Detail & Related papers (2024-02-08T18:07:17Z) - MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection [2.433983268807517]
Hate speech poses significant social, psychological, and occasionally physical threats to targeted individuals and communities.
Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training.
We scrutinized over 60 datasets, selectively integrating those pertinent into MetaHate.
Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models.
arXiv Detail & Related papers (2024-01-12T11:54:53Z) - Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z) - Uncovering Hidden Connections: Iterative Search and Reasoning for Video-grounded Dialog [83.63849872250651]
Video-grounded dialog requires profound understanding of both dialog history and video content for accurate response generation.
We present an iterative search and reasoning framework, which consists of a textual encoder, a visual encoder, and a generator.
arXiv Detail & Related papers (2023-10-11T07:37:13Z) - Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict [5.0337106694127725]
The Russo-Ukrainian conflict has been a subject of intense media coverage worldwide.
We present a novel multimedia dataset that focuses on this topic by collecting and processing tweets posted by news or media companies on social media across the globe.
We collected tweets from February 2022 to May 2023 to acquire approximately 1.5 million tweets in 60 different languages along with their images.
arXiv Detail & Related papers (2023-06-22T13:52:31Z) - EDSA-Ensemble: an Event Detection Sentiment Analysis Ensemble Architecture [63.85863519876587]
Using Sentiment Analysis to understand the polarity of each message belonging to an event, as well as the entire event, can help to better understand the general and individual feelings of significant trends and the dynamics on online social networks.
We propose a new ensemble architecture, EDSA-Ensemble, that uses Event Detection and Sentiment Analysis to improve the detection of the polarity for current events from Social Media.
arXiv Detail & Related papers (2023-01-30T11:56:08Z) - Codes, Patterns and Shapes of Contemporary Online Antisemitism and Conspiracy Narratives -- an Annotation Guide and Labeled German-Language Dataset in the Context of COVID-19 [0.0]
Antisemitic and conspiracy theory content on the Internet makes data-driven algorithmic approaches essential.
We develop an annotation guide for antisemitic and conspiracy theory online content in the context of the COVID-19 pandemic.
We provide working definitions, including specific forms of antisemitism such as encoded and post-Holocaust antisemitism.
arXiv Detail & Related papers (2022-10-13T10:32:39Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulty and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z) - Sentiment Analysis of Political Tweets for Israel using Machine Learning [0.0]
This research proposes an analytical study using Israeli political Twitter data to interpret public opinion towards the Palestinian-Israeli conflict.
The attitudes of ethnic groups and opinion leaders in the form of tweets are analyzed using Machine Learning algorithms.
arXiv Detail & Related papers (2022-04-12T12:07:43Z)