Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict
- URL: http://arxiv.org/abs/2306.12886v2
- Date: Sun, 7 Apr 2024 16:01:11 GMT
- Title: Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict
- Authors: Sherzod Hakimov, Gullal S. Cheema
- Abstract summary: The Russo-Ukrainian conflict has been a subject of intense media coverage worldwide.
We present a novel multimedia dataset that focuses on this topic by collecting and processing tweets posted by news or media companies on social media across the globe.
We collected tweets from February 2022 to May 2023 to acquire approximately 1.5 million tweets in 60 different languages along with their images.
- Score: 5.0337106694127725
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The ongoing Russo-Ukrainian conflict has been a subject of intense media coverage worldwide. Understanding the global narrative surrounding this topic is crucial for researchers who aim to gain insights into its multifaceted dimensions. In this paper, we present a novel multimedia dataset that focuses on this topic by collecting and processing tweets posted by news or media companies on social media across the globe. We collected tweets from February 2022 to May 2023 to acquire approximately 1.5 million tweets in 60 different languages along with their images. Each entry in the dataset is accompanied by processed tags, allowing for the identification of entities, stances, textual or visual concepts, and sentiment. This multimedia dataset is a valuable resource for researchers investigating the global narrative surrounding the ongoing conflict from various angles, such as who the prominent entities are, what stances are taken, where these stances originate, and how the textual and visual concepts related to the event are portrayed.
Related papers
- MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval [57.891157692501345]
MultiVENT 2.0 is a large-scale, multilingual event-centric video retrieval benchmark.
It features a collection of more than 218,000 news videos and 3,906 queries targeting specific world events.
Preliminary results show that state-of-the-art vision-language models struggle significantly with this task.
arXiv Detail & Related papers (2024-10-15T13:56:34Z) - More than Memes: A Multimodal Topic Modeling Approach to Conspiracy Theories on Telegram [0.0]
We explore the potential of multimodal topic modeling for analyzing conspiracy theories in German-language Telegram channels.
We analyze a corpus of 40,000 Telegram messages posted in October 2023 in 571 German-language Telegram channels.
arXiv Detail & Related papers (2024-10-11T09:10:26Z) - Narratives at Conflict: Computational Analysis of News Framing in Multilingual Disinformation Campaigns [22.05782818652258]
We explore how multilingual framing of the same issue differs systematically.
We use eight years of Russia-backed disinformation campaigns, spanning 8k news articles in 4 languages targeting 15 countries.
We find that disinformation campaigns consistently and intentionally favor specific framing, depending on the target language of the audience.
arXiv Detail & Related papers (2024-08-24T18:51:47Z) - News and Misinformation Consumption in Europe: A Longitudinal Cross-Country Perspective [49.1574468325115]
This study investigated information consumption in four European countries.
It analyzed three years of Twitter activity from news outlet accounts in France, Germany, Italy, and the UK.
Results indicate that reliable sources dominate the information landscape, although unreliable content is still present across all countries.
arXiv Detail & Related papers (2023-11-09T16:22:10Z) - Measuring COVID-19 Related Media Consumption on Twitter [2.746705315038595]
Social media platforms have provided essential updates regarding the pandemic.
Online communications with media outlets remain unexplored on an international scale.
This thesis presents the first-of-its-kind study on media consumption on COVID-19 across countries.
arXiv Detail & Related papers (2023-09-16T04:01:45Z) - Bias or Diversity? Unraveling Fine-Grained Thematic Discrepancy in U.S. News Headlines [63.52264764099532]
We use a large dataset of 1.8 million news headlines from major U.S. media outlets spanning from 2014 to 2022.
We quantify the fine-grained thematic discrepancy related to four prominent topics - domestic politics, economic issues, social issues, and foreign affairs.
Our findings indicate that on domestic politics and social issues, the discrepancy can be attributed to a certain degree of media bias.
arXiv Detail & Related papers (2023-03-28T03:31:37Z) - Automated multilingual detection of Pro-Kremlin propaganda in newspapers and Telegram posts [5.886782001771578]
The full-scale conflict between the Russian Federation and Ukraine generated an unprecedented amount of news articles and social media data.
This study analyses how the media affected and mirrored public opinion during the first month of the war using news articles and Telegram news channels in Ukrainian, Russian, Romanian and English.
We propose and compare two methods of multilingual automated pro-Kremlin propaganda identification, based on Transformers and linguistic features.
arXiv Detail & Related papers (2023-01-25T14:25:37Z) - Computational Assessment of Hyperpartisanship in News Titles [55.92100606666497]
We first adopt a human-guided machine learning framework to develop a new dataset for hyperpartisan news title detection.
Overall, the Right media tends to use proportionally more hyperpartisan titles.
We identify three major topics including foreign issues, political systems, and societal issues that are suggestive of hyperpartisanship in news titles.
arXiv Detail & Related papers (2023-01-16T05:56:58Z) - Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z) - Uncovering the structure of the French media ecosystem [0.0]
We collect data about the production and circulation of online news stories in France over the course of one year.
A block model of the structure shows the systematic rejection of counter-informational press in a separate cluster.
We conclude that the French media ecosystem does not suffer from the same level of polarization as the US media ecosystem.
arXiv Detail & Related papers (2021-07-26T09:51:54Z) - VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles [63.32111010686954]
We propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO).
The main challenge in this task is to jointly model the temporal dependency of the video with the semantic meaning of the article.
We propose a Dual-Interaction-based Multimodal Summarizer (DIMS), consisting of a dual interaction module and multimodal generator.
arXiv Detail & Related papers (2020-10-12T02:19:16Z)