VoynaSlov: A Data Set of Russian Social Media Activity during the 2022
Ukraine-Russia War
- URL: http://arxiv.org/abs/2205.12382v1
- Date: Tue, 24 May 2022 21:59:10 GMT
- Authors: Chan Young Park, Julia Mendelsohn, Anjalie Field, Yulia Tsvetkov
- Abstract summary: We describe a new data set called VoynaSlov which contains 21M+ Russian-language social media activities.
We scraped the data from two major platforms that are widely used in Russia: Twitter and VKontakte (VK), a Russian social media platform based in Saint Petersburg commonly referred to as "Russian Facebook".
The main differences that distinguish our data from previously released data related to the ongoing war are its focus on Russian media and consideration of state-affiliation.
- Score: 36.18151945028956
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this report, we describe a new data set called VoynaSlov which contains
21M+ Russian-language social media activities (i.e. tweets, posts, comments)
made by Russian media outlets and by the general public during the time of war
between Ukraine and Russia. We scraped the data from two major platforms that
are widely used in Russia: Twitter and VKontakte (VK), a Russian social media
platform based in Saint Petersburg commonly referred to as "Russian Facebook".
We provide descriptions of our data collection process and data statistics that
compare state-affiliated and independent Russian media, and also the two
platforms, VK and Twitter. The main differences that distinguish our data from
previously released data related to the ongoing war are its focus on Russian
media and consideration of state-affiliation as well as the inclusion of data
from VK, which is more suitable than Twitter for understanding Russian public
sentiment considering its wide use within Russia. We hope our data set can
facilitate future research on information warfare and ultimately enable the
reduction and prevention of disinformation and opinion manipulation campaigns.
The data set is available at https://github.com/chan0park/VoynaSlov and will be
regularly updated as we continuously collect more data.
Related papers
- News and Misinformation Consumption in Europe: A Longitudinal
Cross-Country Perspective [49.1574468325115]
This study investigated information consumption in four European countries.
It analyzed three years of Twitter activity from news outlet accounts in France, Germany, Italy, and the UK.
Results indicate that reliable sources dominate the information landscape, although unreliable content is still present across all countries.
(2023-11-09T16:22:10Z)
- Detecting Human Rights Violations on Social Media during Russia-Ukraine
War [1.2599533416395763]
The present-day Russia-Ukraine military conflict has exposed the pivotal role of social media in enabling the transparent and unbridled sharing of information.
Social media platforms have the potential to serve as effective instruments for monitoring and documenting Human Rights Violations (HRV).
Our research focuses on the analysis of data from Telegram, the leading social media platform for reading independent news in post-Soviet regions.
(2023-06-06T12:59:03Z)
- Russo-Ukrainian War: Prediction and explanation of Twitter suspension [47.61306219245444]
This study focuses on the Twitter suspension mechanism and the analysis of shared content and features of user accounts that may lead to this.
We have obtained a dataset containing 107.7M tweets, originating from 9.8 million users, using the Twitter API.
Our results reveal scam campaigns taking advantage of trending topics regarding the Russia-Ukraine conflict for Bitcoin fraud, spam, and advertisement campaigns.
(2023-06-06T08:41:02Z)
- Happenstance: Utilizing Semantic Search to Track Russian State Media
Narratives about the Russo-Ukrainian War On Reddit [5.567674129101803]
We study Russian state media narratives touted by the Russian government to English-speaking audiences.
We first perform sentence-level topic analysis using the large language model MPNet on articles published by ten different pro-Russian propaganda websites.
Using MPNet and a semantic search algorithm, we map these subreddits' comments to the set of topics extracted from our set of Russian websites.
(2022-05-28T16:54:53Z)
- Twitter Dataset on the Russo-Ukrainian War [68.713984286035]
We have initiated an ongoing dataset acquisition from the Twitter API.
The dataset has reached 57.3 million tweets, originating from 7.7 million users.
We apply an initial volume and sentiment analysis; the dataset can also support further exploratory work on topic analysis, hate speech, and propaganda recognition, or help reveal potentially malicious entities such as botnets.
(2022-04-07T12:33:06Z)
- Tweets in Time of Conflict: A Public Dataset Tracking the Twitter
Discourse on the War Between Ukraine and Russia [14.000779544058144]
On February 24, 2022, Russia invaded Ukraine. In the days that followed, reports kept flooding in, from laymen to news anchors, of a conflict quickly escalating into war.
Russia faced immediate backlash and condemnation from the world at large.
While the war continues to contribute to an ongoing humanitarian and refugee crisis in Ukraine, a second battlefield has emerged in the online space.
(2022-03-14T20:52:02Z)
- A Weibo Dataset for the 2022 Russo-Ukrainian Crisis [59.258530429699924]
We present the Russia-Ukraine Crisis Weibo dataset, with over 3.5M user posts and comments in the first release.
Our data is available at https://github.com/yrf1/Russia-Ukraine_weibo_dataset.
(2022-03-09T19:06:04Z)
- Russian trolls speaking Russian: Regional Twitter operations and MH17 [68.8204255655161]
In 2018, Twitter released data on accounts identified as Russian trolls.
We analyze the Russian-language operations of these trolls.
We find that trolls' information campaign on the MH17 crash was the largest in terms of tweet count.
(2020-05-13T19:48:12Z)