How Many Tweets Do We Need?: Efficient Mining of Short-Term Polarized
Topics on Twitter: A Case Study From Japan
- URL: http://arxiv.org/abs/2211.16305v1
- Date: Tue, 29 Nov 2022 15:41:30 GMT
- Title: How Many Tweets Do We Need?: Efficient Mining of Short-Term Polarized
Topics on Twitter: A Case Study From Japan
- Authors: Tomoki Fukuma, Koki Noda, Hiroki Kumagai, Hiroki Yamamoto, Yoshiharu
Ichikawa, Kyosuke Kambe, Yu Maubuchi and Fujio Toriumi
- Abstract summary: We develop a method to identify polarized topics on Twitter in a short-term period, namely 12 hours.
We also develop a prediction method using machine learning techniques to estimate the polarization level using randomly collected tweets.
Our work is the first to predict the polarization level of the topics with low-resource tweets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, social media has been criticized for yielding polarization.
Identifying emerging disagreements and growing polarization is important for
journalists to create alerts and provide more balanced coverage. While recent
studies have shown the existence of polarization on social media, they
primarily focused on limited topics such as politics with a large volume of
data collected in the long term, especially over months or years. While these
findings are helpful, they are too late to create an alert immediately. To
address this gap, we develop a domain-agnostic mining method to identify
polarized topics on Twitter in a short-term period, namely 12 hours. As a
result, we find that daily Japanese news-related topics in early 2022 were
polarized by 31.6% within a 12-hour range. We also found that they tend to
construct information diffusion networks with a relatively high average degree,
and half of the tweets are created by a relatively small number of people.
However, it is very costly and impractical to collect a large volume of tweets
daily on many topics and monitor the polarization due to the limitations of the
Twitter API. To make it more cost-efficient, we also develop a prediction
method using machine learning techniques to estimate the polarization level
using randomly collected tweets leveraging the network information. Extensive
experiments show a significant saving in collection costs compared to baseline
methods. In particular, our approach achieves an F-score of 0.85 while requiring 4,000
tweets, a 4x saving over the baseline. To the best of our knowledge, our work is
the first to predict the polarization level of the topics with low-resource
tweets. Our findings have profound implications for the news media, allowing
journalists to detect and disseminate polarizing information quickly and
efficiently.
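The abstract mentions two descriptive statistics of polarized topics: the average degree of the information diffusion network and the observation that half of the tweets come from a relatively small number of people. The sketch below shows one plausible way to compute both. It is not the authors' implementation; the function names, the undirected edge-list representation, and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def average_degree(edges):
    """Average degree of an undirected network.

    `edges` is a list of distinct (user_a, user_b) pairs, e.g. retweet
    relations. This representation is an assumption, not the paper's format.
    """
    nodes = {user for edge in edges for user in edge}
    if not nodes:
        return 0.0
    # Every edge adds one to the degree of each of its two endpoints.
    return 2 * len(edges) / len(nodes)


def user_share_for_half_of_tweets(authors):
    """Smallest fraction of users who together wrote at least half the tweets.

    `authors` holds one author id per tweet (hypothetical input format).
    """
    counts = sorted(Counter(authors).values(), reverse=True)
    if not counts:
        return 0.0
    total = sum(counts)
    running = 0
    # Add users from most to least active until half the tweets are covered.
    for rank, tweet_count in enumerate(counts, start=1):
        running += tweet_count
        if 2 * running >= total:
            return rank / len(counts)
    return 1.0


# Toy data, invented purely to exercise the functions.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
toy_authors = ["a"] * 5 + ["b"] * 3 + ["c", "d"]

print(average_degree(toy_edges))
print(user_share_for_half_of_tweets(toy_authors))
```

A low value from `user_share_for_half_of_tweets` would indicate that a small group of accounts dominates the discussion, which is the pattern the abstract describes for polarized topics.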
Related papers
- Sampled Datasets Risk Substantial Bias in the Identification of Political Polarization on Social Media [34.192291430580454]
We study the structural polarization of the Polish political debate on Twitter over a 24-hour period.
Large samples can be representative of the whole political discussion on a platform, but small samples consistently fail to accurately reflect the true structure of polarization online.
arXiv Detail & Related papers (2024-06-28T12:13:29Z)
- Bias or Diversity? Unraveling Fine-Grained Thematic Discrepancy in U.S. News Headlines [63.52264764099532]
We use a large dataset of 1.8 million news headlines from major U.S. media outlets spanning from 2014 to 2022.
We quantify the fine-grained thematic discrepancy related to four prominent topics - domestic politics, economic issues, social issues, and foreign affairs.
Our findings indicate that on domestic politics and social issues, the discrepancy can be attributed to a certain degree of media bias.
arXiv Detail & Related papers (2023-03-28T03:31:37Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Adherence to Misinformation on Social Media Through Socio-Cognitive and Group-Based Processes [79.79659145328856]
We argue that when misinformation proliferates, this happens because the social media environment enables adherence to misinformation.
We make the case that polarization and misinformation adherence are closely tied.
arXiv Detail & Related papers (2022-06-30T12:34:24Z)
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias [54.89737992911079]
We propose a new task, a neutral summary generation from multiple news headlines of the varying political spectrum.
One of the most interesting observations is that generation models can hallucinate not only factually inaccurate or unverifiable content, but also politically biased content.
arXiv Detail & Related papers (2022-04-11T07:06:01Z)
- Reaching the bubble may not be enough: news media role in online political polarization [58.720142291102135]
A way of reducing polarization would be by distributing cross-partisan news among individuals with distinct political orientations.
This study investigates whether this holds in the context of nationwide elections in Brazil and Canada.
arXiv Detail & Related papers (2021-09-18T11:34:04Z)
- Uncovering the structure of the French media ecosystem [0.0]
We collect data about the production and circulation of online news stories in France over the course of one year.
A block model of the structure shows the systematic rejection of counter-informational press in a separate cluster.
We conclude that the French media ecosystem does not suffer from the same level of polarization as the US media ecosystem.
arXiv Detail & Related papers (2021-07-26T09:51:54Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Exploring Polarization of Users Behavior on Twitter During the 2019 South American Protests [15.065938163384235]
We explore polarization on Twitter in a different context, namely the protest that paralyzed several countries in the South American region in 2019.
By leveraging users' endorsement of politicians' tweets and hashtag campaigns with defined stances towards the protest (for or against), we construct a weakly labeled stance dataset with millions of users.
We find empirical evidence of the "filter bubble" phenomenon during the event, as we not only show that the user bases are homogeneous in terms of stance, but the probability that a user transitions from media of different clusters is low.
arXiv Detail & Related papers (2021-04-05T07:13:18Z)
- High-level Approaches to Detect Malicious Political Activity on Twitter [0.0]
We investigate a data snapshot taken in May 2020, with around 5 million accounts and over 120 million tweets.
The analyzed time period stretches from August 2019 to May 2020, with a focus on the Portuguese elections of October 6th, 2019.
We learn that Twitter's suspension patterns are not adequate to the type of political trolling found in the Portuguese Twittersphere.
arXiv Detail & Related papers (2021-02-04T22:54:44Z)
- Towards A Sentiment Analyzer for Low-Resource Languages [0.0]
This research aims to analyze the sentiment of users towards a particular trending topic that was being actively and massively discussed at the time.
We use the hashtag #kpujangancurang, which was a trending topic during the Indonesian presidential election in 2019.
This research uses the RapidMiner tool to generate the Twitter data and compares Naive Bayes, K-Nearest Neighbor, Decision Tree, and Multi-Layer Perceptron classification methods for classifying the sentiment of the Twitter data.
arXiv Detail & Related papers (2020-11-12T13:50:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.