Semi-supervised Stance Detection of Tweets Via Distant Network
Supervision
- URL: http://arxiv.org/abs/2201.00614v2
- Date: Wed, 5 Jan 2022 07:06:46 GMT
- Title: Semi-supervised Stance Detection of Tweets Via Distant Network
Supervision
- Authors: Subhabrata Dutta, Samiya Caur, Soumen Chakrabarti, Tanmoy Chakraborty
- Abstract summary: Homophily properties over the social network provide a strong signal of coarse-grained user-level stance.
We present SANDS, a new semi-supervised stance detector.
SANDS achieves a macro-F1 score of 0.55 (0.49) on US (India)-based datasets.
- Score: 32.86421107987556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting and labeling stance in social media text is strongly motivated by
hate speech detection, poll prediction, engagement forecasting, and concerted
propaganda detection. Today's best neural stance detectors need large volumes
of training data, which is difficult to curate given the fast-changing
landscape of social media text and issues on which users opine. Homophily
properties over the social network provide a strong signal of coarse-grained
user-level stance. But semi-supervised approaches for tweet-level stance
detection fail to properly leverage homophily. In light of this, we present
SANDS, a new semi-supervised stance detector. SANDS starts from very few
labeled tweets. It builds multiple deep feature views of tweets. It also uses a
distant supervision signal from the social network to provide a surrogate loss
signal to the component learners. We prepare two new tweet datasets comprising
over 236,000 politically tinted tweets from two demographics (US and India)
posted by over 87,000 users, their follower-followee graph, and over 8,000
tweets annotated by linguists. SANDS achieves a macro-F1 score of 0.55 (0.49)
on US (India)-based datasets, outperforming 17 baselines (including variants of
SANDS) substantially, particularly for minority stance labels and noisy text.
Numerous ablation experiments on SANDS disentangle the dynamics of textual and
network-propagated stance signals.
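The macro-F1 figures quoted above average per-class F1 scores without weighting by class frequency, which is why the metric rewards performance on minority stance labels. A minimal sketch of the metric in plain Python (illustrative only, not the authors' evaluation code; the stance labels below are hypothetical):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores,
    so minority stance labels count as much as majority ones."""
    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical gold and predicted stance labels for four tweets.
gold = ["pro", "anti", "neutral", "pro"]
pred = ["pro", "anti", "pro", "pro"]
print(round(macro_f1(gold, pred), 4))  # ≈ 0.6 (per-class F1: 0.8, 1.0, 0.0)
```

Note how the missed "neutral" class pulls the macro average down to 0.6 even though 3 of 4 tweets are labeled correctly; a frequency-weighted average would mask that failure.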
Related papers
- Understanding writing style in social media with a supervised
contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 × 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z) - DoubleH: Twitter User Stance Detection via Bipartite Graph Neural
Networks [9.350629400940493]
We crawl a large-scale dataset of the 2020 US presidential election and automatically label all users using manually tagged hashtags.
We propose a bipartite graph neural network model, DoubleH, which aims to better utilize homogeneous and heterogeneous information in user stance detection tasks.
arXiv Detail & Related papers (2023-01-20T19:20:10Z) - Tweets2Stance: Users stance detection exploiting Zero-Shot Learning
Algorithms on Tweets [0.06372261626436675]
The aim of the study is to predict the stance of a Party p with regard to each statement s, exploiting what the party's Twitter account wrote on Twitter.
Results obtained from multiple experiments show that Tweets2Stance can correctly predict the stance with a general minimum MAE of 1.13, which is a great achievement considering the task complexity.
arXiv Detail & Related papers (2022-04-22T14:00:11Z) - Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily, enabling two abusive behaviors.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z) - Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal
Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z) - Identification of Twitter Bots based on an Explainable ML Framework: the
US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted, using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z) - Combining exogenous and endogenous signals with a semi-supervised
co-attention network for early detection of COVID-19 fake tweets [14.771202995527315]
During COVID-19, tweets carrying misinformation should be flagged and neutralized early to mitigate the damage.
Most existing methods for early detection of fake news assume access to sufficient propagation information and large sets of labeled tweets.
We present ENDEMIC, a novel early detection model which leverages exogenous and endogenous signals related to tweets.
arXiv Detail & Related papers (2021-04-12T10:01:44Z) - TweepFake: about Detecting Deepfake Tweets [3.3482093430607254]
Deep neural models can generate coherent, non-trivial and human-like text samples.
Social bots can write plausible deepfake messages, hoping to contaminate public debate.
We collect the first dataset of real deepfake tweets, TweepFake.
arXiv Detail & Related papers (2020-07-31T19:01:13Z) - GCAN: Graph-aware Co-Attention Networks for Explainable Fake News
Detection on Social Media [14.010916616909743]
Given the source short-text tweet and the corresponding sequence of retweet users without text comments, we aim at predicting whether the source tweet is fake or not.
We develop a novel neural network-based model, Graph-aware Co-Attention Networks (GCAN), to achieve the goal.
arXiv Detail & Related papers (2020-04-24T10:42:49Z) - Leveraging Multi-Source Weak Social Supervision for Early Detection of
Fake News [67.53424807783414]
Social media has greatly enabled people to participate in online activities at an unprecedented rate.
This unrestricted access also exacerbates the spread of misinformation and fake news online, which can cause confusion and chaos unless detected early for mitigation.
We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances.
Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
arXiv Detail & Related papers (2020-04-03T18:26:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.