Combining exogenous and endogenous signals with a semi-supervised
co-attention network for early detection of COVID-19 fake tweets
- URL: http://arxiv.org/abs/2104.05321v1
- Date: Mon, 12 Apr 2021 10:01:44 GMT
- Title: Combining exogenous and endogenous signals with a semi-supervised
co-attention network for early detection of COVID-19 fake tweets
- Authors: Rachit Bansal, William Scott Paka, Nidhi, Shubhashis Sengupta, Tanmoy
Chakraborty
- Abstract summary: During COVID-19, tweets with misinformation should be flagged and neutralized in their early stages to mitigate the damages.
Most existing methods for early detection of fake news assume that enough propagation information is available for a large set of labeled tweets.
We present ENDEMIC, a novel early detection model which leverages exogenous and endogenous signals related to tweets.
- Score: 14.771202995527315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fake tweets are observed to be ever-increasing, demanding immediate
countermeasures to combat their spread. During COVID-19, tweets with
misinformation should be flagged and neutralized in their early stages to
mitigate the damages. Most of the existing methods for early detection of fake
news assume to have enough propagation information for large labeled tweets --
which may not be an ideal setting for cases like COVID-19 where both aspects
are largely absent. In this work, we present ENDEMIC, a novel early detection
model which leverages exogenous and endogenous signals related to tweets, while
learning on limited labeled data. We first develop a novel dataset, called ECTF
for early COVID-19 Twitter fake news, with additional behavioral test sets to
validate early detection. We build a heterogeneous graph with
follower-followee, user-tweet, and tweet-retweet connections and train a graph
embedding model to aggregate propagation information. Graph embeddings and
contextual features constitute endogenous, while time-relative web-scraped
information constitutes exogenous signals. ENDEMIC is trained in a
semi-supervised fashion, overcoming the challenge of limited labeled data. We
propose a co-attention mechanism to fuse signal representations optimally.
Experimental results on ECTF, PolitiFact, and GossipCop show that ENDEMIC is
highly reliable in detecting early fake tweets, outperforming nine
state-of-the-art methods significantly.
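
The abstract does not spell out how the co-attention fusion is computed. As a rough illustration only, the sketch below fuses two sets of signal vectors using a plain dot-product affinity and softmax attention in both directions. The function names (`co_attention`, `mean_pool`), the unparameterized affinity, and the final mean-pool-and-concatenate step are assumptions made for this example and should not be read as ENDEMIC's actual architecture.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mean_pool(a):
    """Average the rows of a matrix into a single vector."""
    return [sum(col) / len(a) for col in zip(*a)]


def co_attention(endo, exo):
    """Fuse endogenous (n x d) and exogenous (m x d) feature matrices.

    Assumption: affinity is a raw dot product with no learned parameters.
    Returns the fused vector plus both attention maps for inspection.
    """
    affinity = matmul(endo, transpose(exo))                    # n x m
    endo_to_exo = [softmax(r) for r in affinity]               # each endo row attends over exo
    exo_to_endo = [softmax(r) for r in transpose(affinity)]    # each exo row attends over endo
    exo_context = matmul(endo_to_exo, exo)                     # n x d
    endo_context = matmul(exo_to_endo, endo)                   # m x d
    fused = mean_pool(exo_context) + mean_pool(endo_context)   # length 2d
    return fused, endo_to_exo, exo_to_endo


# Hypothetical toy inputs: 2 endogenous and 3 exogenous vectors of dimension 2.
endo = [[1.0, 0.0], [0.0, 1.0]]
exo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, a_en, a_ex = co_attention(endo, exo)
print(len(fused))  # 4 (two pooled 2-d context vectors, concatenated)
```

In a trained model, the affinity would typically pass through learned projection matrices, and the fused vector would feed a classifier. Both are left out here to keep the sketch minimal.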
Related papers
- Enhancing Fake News Detection in Social Media via Label Propagation on Cross-modal Tweet Graph [19.409935976725446]
We present a novel method for detecting fake news in social media.
Our method densifies the graph's connectivity to capture denser interaction better.
We use three publicly available fake news datasets, Twitter, PHEME, and Weibo, for evaluation.
arXiv Detail & Related papers (2024-06-14T09:55:54Z)
- CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster
Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- Big data analysis and distributed deep learning for next-generation
intrusion detection system optimization [0.0]
This paper proposes a solution to detect new threats with higher detection rate and lower false positive than already used IDS.
We achieve those results by using Networking, a deep recurrent neural network: Long Short Term Memory (LSTM) on top of Apache Spark Framework.
We propose a model that describes the network abstract normal behavior from a sequence of millions of packets within their context and analyzes them in near real-time to detect point, collective and contextual anomalies.
arXiv Detail & Related papers (2022-09-28T09:46:16Z)
- Machine Learning-based Automatic Annotation and Detection of COVID-19
Fake News [8.020736472947581]
COVID-19 impacted every part of the world, although the misinformation about the outbreak traveled faster than the virus.
Existing work neglects the presence of bots that act as a catalyst in the spread.
We propose an automated approach for labeling data using verified fact-checked statements on a Twitter dataset.
arXiv Detail & Related papers (2022-09-07T13:55:59Z)
- Mining Fine-grained Semantics via Graph Neural Networks for
Evidence-based Fake News Detection [20.282527436527765]
We propose a unified Graph-based sEmantic sTructure mining framework, namely GET in short.
We model claims and evidences as graph-structured data and capture the long-distance semantic dependency.
After obtaining contextual semantic information, our model reduces information redundancy by performing graph structure learning.
arXiv Detail & Related papers (2022-01-18T11:28:36Z)
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal
Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z)
- Deep Fraud Detection on Non-attributed Graph [61.636677596161235]
Graph Neural Networks (GNNs) have shown solid performance on fraud detection.
Labeled data is scarce in large-scale industrial problems, especially for fraud detection.
We propose a novel graph pre-training strategy to leverage more unlabeled data.
arXiv Detail & Related papers (2021-10-04T03:42:09Z)
- Graph-based Joint Pandemic Concern and Relation Extraction on Twitter [19.7176519744206]
Public concern detection provides potential guidance to the authorities for crisis management before or during a pandemic outbreak.
Detecting concerns in time from the massive volume of information on social media turns out to be a big challenge.
We propose a novel end-to-end deep learning model to identify people's concerns and the corresponding relations.
arXiv Detail & Related papers (2021-06-18T06:06:35Z)
- Anomaly Detection-Based Unknown Face Presentation Attack Detection [74.4918294453537]
Anomaly detection-based spoof attack detection is a recent development in face Presentation Attack Detection.
In this paper, we present a deep-learning solution for anomaly detection-based spoof attack detection.
The proposed approach benefits from the representation learning power of the CNNs and learns better features for fPAD task.
arXiv Detail & Related papers (2020-07-11T21:20:55Z)
- Leveraging Multi-Source Weak Social Supervision for Early Detection of
Fake News [67.53424807783414]
Social media has greatly enabled people to participate in online activities at an unprecedented rate.
This unrestricted access also exacerbates the spread of misinformation and fake news online which might cause confusion and chaos unless being detected early for its mitigation.
We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances.
Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
arXiv Detail & Related papers (2020-04-03T18:26:33Z)