Tell Me Who Your Friends Are: Using Content Sharing Behavior for News
Source Veracity Detection
- URL: http://arxiv.org/abs/2101.10973v1
- Date: Fri, 15 Jan 2021 21:39:51 GMT
- Title: Tell Me Who Your Friends Are: Using Content Sharing Behavior for News
Source Veracity Detection
- Authors: Maurício Gruppi, Benjamin D. Horne, Sibel Adalı
- Abstract summary: We propose a novel and robust news veracity detection model that uses the content sharing behavior of news sources formulated as a network.
We show that state-of-the-art writing style features and CSN features make different mistakes when predicting, meaning that they play complementary roles in the classification task.
- Score: 3.359647717705252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stopping the malicious spread and production of false and misleading news has
become a top priority for researchers. Due to this prevalence, many automated
methods for detecting low quality information have been introduced. The
majority of these methods have used article-level features, such as their
writing style, to detect veracity. While writing style models have been shown
to work well in lab-settings, there are concerns of generalizability and
robustness. In this paper, we begin to address these concerns by proposing a
novel and robust news veracity detection model that uses the content sharing
behavior of news sources formulated as a network. We represent these content
sharing networks (CSN) using a deep walk based method for embedding graphs that
accounts for similarity in both the network space and the article text space.
We show that state-of-the-art writing style features and CSN features make
different mistakes when predicting, meaning that they play complementary roles
in the classification task. Moreover, we show that the addition of CSN features
increases the accuracy of writing style models, boosting accuracy by as much as
14% when using Random Forests. Similarly, we show that the combination of
hand-crafted article-level features and CSN features is robust to concept
drift, performing consistently well over a 10-month time frame.
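The first step of the DeepWalk-style embedding described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the content sharing graph and source names below are hypothetical, and the paper's full method additionally folds article-text similarity into the embedding, which is omitted here.

```python
import random

# Hypothetical toy content sharing network: each source maps to the
# sources it shares content from. Names are for illustration only.
csn = {
    "source_a": ["source_b", "source_c"],
    "source_b": ["source_a"],
    "source_c": ["source_a", "source_d"],
    "source_d": ["source_c"],
}

def random_walks(graph, walks_per_node=10, walk_length=5, seed=42):
    """Generate truncated random walks over the graph, the first step
    of a DeepWalk-style embedding. In the full method these walks would
    feed a skip-gram model to produce node embeddings; that step is
    omitted in this sketch."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph.get(walk[-1], [])
                if not neighbors:
                    break  # dead end: truncate the walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

walks = random_walks(csn)
print(len(walks))   # 4 nodes * 10 walks per node = 40
print(walks[0][0])  # each walk starts at its source node
```

In the paper's pipeline, the embeddings learned from such walks (the CSN features) would then be concatenated with hand-crafted writing-style features and fed to a classifier such as a Random Forest.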
Related papers
- Classification of Non-native Handwritten Characters Using Convolutional Neural Network [0.0]
The classification of English characters written by non-native users is performed by proposing a custom-tailored CNN model.
We train this CNN with a new dataset called the handwritten isolated English character dataset.
The proposed model with five convolutional layers and one hidden layer outperforms state-of-the-art models in terms of character recognition accuracy.
arXiv Detail & Related papers (2024-06-06T21:08:07Z)
- Detection and Discovery of Misinformation Sources using Attributed Webgraphs [3.659498819753633]
We introduce a novel attributed webgraph dataset with labeled news domains and their connections to outlinking and backlinking domains.
We demonstrate the success of graph neural networks in detecting news site reliability using these attributed webgraphs.
We also introduce and evaluate a novel graph-based algorithm for discovering previously unknown misinformation news sources.
arXiv Detail & Related papers (2024-01-04T17:47:36Z) - SCStory: Self-supervised and Continual Online Story Discovery [53.72745249384159]
SCStory helps people digest rapidly published news article streams in real-time without human annotations.
SCStory employs self-supervised and continual learning with a novel idea of story-indicative adaptive modeling of news article streams.
arXiv Detail & Related papers (2023-11-27T04:50:01Z) - Fast and Accurate Factual Inconsistency Detection Over Long Documents [19.86348214462828]
We introduce SCALE, a task-agnostic model for detecting factual inconsistencies using a novel chunking strategy.
This approach achieves state-of-the-art performance in factual inconsistency detection for diverse tasks and long inputs.
We have publicly released our code and data on GitHub.
arXiv Detail & Related papers (2023-10-19T22:55:39Z)
- Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection [50.07850264495737]
"Prompt-and-Align" (P&A) is a novel prompt-based paradigm for few-shot fake news detection.
We show that P&A sets a new state of the art for few-shot fake news detection performance by significant margins.
arXiv Detail & Related papers (2023-09-28T13:19:43Z)
- Unsupervised Story Discovery from Continuous News Streams via Scalable Thematic Embedding [37.62597275581973]
Unsupervised discovery of stories with correlated news articles in real-time helps people digest massive news streams without expensive human annotations.
We propose a novel thematic embedding with an off-the-shelf pretrained sentence encoder to dynamically represent articles and stories.
A thorough evaluation with real news data sets demonstrates that USTORY achieves higher story discovery performance than baselines.
arXiv Detail & Related papers (2023-04-08T20:41:15Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect low-credibility content.
In some cases, insignificant changes to the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification [66.02091763340094]
Like a Good Nearest Neighbor (LaGoNN) is a modification to SetFit that introduces no learnable parameters but alters input text with information from its nearest neighbor.
LaGoNN is effective at flagging undesirable content and at text classification, and it improves the performance of SetFit.
arXiv Detail & Related papers (2023-02-17T15:43:29Z)
- Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of four different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
- Informative Dropout for Robust Representation Learning: A Shape-bias Perspective [84.30946377024297]
We propose a light-weight model-agnostic method, namely Informative Dropout (InfoDrop), to improve interpretability and reduce texture bias.
Specifically, we discriminate texture from shape based on local self-information in an image, and adopt a Dropout-like algorithm to decorrelate the model output from the local texture.
arXiv Detail & Related papers (2020-08-10T16:52:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.