Identifying Disinformation Websites Using Infrastructure Features
- URL: http://arxiv.org/abs/2003.07684v5
- Date: Mon, 28 Sep 2020 22:53:39 GMT
- Title: Identifying Disinformation Websites Using Infrastructure Features
- Authors: Austin Hounsel, Jordan Holland, Ben Kaiser, Kevin Borgolte, Nick
Feamster, Jonathan Mayer
- Abstract summary: We explore a new direction for automated detection of disinformation websites: infrastructure features.
Our hypothesis is that while disinformation websites may be perceptually similar to authentic news websites, there may also be significant non-perceptual differences in the domain registrations, TLS/SSL certificates, and web hosting configurations.
- Score: 11.180267856391362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Platforms have struggled to keep pace with the spread of disinformation.
Current responses like user reports, manual analysis, and third-party fact
checking are slow and difficult to scale, and as a result, disinformation can
spread unchecked for some time after being created. Automation is essential for
enabling platforms to respond rapidly to disinformation. In this work, we
explore a new direction for automated detection of disinformation websites:
infrastructure features. Our hypothesis is that while disinformation websites
may be perceptually similar to authentic news websites, there may also be
significant non-perceptual differences in the domain registrations, TLS/SSL
certificates, and web hosting configurations. Infrastructure features are
particularly valuable for detecting disinformation websites because they are
available before content goes live and reaches readers, enabling early
detection. We demonstrate the feasibility of our approach on a large corpus of
labeled website snapshots. We also present results from a preliminary real-time
deployment, successfully discovering disinformation websites while highlighting
unexplored challenges for automated disinformation detection.
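The abstract names three families of infrastructure features (domain registrations, TLS/SSL certificates, and web hosting configurations) but this summary does not list the concrete features used. As a rough illustration only, the sketch below turns pre-collected metadata into a numeric feature vector. The `SiteRecord` fields, the `infrastructure_features` function, and the chosen features are all hypothetical assumptions for this example, not the authors' feature set.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SiteRecord:
    """Hypothetical, pre-collected infrastructure metadata for one website."""
    domain: str
    registered_at: datetime    # from the domain registration record
    cert_not_before: datetime  # start of the TLS certificate validity period
    cert_not_after: datetime   # end of the TLS certificate validity period
    cert_san_count: int        # number of names listed on the certificate
    shared_ip_domains: int     # other domains observed on the same hosting IP


def infrastructure_features(site: SiteRecord, observed_at: datetime) -> dict[str, float]:
    """Build an illustrative feature vector; the feature choice is an assumption."""
    tld = site.domain.rsplit(".", 1)[-1].lower()
    return {
        # Registration: how old the domain was when it was observed.
        "domain_age_days": float((observed_at - site.registered_at).days),
        "domain_length": float(len(site.domain)),
        "tld_is_com": 1.0 if tld == "com" else 0.0,
        # Certificate: validity period length and number of covered names.
        "cert_lifetime_days": float((site.cert_not_after - site.cert_not_before).days),
        "cert_san_count": float(site.cert_san_count),
        # Hosting: how crowded the hosting IP address is.
        "shared_ip_domains": float(site.shared_ip_domains),
    }


# Made-up record for a placeholder domain, used only to exercise the function.
example = SiteRecord(
    domain="example-news.com",
    registered_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
    cert_not_before=datetime(2020, 1, 2, tzinfo=timezone.utc),
    cert_not_after=datetime(2020, 4, 1, tzinfo=timezone.utc),
    cert_san_count=2,
    shared_ip_domains=40,
)
features = infrastructure_features(example, datetime(2020, 1, 31, tzinfo=timezone.utc))
print(features)
```

A vector like this could be fed to any standard classifier. None of these values depend on page content, which is why such features can be available before a site publishes anything.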
Related papers
- Finding Fake News Websites in the Wild [0.0860395700487494]
We propose a novel methodology for identifying websites responsible for creating and disseminating misinformation content.
We validate our approach on Twitter by examining various execution modes and contexts.
arXiv Detail & Related papers (2024-07-09T18:00:12Z) - Harnessing the Power of Text-image Contrastive Models for Automatic
Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model to explore contrastive learning in the domain of misinformation identification.
Our model shows superior performance at detecting non-matched image-text pairs when the training data is insufficient.
arXiv Detail & Related papers (2023-04-19T02:53:59Z) - PANACEA: An Automated Misinformation Detection System on COVID-19 [49.83321665982157]
PANACEA is a web-based misinformation detection system on COVID-19 related claims.
It has two modules, fact-checking and rumour detection.
arXiv Detail & Related papers (2023-02-28T21:53:48Z) - FNDaaS: Content-agnostic Detection of Fake News sites [3.936965297430477]
We propose FNDaaS, the first automatic, content-agnostic fake news detection method.
It considers new and unstudied features such as network and structural characteristics per news website.
It can achieve an AUC score of up to 0.967 on past sites, and up to 77-92% accuracy on newly-flagged ones.
arXiv Detail & Related papers (2022-12-13T11:17:32Z) - Explaining Website Reliability by Visualizing Hyperlink Connectivity [18.233714306827736]
MisVis is a web-based interactive visualization tool that helps users assess a website's reliability.
MisVis visualizes the hyperlink connectivity of the website and summarizes key characteristics of the Twitter accounts that mention the site.
A large-scale user study with 139 participants demonstrates that MisVis helps users assess and understand false information on the web.
arXiv Detail & Related papers (2022-10-01T01:39:08Z) - DISCO: Comprehensive and Explainable Disinformation Detection [71.5283511752544]
We propose a comprehensive and explainable disinformation detection framework called DISCO.
We demonstrate DISCO on a real-world fake news detection task with satisfactory detection accuracy and explanation.
We expect that our demo could pave the way for addressing the limitations of identification, comprehension, and explainability as a whole.
arXiv Detail & Related papers (2022-03-09T18:17:25Z) - Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces.
Our results highlight that the agent is still capable of learning in small state spaces even in presence of non-deterministic encryption, but performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z) - Explainable Patterns: Going from Findings to Insights to Support Data
Analytics Democratization [60.18814584837969]
We present Explainable Patterns (ExPatt), a new framework to support lay users in exploring and creating data storytellings.
ExPatt automatically generates plausible explanations for observed or selected findings using an external (textual) source of information.
arXiv Detail & Related papers (2021-01-19T16:13:44Z) - Linked Credibility Reviews for Explainable Misinformation Detection [1.713291434132985]
We propose an architecture based on a core concept of Credibility Reviews (CRs) that can be used to build networks of distributed bots that collaborate for misinformation detection.
CRs serve as building blocks to compose graphs of (i) web content, (ii) existing credibility signals --fact-checked claims and reputation reviews of websites--, and (iii) automatically computed reviews.
We implement this architecture on top of lightweight extensions to Schema.org and services providing generic NLP tasks for semantic similarity and stance detection.
arXiv Detail & Related papers (2020-08-28T16:55:43Z) - Sensitive Information Detection: Recursive Neural Networks for Encoding
Context [0.20305676256390928]
The leak of sensitive information can be very costly.
We show that simplistic, brittle rule sets for detecting sensitive information only find a small fraction of the actual sensitive information.
We develop a novel family of sensitive information detection approaches which only assumes access to labeled examples.
arXiv Detail & Related papers (2020-08-25T07:49:46Z) - Leveraging Multi-Source Weak Social Supervision for Early Detection of
Fake News [67.53424807783414]
Social media has greatly enabled people to participate in online activities at an unprecedented rate.
This unrestricted access also exacerbates the spread of misinformation and fake news online, which might cause confusion and chaos unless it is detected early and mitigated.
We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances.
Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
arXiv Detail & Related papers (2020-04-03T18:26:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.