Wide-AdGraph: Detecting Ad Trackers with a Wide Dependency Chain Graph
- URL: http://arxiv.org/abs/2004.14826v2
- Date: Mon, 10 May 2021 11:43:46 GMT
- Title: Wide-AdGraph: Detecting Ad Trackers with a Wide Dependency Chain Graph
- Authors: Amir Hossein Kargaran, Mohammad Sadegh Akhondzadeh, Mohammad Reza Heidarpour, Mohammad Hossein Manshaei, Kave Salamatian, Masoud Nejad Sattary
- Abstract summary: Websites use third-party ads and tracking services to deliver targeted ads and collect information about users that visit them.
Most of the blocking solutions rely on crowd-sourced filter lists manually maintained by a large community of users.
In this work, we seek to simplify the update of these filter lists by combining different websites through a large-scale graph.
- Score: 0.2761244786307778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Websites use third-party ads and tracking services to deliver targeted ads
and collect information about users that visit them. These services put users'
privacy at risk, and that is why users' demand for blocking these services is
growing. Most of the blocking solutions rely on crowd-sourced filter lists
manually maintained by a large community of users. In this work, we seek to
simplify the update of these filter lists by combining different websites
through a large-scale graph connecting all resource requests made over a large
set of sites. The features of this graph are extracted and used to train a
machine learning algorithm with the aim of detecting ads and tracking
resources. Because our approach combines different information sources, it is more
robust against evasion techniques that rely on obfuscation or on changing usage
patterns. We evaluate our work over the Alexa top-10K websites and find its
accuracy to be 96.1% biased and 90.9% unbiased with high precision and recall.
It can also block new ads and tracking services that would otherwise have to be
added to existing crowd-sourced filter lists. Moreover, the approach
followed in this paper sheds light on the ecosystem of third-party tracking and
advertising.
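The graph-and-features idea in the abstract can be illustrated with a minimal sketch. It assumes crawl data is available as (site, parent resource, child resource) records; the record layout, the example URLs, the function names, and the three features below are all hypothetical and are not the authors' actual feature set or pipeline. The point is only that merging request chains from many sites into one graph exposes cross-site signals that a per-site view would miss.

```python
from collections import defaultdict

# Hypothetical crawl records: (site, parent_resource, child_resource).
# Each record means: while loading `site`, `parent_resource` triggered
# a request for `child_resource`.
CRAWL = [
    ("news.example", "news.example/index.html", "cdn.example/lib.js"),
    ("news.example", "cdn.example/lib.js", "tracker.example/pixel.gif"),
    ("shop.example", "shop.example/index.html", "tracker.example/pixel.gif"),
    ("shop.example", "shop.example/index.html", "shop.example/style.css"),
]


def build_graph(records):
    """Merge the request chains of all sites into one directed graph."""
    children = defaultdict(set)  # resource -> resources it requested
    parents = defaultdict(set)   # resource -> resources that requested it
    sites = defaultdict(set)     # resource -> sites on which it appeared
    for site, parent, child in records:
        children[parent].add(child)
        parents[child].add(parent)
        sites[parent].add(site)
        sites[child].add(site)
    return children, parents, sites


def node_features(records):
    """Compute simple illustrative per-resource features from the graph."""
    children, parents, sites = build_graph(records)
    nodes = set(children) | set(parents)
    return {
        node: {
            "in_degree": len(parents[node]),    # distinct requesters
            "out_degree": len(children[node]),  # distinct resources requested
            "site_count": len(sites[node]),     # distinct sites using it
        }
        for node in sorted(nodes)
    }


features = node_features(CRAWL)
for node, feats in features.items():
    print(node, feats)
```

In this toy data, tracker.example/pixel.gif is the only resource that appears on two sites, which is the kind of cross-site signal a combined graph makes visible. Feature vectors like these would then be paired with labels (for example, from existing filter lists) and passed to a classifier; that training step is left out here.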
Related papers
- Cluster-Aware Attacks on Graph Watermarks [50.19105800063768]
We introduce a cluster-aware threat model in which adversaries apply community-guided modifications to evade detection.
Our results show that cluster-aware attacks can reduce attribution accuracy by up to 80% more than random baselines.
We propose a lightweight embedding enhancement that distributes watermark nodes across graph communities.
arXiv Detail & Related papers (2025-04-24T22:49:28Z)
- PURL: Safe and Effective Sanitization of Link Decoration [20.03929841111819]
We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration.
Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage.
arXiv Detail & Related papers (2023-08-07T09:08:39Z)
- Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423]
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
arXiv Detail & Related papers (2023-04-06T05:20:16Z)
- Graph Filters for Signal Processing and Machine Learning on Graphs [83.29608206147515]
We provide a comprehensive overview of graph filters, including the different filtering categories, design strategies for each type, and trade-offs between different types of graph filters.
We discuss how to extend graph filters into filter banks and graph neural networks to enhance the representational power.
Our aim is for this article to provide a unifying framework and a common understanding for both beginner and experienced researchers.
arXiv Detail & Related papers (2022-11-16T11:56:45Z)
- An Adversarial Attack Analysis on Malicious Advertisement URL Detection Framework [22.259444589459513]
Malicious advertisement URLs pose a security risk since they are the source of cyber-attacks.
Existing malicious URL detection techniques are limited in their ability to handle unseen features and to generalize to test data.
In this study, we extract a novel set of lexical and web-scraped features and employ machine learning techniques to set up a system for detecting fraudulent advertisement URLs.
arXiv Detail & Related papers (2022-04-27T20:06:22Z)
- Classification of URL bitstreams using Bag of Bytes [3.2204506933585026]
In this paper, we apply a mechanical approach to generate feature vectors from URL strings.
Our approach achieved 23% better accuracy compared to the existing DL-based approach.
arXiv Detail & Related papers (2021-11-11T07:43:45Z)
- Masked LARk: Masked Learning, Aggregation and Reporting worKflow [6.484847460164177]
Many web advertising data flows involve passive cross-site tracking of users.
Most browsers are moving towards removing third-party cookies (3PC) in subsequent releases.
We introduce a new proposal, called Masked LARk, for aggregation of user engagement measurement and model training.
arXiv Detail & Related papers (2021-10-27T21:59:37Z)
- A machine learning approach for detecting CNAME cloaking-based tracking on the Web [2.7267622401439255]
We propose a supervised learning-based method to detect CNAME cloaking-based tracking without the on-demand DNS lookup API.
Our goal is to detect both sites and requests linked to cloaking-related tracking.
Our evaluation shows that the proposed approach outperforms well-known tracking filter lists.
arXiv Detail & Related papers (2020-09-29T22:33:19Z)
- Similarity Search for Efficient Active Learning and Search of Rare Concepts [78.5475382904847]
We improve the computational efficiency of active learning and search methods by restricting the candidate pool for labeling to the nearest neighbors of the currently labeled set.
Our approach achieved mean average precision and recall similar to those of the traditional global approach while reducing the computational cost of selection by up to three orders of magnitude, thus enabling web-scale active learning.
arXiv Detail & Related papers (2020-06-30T19:46:10Z)
- Keystroke Biometrics in Response to Fake News Propagation in a Global Pandemic [77.79066811371978]
This work proposes and analyzes the use of keystroke biometrics for content de-anonymization.
Fake news has become a powerful tool to manipulate public opinion, especially during major events.
arXiv Detail & Related papers (2020-05-15T17:56:11Z)
- Stealing Links from Graph Neural Networks [72.85344230133248]
Recently, neural networks were extended to graph data; the resulting models are known as graph neural networks (GNNs).
Due to their superior performance, GNNs have many applications, such as healthcare analytics, recommender systems, and fraud detection.
We propose the first attacks to steal a graph from the outputs of a GNN model that is trained on the graph.
arXiv Detail & Related papers (2020-05-05T13:22:35Z)
- Adversarial Attack on Community Detection by Hiding Individuals [68.76889102470203]
We focus on black-box attacks and aim to hide targeted individuals from detection by deep graph community detection models.
We propose an iterative learning framework that takes turns to update two modules: one working as the constrained graph generator and the other as the surrogate community detection model.
arXiv Detail & Related papers (2020-01-22T09:50:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.