Related papers: A machine learning approach for detecting CNAME cloaking-based tracking on the Web

URL: http://arxiv.org/abs/2009.14330v1
Date: Tue, 29 Sep 2020 22:33:19 GMT
Title: A machine learning approach for detecting CNAME cloaking-based tracking on the Web
Authors: Ha Dao, Kensuke Fukuda
Abstract summary: We propose a supervised learning-based method to detect CNAME cloaking-based tracking without the on-demand DNS lookup API. Our goal is to detect both sites and requests linked to CNAME cloaking-related tracking. Our evaluation shows that the proposed approach outperforms well-known tracking filter lists.
Score: 2.7267622401439255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Various in-browser privacy protection techniques have been designed to protect end-users from third-party tracking. In an arms race against these counter-measures, the tracking providers developed a new technique called CNAME cloaking-based tracking to avoid issues with browsers that block third-party cookies and requests. To detect this tracking technique, browser extensions require on-demand DNS lookup APIs. This feature is, however, only supported by the Firefox browser. In this paper, we propose a supervised machine learning-based method to detect CNAME cloaking-based tracking without the on-demand DNS lookup. Our goal is to detect both sites and requests linked to CNAME cloaking-related tracking. We crawl a list of target sites and store all HTTP/HTTPS requests with their attributes. Then we label all instances automatically by looking up the CNAME record of each subdomain and applying wildcard matching based on well-known tracking filter lists. After extracting features, we build a supervised classification model to distinguish sites and requests related to CNAME cloaking-based tracking. Our evaluation shows that the proposed approach outperforms well-known tracking filter lists: F1 scores of 0.790 for sites and 0.885 for requests. By analyzing the feature permutation importance, we demonstrate that the number of scripts and the proportion of XMLHttpRequests are discriminative for detecting sites, and the length of the request URL is helpful in detecting requests. Finally, we analyze concept drift by using the 2018 dataset to train a model and obtain a reasonable performance on the 2020 dataset for detecting both sites and requests using CNAME cloaking-based tracking.
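The automatic labeling step described in the abstract (look up the CNAME record of a subdomain, then wildcard-match the result against tracking filter lists) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the filter patterns, the pre-resolved CNAME table, the `registrable` helper, and all function names are hypothetical stand-ins. A real pipeline would query DNS, use the Public Suffix List, and parse actual filter-list syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard patterns; real filter lists use a richer rule syntax.
TRACKER_PATTERNS = ["*.tracker.example", "collect.*.example.net"]

# Hypothetical pre-resolved CNAME records standing in for live DNS lookups.
CNAME_RECORDS = {
    "metrics.shop.example": "shop.collect.tracker.example",
    "www.shop.example": "cdn.shop.example",
}


def registrable(host):
    """Naive stand-in for a Public Suffix List lookup: keep the last two labels."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def cname_chain(host, records, max_depth=10):
    """Follow CNAME records from host, stopping on loops or after max_depth hops."""
    chain, seen, current = [], {host}, host
    while current in records and len(chain) < max_depth:
        current = records[current].lower()
        if current in seen:
            break
        seen.add(current)
        chain.append(current)
    return chain


def is_cname_cloaked_tracking(request_host, site_host,
                              records=CNAME_RECORDS, patterns=TRACKER_PATTERNS):
    """Label a request: first-party subdomain whose CNAME points at a listed tracker."""
    # Cloaking only applies to subdomains that look first-party.
    if registrable(request_host) != registrable(site_host):
        return False
    for target in cname_chain(request_host.lower(), records):
        is_third_party = registrable(target) != registrable(site_host)
        if is_third_party and any(fnmatchcase(target, p) for p in patterns):
            return True
    return False
```

In this sketch, a request to `metrics.shop.example` made from `shop.example` is labeled as cloaked tracking because its CNAME target matches `*.tracker.example`, while `www.shop.example` is not, since its CNAME stays within the same site.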

Related papers

CRATOR: a Dark Web Crawler [1.7224362150588657]
This study proposes a general dark web crawler designed to extract pages handling security protocols, such as captchas. Our approach uses a combination of seed URL lists, link analysis, and scanning to discover new content.
arXiv Detail & Related papers (2024-05-10T09:39:12Z)
Beyond the Request: Harnessing HTTP Response Headers for Cross-Browser Web Tracker Classification in an Imbalanced Setting [0.0]
This study endeavors to design effective machine learning classifiers for web tracker detection using binarized HTTP response headers. Ten supervised models were trained on Chrome data and tested across all browsers, including a Chrome dataset from a year later. Results demonstrated high accuracy, F1-score, precision, recall, and minimal log-loss error for Chrome and Firefox, but subpar performance on Brave.
arXiv Detail & Related papers (2024-02-02T09:07:09Z)
Detection of Malicious DNS-over-HTTPS Traffic: An Anomaly Detection Approach using Autoencoders [0.0]
We design an autoencoder that is capable of detecting malicious DNS traffic by only observing the encrypted DoH traffic. We find that our proposed autoencoder achieves the highest detection performance, with a median F1 score of 99% over several types of malicious traffic.
arXiv Detail & Related papers (2023-10-17T15:03:37Z)
The Key to Deobfuscation is Pattern of Life, not Overcoming Encryption [0.7124736158080939]
We present a novel methodology that is effective at deobfuscating sources by synthesizing measurements from key locations along protocol transaction paths. Our approach links online personas with their origin IP addresses based on a Pattern of Life (PoL) analysis. We show that, when monitoring in the correct places on the Internet, DNS over HTTPS (DoH) and DNS over TLS (DoT) can be deobfuscated with up to 100% accuracy.
arXiv Detail & Related papers (2023-10-04T02:34:29Z)
PURL: Safe and Effective Sanitization of Link Decoration [20.03929841111819]
We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration. Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage.
arXiv Detail & Related papers (2023-08-07T09:08:39Z)
OmniTracker: Unifying Object Tracking by Tracking-with-Detection [119.51012668709502]
OmniTracker is presented to resolve all the tracking tasks with a fully shared network architecture, model weights, and inference pipeline. Experiments on 7 tracking datasets, including LaSOT, TrackingNet, DAVIS16-17, MOT17, MOTS20, and YTVIS19, demonstrate that OmniTracker achieves on-par or even better results than both task-specific and unified tracking models.
arXiv Detail & Related papers (2023-03-21T17:59:57Z)
Real-time Online Multi-Object Tracking in Compressed Domain [66.40326768209]
Recent online Multi-Object Tracking (MOT) methods have achieved desirable tracking performance. Inspired by the fact that the adjacent frames are highly relevant and redundant, we divide the frames into key and non-key frames. Our tracker is about 6x faster while maintaining a comparable tracking performance.
arXiv Detail & Related papers (2022-04-05T09:47:24Z)
Tracking by Joint Local and Global Search: A Target-aware Attention based Approach [63.50045332644818]
We propose a novel target-aware attention mechanism (termed TANet) to conduct joint local and global search for robust tracking. Specifically, we extract the features of target object patch and continuous video frames, then we track and feed them into a decoder network to generate target-aware global attention maps. In the tracking procedure, we integrate the target-aware attention with multiple trackers by exploring candidate search regions for robust tracking.
arXiv Detail & Related papers (2021-06-09T06:54:15Z)
Track to Detect and Segment: An Online Multi-Object Tracker [81.15608245513208]
TraDeS is an online joint detection and tracking model, exploiting tracking clues to assist detection end-to-end. TraDeS infers object tracking offset by a cost volume, which is used to propagate previous object features.
arXiv Detail & Related papers (2021-03-16T02:34:06Z)
Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking [102.31092931373232]
We propose a simple online model named Chained-Tracker (CTracker), which naturally integrates all three subtasks into an end-to-end solution. The two major novelties, chained structure and paired attentive regression, make CTracker simple, fast, and effective.
arXiv Detail & Related papers (2020-07-29T02:38:49Z)
Tracking by Instance Detection: A Meta-Learning Approach [99.66119903655711]
We propose a principled three-step approach to build a high-performance tracker. We build two trackers, named Retina-MAML and FCOS-MAML, based on two modern detectors RetinaNet and FCOS. Both trackers run in real-time at 40 FPS.
arXiv Detail & Related papers (2020-04-02T05:55:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.