Towards Fine-Grained Webpage Fingerprinting at Scale
- URL: http://arxiv.org/abs/2409.04341v1
- Date: Fri, 6 Sep 2024 15:21:00 GMT
- Title: Towards Fine-Grained Webpage Fingerprinting at Scale
- Authors: Xiyuan Zhao, Xinhao Deng, Qi Li, Yunpeng Liu, Zhuotao Liu, Kun Sun, Ke Xu,
- Abstract summary: Website Fingerprinting (WF) attacks can effectively identify the websites visited by Tor clients via analyzing encrypted traffic patterns.
Existing attacks focus on identifying different websites, but their accuracy dramatically decreases when applied to identify fine-grained webpages.
We propose Oscar, a WPF attack based on multi-label metric learning that identifies different webpages from obfuscated traffic by transforming the feature space.
- Score: 18.201489295361892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Website Fingerprinting (WF) attacks can effectively identify the websites visited by Tor clients via analyzing encrypted traffic patterns. Existing attacks focus on identifying different websites, but their accuracy dramatically decreases when applied to identify fine-grained webpages, especially when distinguishing among different subpages of the same website. WebPage Fingerprinting (WPF) attacks face the challenges of highly similar traffic patterns and a much larger scale of webpages. Furthermore, clients often visit multiple webpages concurrently, increasing the difficulty of extracting the traffic patterns of each webpage from the obfuscated traffic. In this paper, we propose Oscar, a WPF attack based on multi-label metric learning that identifies different webpages from obfuscated traffic by transforming the feature space. Oscar can extract the subtle differences among various webpages, even those with similar traffic patterns. In particular, Oscar combines proxy-based and sample-based metric learning losses to extract webpage features from obfuscated traffic and identify multiple webpages. We prototype Oscar and evaluate its performance using traffic collected from 1,000 monitored webpages and over 9,000 unmonitored webpages in the real world. Oscar demonstrates an 88.6% improvement in the multi-label metric Recall@5 compared to the state-of-the-art attacks.
Related papers
- Unveiling the Digital Fingerprints: Analysis of Internet attacks based on website fingerprints [0.0]
We show that using the newest machine learning algorithms an attacker can deanonymize Tor traffic by applying such techniques.
We capture network packets across 11 days, while users navigate specific web pages, recording data in.pcapng format through the Wireshark network capture tool.
arXiv Detail & Related papers (2024-09-01T18:44:40Z) - Seamless Website Fingerprinting in Multiple Environments [4.226243782049956]
Website fingerprinting (WF) attacks identify the websites visited over anonymized connections.
We introduce a new approach that classifies entire websites rather than individual web pages.
Our Convolutional Neural Network (CNN) uses only the jitter and size of 500 contiguous packets from any point in a TCP stream.
arXiv Detail & Related papers (2024-07-28T02:18:30Z) - EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection [53.25863925815954]
Federated self-supervised learning (FSSL) has emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data.
While FSSL offers advantages, its susceptibility to backdoor attacks has not been investigated.
We propose the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models.
arXiv Detail & Related papers (2024-05-21T06:14:49Z) - AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation [54.17246674188208]
Web scraping is a powerful technique that extracts data from websites, enabling automated data collection, enhancing data analysis capabilities, and minimizing manual data entry efforts.
Existing methods, wrappers-based methods suffer from limited adaptability and scalability when faced with a new website.
We introduce the paradigm of generating web scrapers with large language models (LLMs) and propose AutoScraper, a two-stage framework that can handle diverse and changing web environments more efficiently.
arXiv Detail & Related papers (2024-04-19T09:59:44Z) - PRAT: PRofiling Adversarial aTtacks [52.693011665938734]
We introduce a novel problem of PRofiling Adversarial aTtacks (PRAT)
Given an adversarial example, the objective of PRAT is to identify the attack used to generate it.
We use AID to devise a novel framework for the PRAT objective.
arXiv Detail & Related papers (2023-09-20T07:42:51Z) - PURL: Safe and Effective Sanitization of Link Decoration [20.03929841111819]
We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration.
Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage.
arXiv Detail & Related papers (2023-08-07T09:08:39Z) - Efficient and Low Overhead Website Fingerprinting Attacks and Defenses
based on TCP/IP Traffic [16.6602652644935]
Website fingerprinting attacks based on machine learning and deep learning tend to use the most typical features to achieve a satisfactory performance of attacking rate.
To defend against such attacks, random packet defense (RPD) with a high cost of excessive network overhead is usually applied.
We propose a filter-assisted attack against RPD, which can filter out the injected noises using the statistical characteristics of TCP/IP traffic.
We further improve the list-based defense by a traffic splitting mechanism, which can combat the mentioned attacks as well as save a considerable amount of network overhead.
arXiv Detail & Related papers (2023-02-27T13:45:15Z) - Poisoning Web-Scale Training Datasets is Practical [73.34964403079775]
We introduce two new dataset poisoning attacks that intentionally introduce malicious examples to a model's performance.
First attack, split-view poisoning, exploits the mutable nature of internet content to ensure a dataset annotator's initial view of the dataset differs from the view downloaded by subsequent clients.
Second attack, frontrunning poisoning, targets web-scale datasets that periodically snapshot crowd-sourced content.
arXiv Detail & Related papers (2023-02-20T18:30:54Z) - Detecting Adversarial Examples by Input Transformations, Defense
Perturbations, and Voting [71.57324258813674]
convolutional neural networks (CNNs) have proved to reach super-human performance in visual recognition tasks.
CNNs can easily be fooled by adversarial examples, i.e., maliciously-crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology.
arXiv Detail & Related papers (2021-01-27T14:50:41Z) - Adaptive Webpage Fingerprinting from TLS Traces [13.009834690757614]
In webpage fingerprinting, an adversary infers the specific webpage loaded by a victim user by analysing the patterns in the encrypted TLS traffic exchanged between the user's browser and the website's servers.
This work studies modern webpage fingerprinting adversaries against the TLS protocol.
We introduce a TLS-specific model that: 1) scales to an unprecedented number of target webpages, 2) can accurately classify thousands of classes it never encountered during training, and 3) has low operational costs even in scenarios of frequent page updates.
arXiv Detail & Related papers (2020-10-19T15:13:07Z) - Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.