Quantifying Mix Network Privacy Erosion with Generative Models
- URL: http://arxiv.org/abs/2506.08918v1
- Date: Tue, 10 Jun 2025 15:43:39 GMT
- Title: Quantifying Mix Network Privacy Erosion with Generative Models
- Authors: Vasilios Mavroudis, Tariq Elahi
- Abstract summary: This work uses a generative model trained on mixnet traffic to estimate the loss of privacy when users communicate persistently over a period of time. Our findings reveal notable differences in privacy levels among mix strategies, even when they have similar mean latencies.
- Score: 0.3683202928838613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern mix networks improve over Tor and provide stronger privacy guarantees by robustly obfuscating metadata. As long as a message is routed through at least one honest mixnode, the privacy of the users involved is safeguarded. However, the complexity of the mixing mechanisms makes it difficult to estimate the cumulative privacy erosion occurring over time. This work uses a generative model trained on mixnet traffic to estimate the loss of privacy when users communicate persistently over a period of time. We train our large language model from scratch on our specialized network traffic "language" and then use it to measure the sender-message unlinkability in various settings (e.g., mixing strategies, security parameters, observation window). Our findings reveal notable differences in privacy levels among mix strategies, even when they have similar mean latencies. In comparison, we demonstrate the limitations of traditional privacy metrics, such as entropy and log-likelihood, in fully capturing an adversary's potential to synthesize information from multiple observations. Finally, we show that larger models exhibit greater sample efficiency and superior capabilities, implying that further advancements in transformers will consequently enhance the accuracy of model-based privacy estimates.
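As a rough illustration of the measurement idea (not the authors' implementation), the sketch below scores an observation window of tokenized mixnet events under each candidate sender hypothesis with a small causal transformer standing in for the paper's traffic-trained LLM, then reports the normalized entropy of the resulting posterior as a sender-message unlinkability estimate. The token scheme, class names, and hyperparameters are assumptions, and training on synthetic traces is omitted.

```python
# Illustrative sketch only: a toy causal sequence model over mixnet "traffic
# tokens" and an entropy-based sender-message unlinkability estimate.
# The token scheme and all names are assumptions, not the paper's code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 32   # e.g. event types x coarse timing buckets (assumed encoding)
CTX = 64     # observation-window length in tokens (assumed)

class TrafficLM(nn.Module):
    """Small decoder-only transformer standing in for the paper's LLM."""
    def __init__(self, vocab=VOCAB, d_model=64, nhead=4, layers=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(CTX, d_model)
        block = nn.TransformerEncoderLayer(d_model, nhead, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, x):                       # x: (batch, seq) of token ids
        t = torch.arange(x.size(1), device=x.device)
        h = self.emb(x) + self.pos(t)
        causal = torch.triu(torch.full((x.size(1), x.size(1)), float("-inf"),
                                       device=x.device), diagonal=1)
        return self.head(self.encoder(h, mask=causal))   # (batch, seq, vocab)

def sequence_logprob(model, seq):
    """Log-likelihood of a token sequence under the model (teacher forcing)."""
    with torch.no_grad():
        logp = F.log_softmax(model(seq[:, :-1]), dim=-1)
        tok_lp = logp.gather(-1, seq[:, 1:].unsqueeze(-1)).squeeze(-1)
        return tok_lp.sum(dim=-1)               # (batch,)

def unlinkability(model, candidate_traces):
    """Posterior over candidate senders for one target message (uniform prior),
    plus its normalized entropy: 1.0 means perfect sender-message unlinkability."""
    lls = torch.stack([sequence_logprob(model, t) for t in candidate_traces]).squeeze()
    post = F.softmax(lls, dim=0)
    ent = -(post * post.clamp_min(1e-12).log()).sum()
    return post, (ent / math.log(len(candidate_traces))).item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TrafficLM()   # untrained here; the paper trains from scratch on traffic
    # One hypothetical trace per candidate sender: the observed window re-encoded
    # under the hypothesis "sender i originated the target message" (assumed setup).
    traces = [torch.randint(0, VOCAB, (1, CTX)) for _ in range(5)]
    posterior, u = unlinkability(model, traces)
    print(posterior, u)
```

In this toy, a normalized entropy near 1.0 means the adversary's posterior over candidate senders stays close to uniform; the paper's measurements instead aggregate evidence from persistent communication over long observation windows, which is where the cumulative erosion appears.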
Related papers
- Segmented Private Data Aggregation in the Multi-message Shuffle Model [9.298982907061099]
We pioneer the study of segmented private data aggregation within the multi-message shuffle model of differential privacy. Our framework introduces flexible privacy protection for users and enhanced utility for the aggregation server. Our framework achieves a reduction of about 50% in estimation error compared to existing approaches.
arXiv Detail & Related papers (2024-07-29T01:46:44Z) - PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data. The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates. We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection while sacrificing training accuracy.
In this work, we aim to minimize both privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z) - Privacy-Preserving Distributed Expectation Maximization for Gaussian Mixture Model using Subspace Perturbation [4.2698418800007865]
Federated learning is motivated by privacy concerns, as it transmits only intermediate updates rather than private data.
We propose a fully decentralized privacy-preserving solution, which is able to securely compute the updates in each step.
Numerical validation shows that the proposed approach outperforms the existing approach in terms of both accuracy and privacy level.
arXiv Detail & Related papers (2022-09-16T09:58:03Z) - On the Privacy Effect of Data Enhancement via the Lens of Memorization [20.63044895680223]
We propose to investigate privacy from a new perspective called memorization.
Through the lens of memorization, we find that previously deployed MIAs produce misleading results as they are less likely to identify samples with higher privacy risks.
We demonstrate that the generalization gap and privacy leakage are less correlated than previous results suggest.
arXiv Detail & Related papers (2022-08-17T13:02:17Z) - You Are What You Write: Preserving Privacy in the Era of Large Language Models [2.3431670397288005]
We present an empirical investigation into the extent of the personal information encoded into pre-trained representations by a range of popular models.
We show that model complexity and the amount of data used in pre-training correlate positively with data leakage.
arXiv Detail & Related papers (2022-04-20T11:12:53Z) - Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural network training, such as gradient clipping and noise addition, affect the robustness of the model (a generic sketch of this clipping-plus-noise update appears after this list).
arXiv Detail & Related papers (2020-12-14T18:59:24Z) - InfoScrub: Towards Attribute Privacy by Targeted Obfuscation [77.49428268918703]
We study techniques that allow individuals to limit the private information leaked in visual data.
We tackle this problem in a novel image obfuscation framework.
We find our approach generates obfuscated images faithful to the original input images, and additionally increases uncertainty by 6.2$\times$ (or up to 0.85 bits) over the non-obfuscated counterparts.
arXiv Detail & Related papers (2020-05-20T19:48:04Z)
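The "Robustness Threats of Differential Privacy" entry above names gradient clipping and noise addition as the main ingredients of differentially private training. The following is a minimal, generic sketch of that DP-SGD-style update (per-sample clipping followed by calibrated Gaussian noise), assuming a placeholder linear model and toy data; it is not the code of any listed paper.

```python
# Generic DP-SGD-style update: per-sample gradient clipping + Gaussian noise.
# Placeholder model, data, and hyperparameters; a sketch, not a hardened library.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 2)                      # placeholder model
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_mult, lr = 1.0, 1.1, 0.1     # assumed hyperparameters

x = torch.randn(8, 10)                        # toy batch
y = torch.randint(0, 2, (8,))

# Accumulate per-sample gradients, each clipped to L2 norm <= clip_norm.
summed = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(x, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = (clip_norm / (total + 1e-12)).clamp(max=1.0)
    for s, g in zip(summed, grads):
        s.add_(g * scale)

# Add noise with std = noise_mult * clip_norm, average, and take an SGD step.
with torch.no_grad():
    for p, s in zip(model.parameters(), summed):
        noisy = s + torch.randn_like(s) * noise_mult * clip_norm
        p.add_(-lr * noisy / len(x))
```

Varying clip_norm and noise_mult in such an update is exactly the kind of knob that entry studies for its effect on robustness.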
This list is automatically generated from the titles and abstracts of the papers on this site.