Masked LARk: Masked Learning, Aggregation and Reporting worKflow
- URL: http://arxiv.org/abs/2110.14794v1
- Date: Wed, 27 Oct 2021 21:59:37 GMT
- Title: Masked LARk: Masked Learning, Aggregation and Reporting worKflow
- Authors: Joseph J. Pfeiffer III, Denis Charles, Davis Gilton, Young Hun Jung, Mehul Parsana and Erik Anderson
- Abstract summary: Many web advertising data flows involve passive cross-site tracking of users.
Most browsers are moving towards removal of third-party cookies (3PC) in subsequent iterations.
We present a new proposal, called Masked LARk, for aggregation of user engagement measurement and model training.
- Score: 6.484847460164177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Today, many web advertising data flows involve passive cross-site tracking of
users. Enabling such a mechanism through the usage of third party tracking
cookies (3PC) exposes sensitive user data to a large number of parties, with
little oversight on how that data can be used. Thus, most browsers are moving
towards removal of 3PC in subsequent browser iterations. In order to
substantially improve end-user privacy while allowing sites to continue to
sustain their business through ad funding, new privacy-preserving primitives
need to be introduced.
In this paper, we discuss a new proposal, called Masked LARk, for aggregation
of user engagement measurement and model training that prevents cross-site
tracking, while remaining (a) flexible, for engineering development and
maintenance, (b) secure, in the sense that cross-site tracking and tracing are
blocked and (c) open for continued model development and training, allowing
advertisers to serve relevant ads to interested users. We introduce a secure
multi-party computation (MPC) protocol that utilizes "helper" parties to train
models, so that once data leaves the browser, no downstream system can
individually construct a complete picture of the user activity. For training,
our key innovation is the use of masking, i.e., the obfuscation of the
true labels, while still allowing a gradient to be accurately computed in
aggregate over a batch of data. Our protocol uses only lightweight cryptography,
at such a level that an interested yet inexperienced reader can understand the
core algorithm. We develop helper endpoints that implement this system, and
give example usage of training in PyTorch.
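To make the masking idea concrete, the following is a minimal PyTorch sketch, not the paper's helper implementation: it assumes a loss that is linear in the label (cross-entropy against one-hot labels), so gradients computed against two additive secret shares of the label sum exactly to the gradient on the true label. The toy model, the data, and the two-helper split shown here are illustrative.

```python
# A minimal sketch of gradient computation over masked labels, assuming a
# loss that is linear in the label (cross-entropy against one-hot labels).
# This is an illustration of the principle, not the paper's helper code.
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 3)   # toy model; both helpers hold the same weights
x = torch.randn(8, 4)           # a batch of feature vectors
y = torch.nn.functional.one_hot(torch.randint(0, 3, (8,)), 3).float()

# Browser-side masking: split each label into two additive secret shares,
# y = share_a + share_b, so neither helper alone sees the true label.
r = torch.randn_like(y)
share_a, share_b = r, y - r

def grad_on_share(share):
    """Gradient of the label-linear cross-entropy w.r.t. the model
    parameters, computed against a single label share."""
    model.zero_grad()
    log_p = torch.log_softmax(model(x), dim=1)
    (-(share * log_p).sum()).backward()
    return [p.grad.clone() for p in model.parameters()]

g_a = grad_on_share(share_a)    # helper A's contribution
g_b = grad_on_share(share_b)    # helper B's contribution

# Reference: the gradient on the unmasked labels.
model.zero_grad()
(-(y * torch.log_softmax(model(x), dim=1)).sum()).backward()

for ga, gb, p in zip(g_a, g_b, model.parameters()):
    assert torch.allclose(ga + gb, p.grad, atol=1e-4)
print("sum of per-share gradients equals the gradient on the true labels")
```

Because the loss, and hence backpropagation, is linear in the label, each helper computes its gradient against what looks like pure noise from its point of view, yet the two contributions sum to the true batch gradient.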
Related papers
- FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting [10.671588861099323] (2023-11-28)
Browser fingerprinting provides an attractive alternative to third-party cookies for tracking users across the web.
Previous work proposed several techniques to detect its prevalence and severity.
We present FP-Fed, the first distributed system for browser fingerprinting detection.
- Privacy Side Channels in Machine Learning Systems [87.53240071195168] (2023-09-11)
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
- PURL: Safe and Effective Sanitization of Link Decoration [20.03929841111819] (2023-08-07)
We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration.
Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage.
- Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423] (2023-04-06)
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
- Privacy-Preserving Online Content Moderation: A Federated Learning Use Case [3.1925030748447747] (2022-09-23)
Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices.
We propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP).
We show that the proposed FL framework's performance can be close to that of the centralized approach, for both the DP and non-DP FL versions.
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725] (2021-12-26)
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for speech emotion recognition (SER) systems trained using FL.
- Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix [12.678765681171022] (2021-06-10)
We show that aggregated model updates in federated learning may be insecure.
An untrusted central server may disaggregate user updates from sums of updates across participants.
Our attack enables the attribution of learned properties to individual users, violating anonymity; a toy sketch of the recovery idea follows below.
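The sketch below is a hypothetical construction for intuition only, not the paper's code, and it relies on a simplifying assumption: if the server observes aggregated updates A = P @ U across rounds, each user's update U is roughly constant, and the participation matrix P can be reconstructed, then disaggregation reduces to an ordinary least-squares solve.

```python
# Toy illustration of gradient disaggregation under the simplifying
# assumption that each user's update is constant across rounds.
# Hypothetical construction for intuition, not the paper's code.
import torch

torch.manual_seed(1)
n_users, n_rounds, dim = 5, 12, 3
U = torch.randn(n_users, dim)                      # secret per-user updates
P = (torch.rand(n_rounds, n_users) > 0.5).float()  # who participated in each round
A = P @ U                                          # aggregates the server observes

# With P known (reconstructed) and of full column rank, least squares
# recovers the individual updates exactly.
U_hat = torch.linalg.lstsq(P, A).solution
print(torch.allclose(U_hat, U, atol=1e-4))         # True: anonymity broken
```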
- Fidel: Reconstructing Private Training Samples from Weight Updates in Federated Learning [0.0] (2021-01-01)
We evaluate a novel attack method within regular federated learning, which we name the First Dense Layer Attack (Fidel).
We show how to recover, on average, twenty out of thirty private data samples from a client's model update when employing a fully connected neural network.
- Federated Learning of User Authentication Models [69.93965074814292] (2020-07-09)
We propose Federated User Authentication (FedUA), a framework for privacy-preserving training of machine learning models.
FedUA adopts the federated learning framework to enable a group of users to jointly train a model without sharing the raw inputs.
We show our method is privacy-preserving, scalable with the number of users, and allows new users to be added to training without changing the output layer.
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092] (2020-05-23)
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides the privacy information from the intermediate representations, while maximally retaining the original information embedded in the raw data so the data collector can accomplish unknown learning tasks.
- Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458] (2020-03-19)
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction.
FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism.
It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.