Masked LARk: Masked Learning, Aggregation and Reporting worKflow
- URL: http://arxiv.org/abs/2110.14794v1
- Date: Wed, 27 Oct 2021 21:59:37 GMT
- Title: Masked LARk: Masked Learning, Aggregation and Reporting worKflow
- Authors: Joseph J. Pfeiffer III, Denis Charles, Davis Gilton, Young Hun Jung, Mehul Parsana and Erik Anderson
- Abstract summary: Many web advertising data flows involve passive cross-site tracking of users.
Most browsers are moving towards removal of third-party cookies (3PC) in subsequent iterations.
We present a new proposal, called Masked LARk, for aggregation of user engagement measurement and model training.
- Score: 6.484847460164177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Today, many web advertising data flows involve passive cross-site tracking of
users. Enabling such a mechanism through the usage of third party tracking
cookies (3PC) exposes sensitive user data to a large number of parties, with
little oversight on how that data can be used. Thus, most browsers are moving
towards removal of 3PC in subsequent browser iterations. In order to
substantially improve end-user privacy while allowing sites to continue to
sustain their business through ad funding, new privacy-preserving primitives
need to be introduced.
In this paper, we discuss a new proposal, called Masked LARk, for aggregation
of user engagement measurement and model training that prevents cross-site
tracking, while remaining (a) flexible, for engineering development and
maintenance, (b) secure, in the sense that cross-site tracking and tracing are
blocked and (c) open for continued model development and training, allowing
advertisers to serve relevant ads to interested users. We introduce a secure
multi-party computation (MPC) protocol that utilizes "helper" parties to train
models, so that once data leaves the browser, no downstream system can
individually construct a complete picture of the user activity. For training,
our key innovation is the use of masking, i.e., the obfuscation of the
true labels, while still allowing a gradient to be accurately computed in
aggregate over a batch of data. Our protocol uses only lightweight cryptography,
at such a level that an interested yet inexperienced reader can understand the
core algorithm. We develop helper endpoints that implement this system, and
give example usage of training in PyTorch.
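To make the masking idea concrete, the following is a minimal PyTorch sketch, not the paper's helper implementation: it assumes a loss that is linear in the label (cross-entropy against one-hot labels), so gradients computed against two additive secret shares of the label sum exactly to the gradient on the true label. The toy model, the data, and the two-helper split shown here are illustrative.

```python
# A minimal sketch of gradient computation over masked labels, assuming a
# loss that is linear in the label (cross-entropy against one-hot labels).
# This is an illustration of the principle, not the paper's helper code.
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 3)   # toy model; both helpers hold the same weights
x = torch.randn(8, 4)           # a batch of feature vectors
y = torch.nn.functional.one_hot(torch.randint(0, 3, (8,)), 3).float()

# Browser-side masking: split each label into two additive secret shares,
# y = share_a + share_b, so neither helper alone sees the true label.
r = torch.randn_like(y)
share_a, share_b = r, y - r

def grad_on_share(share):
    """Gradient of the label-linear cross-entropy w.r.t. the model
    parameters, computed against a single label share."""
    model.zero_grad()
    log_p = torch.log_softmax(model(x), dim=1)
    (-(share * log_p).sum()).backward()
    return [p.grad.clone() for p in model.parameters()]

g_a = grad_on_share(share_a)    # helper A's contribution
g_b = grad_on_share(share_b)    # helper B's contribution

# Reference: the gradient on the unmasked labels.
model.zero_grad()
(-(y * torch.log_softmax(model(x), dim=1)).sum()).backward()

for ga, gb, p in zip(g_a, g_b, model.parameters()):
    assert torch.allclose(ga + gb, p.grad, atol=1e-4)
print("sum of per-share gradients equals the gradient on the true labels")
```

Because the loss, and hence backpropagation, is linear in the label, each helper computes its gradient against what looks like pure noise from its point of view, yet the two contributions sum to the true batch gradient.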
Related papers
- FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting [10.671588861099323] (2023-11-28)
Browser fingerprinting provides an attractive alternative to third-party cookies for tracking users across the web.
Previous work proposed several techniques to detect its prevalence and severity.
We present FP-Fed, the first distributed system for browser fingerprinting detection.
- Privacy Side Channels in Machine Learning Systems [87.53240071195168] (2023-09-11)
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
- PURL: Safe and Effective Sanitization of Link Decoration [20.03929841111819] (2023-08-07)
We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration.
Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage.
- Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423] (2023-04-06)
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
- Privacy-Preserving Online Content Moderation: A Federated Learning Use Case [3.1925030748447747] (2022-09-23)
Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices.
We propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP).
We show that the proposed FL framework's performance can be close to that of the centralized approach, for both the DP and non-DP FL versions.
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725] (2021-12-26)
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for speech emotion recognition (SER) systems trained using FL.
- Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix [12.678765681171022] (2021-06-10)
We show that aggregated model updates in federated learning may be insecure.
An untrusted central server may disaggregate user updates from sums of updates across participants.
Our attack enables the attribution of learned properties to individual users, violating anonymity; a toy sketch of the recovery idea follows below.
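The sketch below is a hypothetical construction for intuition only, not the paper's code, and it relies on a simplifying assumption: if the server observes aggregated updates A = P @ U across rounds, each user's update U is roughly constant, and the participation matrix P can be reconstructed, then disaggregation reduces to an ordinary least-squares solve.

```python
# Toy illustration of gradient disaggregation under the simplifying
# assumption that each user's update is constant across rounds.
# Hypothetical construction for intuition, not the paper's code.
import torch

torch.manual_seed(1)
n_users, n_rounds, dim = 5, 12, 3
U = torch.randn(n_users, dim)                      # secret per-user updates
P = (torch.rand(n_rounds, n_users) > 0.5).float()  # who participated in each round
A = P @ U                                          # aggregates the server observes

# With P known (reconstructed) and of full column rank, least squares
# recovers the individual updates exactly.
U_hat = torch.linalg.lstsq(P, A).solution
print(torch.allclose(U_hat, U, atol=1e-4))         # True: anonymity broken
```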
- Fidel: Reconstructing Private Training Samples from Weight Updates in Federated Learning [0.0] (2021-01-01)
We evaluate a novel attack method within regular federated learning, which we name the First Dense Layer Attack (Fidel).
We show how to recover, on average, twenty out of thirty private data samples from a client's model update when employing a fully connected neural network.
- Federated Learning of User Authentication Models [69.93965074814292] (2020-07-09)
We propose Federated User Authentication (FedUA), a framework for privacy-preserving training of machine learning models.
FedUA adopts the federated learning framework to enable a group of users to jointly train a model without sharing the raw inputs.
We show our method is privacy-preserving, scalable with the number of users, and allows new users to be added to training without changing the output layer.
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092] (2020-05-23)
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides the privacy information from the intermediate representations, while maximally retaining the original information embedded in the raw data so the data collector can accomplish unknown learning tasks.
- Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458] (2020-03-19)
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction.
FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism.
It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.