Related papers: Samplable Anonymous Aggregation for Private Federated Data Analysis

Samplable Anonymous Aggregation for Private Federated Data Analysis

URL: http://arxiv.org/abs/2307.15017v2
Date: Fri, 19 Jul 2024 00:18:40 GMT
Title: Samplable Anonymous Aggregation for Private Federated Data Analysis
Authors: Kunal Talwar, Shan Wang, Audra McMillan, Vojta Jina, Vitaly Feldman, Pansy Bansal, Bailey Basile, Aine Cahill, Yi Sheng Chan, Mike Chatzidakis, Junye Chen, Oliver Chick, Mona Chitnis, Suman Ganta, Yusuf Goren, Filip Granqvist, Kristine Guo, Frederic Jacobs, Omid Javidbakht, Albert Liu, Richard Low, Dan Mascenik, Steve Myers, David Park, Wonhee Park, Gianni Parsa, Tommy Pauly, Christian Priebe, Rehan Rishi, Guy Rothblum, Michael Scaria, Linmao Song, Congzheng Song, Karl Tarbe, Sebastian Vogt, Luke Winstrom, Shundong Zhou,
Abstract summary: Locally differentially private algorithms require little trust but are (provably) limited in their utility. Centrally differentially private algorithms can allow significantly better utility but require a trusted curator. Our first contribution is to propose a new primitive that allows for efficient implementation of several commonly used algorithms.
Score: 25.35309084903802
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Locally differentially private algorithms require little trust but are (provably) limited in their utility. Centrally differentially private algorithms can allow significantly better utility but require a trusted curator. This gap has led to significant interest in the design and implementation of simple cryptographic primitives, that can allow central-like utility guarantees without having to trust a central server. Our first contribution is to propose a new primitive that allows for efficient implementation of several commonly used algorithms, and allows for privacy accounting that is close to that in the central setting without requiring the strong trust assumptions it entails. {\em Shuffling} and {\em aggregation} primitives that have been proposed in earlier works enable this for some algorithms, but have significant limitations as primitives. We propose a {\em Samplable Anonymous Aggregation} primitive, which computes an aggregate over a random subset of the inputs and show that it leads to better privacy-utility trade-offs for various fundamental tasks. Secondly, we propose a system architecture that implements this primitive and perform a security analysis of the proposed system. Our design combines additive secret-sharing with anonymization and authentication infrastructures.

Related papers

Privacy-aware Berrut Approximated Coded Computing applied to general distributed learning [2.8123958518740544]
This paper considers the use of Private Berrut Approximate Coded Computing (PBACC) as a general solution to add strong but non-perfect privacy to federated learning.<n>We derive new adapted PBACC algorithms for centralized aggregation, secure distributed training with centralized data, and secure decentralized training with decentralized data.
arXiv Detail & Related papers (2025-05-10T21:27:40Z)
Differentially private and decentralized randomized power method [15.955127242261808]
We propose a strategy to reduce the variance of the noise introduced to achieve Differential Privacy (DP) We adapt the method to a decentralized framework with a low computational and communication overhead, while preserving the accuracy. We show that it is possible to use a noise scale in the decentralized setting that is similar to the one in the centralized setting.
arXiv Detail & Related papers (2024-11-04T09:53:03Z)
Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms. Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z)
Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite Objectives [10.528569272279999]
This paper proposes a differentially private federated learning algorithm for strongly convex but possibly nonsmooth problems. The proposed algorithm adds artificial noise to the shared information to ensure privacy and dynamically allocates the time-varying noise variance to minimize an upper bound of the optimization error. Numerical results show the superiority of the proposed algorithm over state-of-the-art methods.
arXiv Detail & Related papers (2023-08-02T13:30:33Z)
Private Domain Adaptation from a Public Source [48.83724068578305]
We design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data. Our solutions are based on private variants of Frank-Wolfe and Mirror-Descent algorithms.
arXiv Detail & Related papers (2022-08-12T06:52:55Z)
Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch descent gradient. We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
Smooth Anonymity for Sparse Graphs [69.1048938123063]
differential privacy has emerged as the gold standard of privacy, however, when it comes to sharing sparse datasets. In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing. Since involved data, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z)
Private measures, random walks, and synthetic data [7.5764890276775665]
Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. We develop a private measure from a data set that allows us to efficiently construct private synthetic data. A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables.
arXiv Detail & Related papers (2022-04-20T00:06:52Z)
Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices. An increasing number of such databases consist of data from multiple sources, not all of which can be trusted. This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z)
Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of data based on private data. perturbations chosen independently at every agent, resulting in a significant performance loss. We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
Privacy Amplification via Random Check-Ins [38.72327434015975]
Differentially Private Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data. In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL) wherein the data is distributed among many devices (clients) Our main contribution is the emphrandom check-in distributed protocol, which crucially relies only on randomized participation decisions made locally and independently by each client.
arXiv Detail & Related papers (2020-07-13T18:14:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.