Related papers: Private Federated Frequency Estimation: Adapting to the Hardness of the Instance

Private Federated Frequency Estimation: Adapting to the Hardness of the Instance

URL: http://arxiv.org/abs/2306.09396v2
Date: Sat, 2 Dec 2023 18:10:55 GMT
Title: Private Federated Frequency Estimation: Adapting to the Hardness of the Instance
Authors: Jingfeng Wu, Wennan Zhu, Peter Kairouz, Vladimir Braverman
Abstract summary: In federated frequency estimation (FFE), multiple clients work together to estimate the frequencies of their collective data. We show that under the more practical multi-round FEE setting, simple adaptations of count sketching are strictly sub-optimal. We propose a novel hybrid sketching algorithm that is provably more accurate.
Score: 40.518740805553634
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In federated frequency estimation (FFE), multiple clients work together to estimate the frequencies of their collective data by communicating with a server that respects the privacy constraints of Secure Summation (SecSum), a cryptographic multi-party computation protocol that ensures that the server can only access the sum of client-held vectors. For single-round FFE, it is known that count sketching is nearly information-theoretically optimal for achieving the fundamental accuracy-communication trade-offs [Chen et al., 2022]. However, we show that under the more practical multi-round FEE setting, simple adaptations of count sketching are strictly sub-optimal, and we propose a novel hybrid sketching algorithm that is provably more accurate. We also address the following fundamental question: how should a practitioner set the sketch size in a way that adapts to the hardness of the underlying problem? We propose a two-phase approach that allows for the use of a smaller sketch size for simpler problems (e.g., near-sparse or light-tailed distributions). We conclude our work by showing how differential privacy can be added to our algorithm and verifying its superior performance through extensive experiments conducted on large-scale datasets.

Related papers

PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a. Federated Anomaly Detection framework named PeFAD with the increasing privacy concerns. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
Share Your Secrets for Privacy! Confidential Forecasting with Vertical Federated Learning [5.584904689846748]
Key challenges to address in manufacturing include data privacy and over-fitting on small and noisy datasets. We propose 'Secret-shared Time Series Forecasting with VFL', a novel framework that exhibits the following key features. Our results demonstrate that STV's forecasting accuracy is comparable to those of centralized approaches.
arXiv Detail & Related papers (2024-05-31T12:27:38Z)
Balancing Privacy and Performance for Private Federated Learning Algorithms [4.681076651230371]
Federated learning (FL) is a distributed machine learning framework where multiple clients collaborate to train a model without exposing their private data. FL algorithms frequently employ a differential privacy mechanism that introduces noise into each client's model updates before sharing. We show that an optimal balance exists between the number of local steps and communication rounds, one that maximizes the convergence performance within a given privacy budget.
arXiv Detail & Related papers (2023-04-11T10:42:11Z)
When approximate design for fast homomorphic computation provides differential privacy guarantees [0.08399688944263842]
Differential privacy (DP) and cryptographic primitives are popular countermeasures against privacy attacks. In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator. Even if SHIELD could have other applications, we here focus on one setting and seamlessly integrate it in the SPEED collaborative training framework.
arXiv Detail & Related papers (2023-04-06T09:38:01Z)
On Differential Privacy and Adaptive Data Analysis with Bounded Space [76.10334958368618]
We study the space complexity of the two related fields of differential privacy and adaptive data analysis. We show that there exists a problem P that requires exponentially more space to be solved efficiently with differential privacy. The line of work on adaptive data analysis focuses on understanding the number of samples needed for answering a sequence of adaptive queries.
arXiv Detail & Related papers (2023-02-11T14:45:31Z)
On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference. We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap. Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z)
Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch descent gradient. We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
Differential Secrecy for Distributed Data and Applications to Robust Differentially Secure Vector Summation [32.004283989604154]
We present a protocol for vector summation that verifies that the Euclidean norm of each contribution is approximately bounded. Unlike SMC algorithms that inevitably cast integers to elements of a large finite field, our algorithms work over integers/reals, which may allow for additional efficiencies.
arXiv Detail & Related papers (2022-02-22T02:06:42Z)
A One-Pass Private Sketch for Most Machine Learning Tasks [48.17461258268463]
Differential privacy (DP) is a compelling privacy definition that explains the privacy-utility tradeoff via formal, provable guarantees. We propose a private sketch that supports a multitude of machine learning tasks including regression, classification, density estimation, and more. Our sketch consists of randomized contingency tables that are indexed with locality-sensitive hashing and constructed with an efficient one-pass algorithm.
arXiv Detail & Related papers (2020-06-16T17:47:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.