Related papers: Private and Communication-Efficient Algorithms for Entropy Estimation

Private and Communication-Efficient Algorithms for Entropy Estimation

URL: http://arxiv.org/abs/2305.07751v1
Date: Fri, 12 May 2023 20:35:10 GMT
Title: Private and Communication-Efficient Algorithms for Entropy Estimation
Authors: Gecia Bravo-Hermsdorff, R\'obert Busa-Fekete, Mohammad Ghavamzadeh, Andres Mu\~noz Medina, Umar Syed
Abstract summary: We give improved private and communication-efficient algorithms for estimating several popular measures of the entropy of a distribution. All of our algorithms have constant communication cost and satisfy local differential privacy.
Score: 21.195965074919602
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern statistical estimation is often performed in a distributed setting where each sample belongs to a single user who shares their data with a central server. Users are typically concerned with preserving the privacy of their samples, and also with minimizing the amount of data they must transmit to the server. We give improved private and communication-efficient algorithms for estimating several popular measures of the entropy of a distribution. All of our algorithms have constant communication cost and satisfy local differential privacy. For a joint distribution over many variables whose conditional independence is given by a tree, we describe algorithms for estimating Shannon entropy that require a number of samples that is linear in the number of variables, compared to the quadratic sample complexity of prior work. We also describe an algorithm for estimating Gini entropy whose sample complexity has no dependence on the support size of the distribution and can be implemented using a single round of concurrent communication between the users and the server. In contrast, the previously best-known algorithm has high communication cost and requires the server to facilitate interaction between the users. Finally, we describe an algorithm for estimating collision entropy that generalizes the best known algorithm to the private and communication-efficient setting.

Related papers

Sample-Optimal Private Regression in Polynomial Time [3.3748750222488657]
We show that any improvement to the sample complexity of our algorithm would violate either statistical-query or information-theoretic lower bounds. Our algorithm is robust to a small fraction of arbitrary outliers and achieves optimal error rates as a function of the fraction of outliers.
arXiv Detail & Related papers (2025-03-31T17:08:12Z)
Sublinear Algorithms for Wasserstein and Total Variation Distances: Applications to Fairness and Privacy Auditing [7.81603404636933]
We propose a generic algorithmic framework to estimate the PDF and CDF of any sub-Gaussian distribution while the samples from them arrive in a stream. We compute mergeable summaries of distributions from the stream of samples that require sublinear space w.r.t. the number of observed samples. This allows us to estimate Wasserstein and Total Variation (TV) distances between any two sub-Gaussian while samples arrive in streams and from multiple sources.
arXiv Detail & Related papers (2025-03-10T18:57:48Z)
Scalable Decentralized Algorithms for Online Personalized Mean Estimation [12.002609934938224]
This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean. We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach.
arXiv Detail & Related papers (2024-02-20T08:30:46Z)
Fast Optimal Locally Private Mean Estimation via Random Projections [58.603579803010796]
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. We propose a new algorithmic framework, ProjUnit, for private mean estimation. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.
arXiv Detail & Related papers (2023-06-07T14:07:35Z)
Balancing Privacy and Performance for Private Federated Learning Algorithms [4.681076651230371]
Federated learning (FL) is a distributed machine learning framework where multiple clients collaborate to train a model without exposing their private data. FL algorithms frequently employ a differential privacy mechanism that introduces noise into each client's model updates before sharing. We show that an optimal balance exists between the number of local steps and communication rounds, one that maximizes the convergence performance within a given privacy budget.
arXiv Detail & Related papers (2023-04-11T10:42:11Z)
Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean Estimation [8.9598796481325]
We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems. We establish an information-computation gap for private sparse mean estimation. We also give evidence for privacy-induced information-computation gaps for several other statistics and learning problems.
arXiv Detail & Related papers (2022-11-01T20:03:41Z)
On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference. We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap. Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z)
Collaborative Learning of Distributions under Heterogeneity and Communication Constraints [35.82172666266493]
In machine learning, users often have to collaborate to learn distributions that generate the data. We propose a novel two-stage method named SHIFT: First, the users collaborate by communicating with the server to learn a central distribution. Then, the learned central distribution is fine-tuned to estimate the individual distributions of users.
arXiv Detail & Related papers (2022-06-01T18:43:06Z)
Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank. Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
Lossless Compression of Efficient Private Local Randomizers [55.657133416044104]
Locally Differentially Private (LDP) Reports are commonly used for collection of statistics and machine learning in the federated setting. In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server. This has led to significant efforts on reducing the communication cost of LDP algorithms.
arXiv Detail & Related papers (2021-02-24T07:04:30Z)
Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution. We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices. An increasing number of such databases consist of data from multiple sources, not all of which can be trusted. This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z)
Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems. Our results imply an improved algorithm for the Sample and Aggregate privacy framework. One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.