Private and Communication-Efficient Algorithms for Entropy Estimation
- URL: http://arxiv.org/abs/2305.07751v1
- Date: Fri, 12 May 2023 20:35:10 GMT
- Title: Private and Communication-Efficient Algorithms for Entropy Estimation
- Authors: Gecia Bravo-Hermsdorff, R\'obert Busa-Fekete, Mohammad Ghavamzadeh,
Andres Mu\~noz Medina, Umar Syed
- Abstract summary: We give improved private and communication-efficient algorithms for estimating several popular measures of the entropy of a distribution.
All of our algorithms have constant communication cost and satisfy local differential privacy.
- Score: 21.195965074919602
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern statistical estimation is often performed in a distributed setting
where each sample belongs to a single user who shares their data with a central
server. Users are typically concerned with preserving the privacy of their
samples, and also with minimizing the amount of data they must transmit to the
server. We give improved private and communication-efficient algorithms for
estimating several popular measures of the entropy of a distribution. All of
our algorithms have constant communication cost and satisfy local differential
privacy. For a joint distribution over many variables whose conditional
independence is given by a tree, we describe algorithms for estimating Shannon
entropy that require a number of samples that is linear in the number of
variables, compared to the quadratic sample complexity of prior work. We also
describe an algorithm for estimating Gini entropy whose sample complexity has
no dependence on the support size of the distribution and can be implemented
using a single round of concurrent communication between the users and the
server. In contrast, the previously best-known algorithm has high communication
cost and requires the server to facilitate interaction between the users.
Finally, we describe an algorithm for estimating collision entropy that
generalizes the best known algorithm to the private and communication-efficient
setting.
Related papers
- Scalable Decentralized Algorithms for Online Personalized Mean Estimation [12.002609934938224]
This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean.
We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach.
arXiv Detail & Related papers (2024-02-20T08:30:46Z) - Fast Optimal Locally Private Mean Estimation via Random Projections [58.603579803010796]
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball.
We propose a new algorithmic framework, ProjUnit, for private mean estimation.
Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.
arXiv Detail & Related papers (2023-06-07T14:07:35Z) - Balancing Privacy and Performance for Private Federated Learning
Algorithms [4.681076651230371]
Federated learning (FL) is a distributed machine learning framework where multiple clients collaborate to train a model without exposing their private data.
FL algorithms frequently employ a differential privacy mechanism that introduces noise into each client's model updates before sharing.
We show that an optimal balance exists between the number of local steps and communication rounds, one that maximizes the convergence performance within a given privacy budget.
arXiv Detail & Related papers (2023-04-11T10:42:11Z) - Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean
Estimation [8.9598796481325]
We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems.
We establish an information-computation gap for private sparse mean estimation.
We also give evidence for privacy-induced information-computation gaps for several other statistics and learning problems.
arXiv Detail & Related papers (2022-11-01T20:03:41Z) - On Differential Privacy for Federated Learning in Wireless Systems with
Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z) - Collaborative Learning of Distributions under Heterogeneity and
Communication Constraints [35.82172666266493]
In machine learning, users often have to collaborate to learn distributions that generate the data.
We propose a novel two-stage method named SHIFT: First, the users collaborate by communicating with the server to learn a central distribution.
Then, the learned central distribution is fine-tuned to estimate the individual distributions of users.
arXiv Detail & Related papers (2022-06-01T18:43:06Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Lossless Compression of Efficient Private Local Randomizers [55.657133416044104]
Locally Differentially Private (LDP) Reports are commonly used for collection of statistics and machine learning in the federated setting.
In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server.
This has led to significant efforts on reducing the communication cost of LDP algorithms.
arXiv Detail & Related papers (2021-02-24T07:04:30Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices.
An increasing number of such databases consist of data from multiple sources, not all of which can be trusted.
This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z) - Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.