Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods
- URL: http://arxiv.org/abs/2106.07153v1
- Date: Mon, 14 Jun 2021 04:19:35 GMT
- Title: Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods
- Authors: Terrance Liu, Giuseppe Vietri, Zhiwei Steven Wu
- Abstract summary: We study private synthetic data generation for query release.
The goal is to construct a sanitized version of a sensitive dataset subject to differential privacy.
Under this framework, we propose two new methods.
- Score: 18.317488965846636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study private synthetic data generation for query release, where the goal
is to construct a sanitized version of a sensitive dataset, subject to
differential privacy, that approximately preserves the answers to a large
collection of statistical queries. We first present an algorithmic framework
that unifies a long line of iterative algorithms in the literature. Under this
framework, we propose two new methods. The first method, private entropy
projection (PEP), can be viewed as an advanced variant of MWEM that adaptively
reuses past query measurements to boost accuracy. Our second method, generative
networks with the exponential mechanism (GEM), circumvents computational
bottlenecks in algorithms such as MWEM and PEP by optimizing over generative
models parameterized by neural networks, which capture a rich family of
distributions while enabling fast gradient-based optimization. We demonstrate
that PEP and GEM empirically outperform existing algorithms. Furthermore, we
show that GEM nicely incorporates prior information from public data while
overcoming limitations of PMW^Pub, the existing state-of-the-art method that
also leverages public data.
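The iterative framework the abstract describes (privately select a poorly approximated query, measure it with noise, then update the synthetic distribution) can be illustrated with a minimal MWEM-style loop. The sketch below is not the authors' PEP or GEM implementation: the toy domain, the random subset-count queries, and the helper names (`answer`, `mwem`) are all assumptions made for illustration, and the privacy-budget split is the standard MWEM one (half per round for selection via the exponential mechanism, half for Laplace measurement).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: a discrete domain of 16 record types and random
# subset-counting queries (each query is a boolean mask over the domain).
DOMAIN = 16
N = 500
true_data = rng.integers(0, DOMAIN, size=N)
true_hist = np.bincount(true_data, minlength=DOMAIN) / N
queries = [rng.random(DOMAIN) < 0.5 for _ in range(50)]

def answer(hist, q):
    """Fraction of the distribution satisfying boolean query mask q."""
    return hist[q].sum()

def mwem(true_hist, queries, T=10, eps=1.0, n=N):
    """MWEM-style iteration: exponential-mechanism selection of a
    high-error query, noisy Laplace measurement, then a
    multiplicative-weights update of the synthetic distribution."""
    eps_round = eps / (2 * T)  # per-round budget, split select/measure
    synth = np.full(DOMAIN, 1.0 / DOMAIN)  # start from uniform
    for _ in range(T):
        errors = np.array(
            [abs(answer(true_hist, q) - answer(synth, q)) for q in queries]
        )
        # Exponential mechanism over per-query error (sensitivity 1/n).
        scores = eps_round * n * errors / 2
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        i = rng.choice(len(queries), p=probs)
        # Laplace measurement of the selected query (sensitivity 1/n).
        q = queries[i]
        noisy = answer(true_hist, q) + rng.laplace(scale=1.0 / (n * eps_round))
        # Multiplicative-weights update toward the noisy measurement.
        synth = synth * np.exp(q * (noisy - answer(synth, q)) / 2)
        synth /= synth.sum()
    return synth
```

PEP replaces the single multiplicative-weights step with an entropy projection that re-fits all past measurements each round, and GEM replaces the explicit histogram `synth` with a neural generator, which is what avoids the exponential blow-up of the domain in high dimensions.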
Related papers
- Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation [13.009945735929445]
We propose a novel paradigm to solve salient problems plaguing the Automatic Speech Recognition field.
In the first stage, multiple acoustic models are trained based upon different subsets of the complete speech data.
In the second stage, two novel algorithms are utilized to generate a high-quality acoustic model.
arXiv Detail & Related papers (2024-10-21T03:48:23Z)
- Unraveling Rodeo Algorithm Through the Zeeman Model [0.0]
We unravel the Rodeo Algorithm to determine the eigenstates and eigenvalues spectrum for a general Hamiltonian considering arbitrary initial states.
We exploit Pennylane and Qiskit platforms resources to analyze scenarios where the Hamiltonians are described by the Zeeman model for one and two spins.
arXiv Detail & Related papers (2024-07-16T01:29:25Z)
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
- GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
- Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z)
- Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient Descent [8.714458129632158]
The Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm, based on the regularized dual optimization combined with enhanced gradient descent (GD) method.
It is shown that the accuracy of logical relation mining for interpretability by using the proposed KM learning algorithm exceeds 80%.
arXiv Detail & Related papers (2021-07-11T10:33:02Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Local Differential Privacy for Bayesian Optimization [12.05395706770007]
We consider a black-box optimization in the nonparametric Gaussian process setting with local differential privacy (LDP) guarantee.
Specifically, the rewards from each user are further corrupted to protect privacy and the learner only has access to the corrupted rewards to minimize the regret.
We present three almost optimal algorithms based on the GP-UCB framework and Laplace DP mechanism.
arXiv Detail & Related papers (2020-10-13T21:50:09Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
- Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees [1.7640556247739623]
MESP has been widely applied to many areas, including healthcare, power system, manufacturing and data science.
We derive a novel convex integer program for MESP and show that its continuous relaxation yields a near-optimal solution.
Our numerical experiments demonstrate that these approximation algorithms can efficiently solve medium-sized and large-scale instances to near-optimality.
arXiv Detail & Related papers (2020-01-23T14:14:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.