On InstaHide, Phase Retrieval, and Sparse Matrix Factorization
- URL: http://arxiv.org/abs/2011.11181v2
- Date: Thu, 25 Mar 2021 00:08:38 GMT
- Title: On InstaHide, Phase Retrieval, and Sparse Matrix Factorization
- Authors: Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo
- Abstract summary: InstaHide is a scheme for preserving the security of private datasets in the context of distributed learning.
A salient question is whether this scheme is secure in any provable sense.
We show that the answer appears to be quite subtle and closely related to the average-case complexity of a new multi-task, missing-data version of the classic problem of phase retrieval.
- Score: 32.93116534310763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we examine the security of InstaHide, a scheme recently
proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of
private datasets in the context of distributed learning. To generate a
synthetic training example to be shared among the distributed learners,
InstaHide takes a convex combination of private feature vectors and randomly
flips the sign of each entry of the resulting vector with probability 1/2. A
salient question is whether this scheme is secure in any provable sense,
perhaps under a plausible hardness assumption and assuming the distributions
generating the public and private data satisfy certain properties.
We show that the answer to this appears to be quite subtle and closely
related to the average-case complexity of a new multi-task, missing-data
version of the classic problem of phase retrieval. Motivated by this
connection, we design a provable algorithm that can recover private vectors
using only the public vectors and synthetic vectors generated by InstaHide,
under the assumption that the private and public vectors are isotropic
Gaussian.
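The encoding step described in the abstract (a convex combination of feature vectors followed by an independent random sign flip of each entry with probability 1/2) can be sketched as follows. The number of mixed vectors, the mixing weights, and all names are illustrative assumptions, not the scheme's exact parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def instahide_encode(private, public, k_priv=2, k_pub=2):
    """Sketch of the InstaHide-style encoding from the abstract:
    mix a few private and public feature vectors with random convex
    weights, then flip each entry's sign with probability 1/2.
    k_priv, k_pub and the weight distribution are illustrative choices."""
    d = private.shape[1]
    # choose which private and public vectors to mix
    idx_priv = rng.choice(len(private), size=k_priv, replace=False)
    idx_pub = rng.choice(len(public), size=k_pub, replace=False)
    chosen = np.vstack([private[idx_priv], public[idx_pub]])
    # random convex combination weights: nonnegative, summing to 1
    w = rng.random(k_priv + k_pub)
    w /= w.sum()
    mixed = w @ chosen
    # flip the sign of each coordinate independently with prob. 1/2
    signs = rng.choice([-1.0, 1.0], size=d)
    return signs * mixed
```

The sign flips destroy the entry-wise signs of the mixture, which is what links recovery of the private vectors to a phase-retrieval-type problem.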
Related papers
- PREAMBLE: Private and Efficient Aggregation of Block Sparse Vectors and Applications [42.968231105076335]
We revisit the problem of secure aggregation of high-dimensional vectors in a two-server system such as Prio.
PREAMBLE is a novel extension of distributed point functions that enables communication- and computation-efficient aggregation.
arXiv Detail & Related papers (2025-03-14T21:58:15Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]

Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain only a few samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Learning a Gaussian Mixture for Sparsity Regularization in Inverse Problems [2.375943263571389]
In inverse problems, the incorporation of a sparsity prior yields a regularization effect on the solution.
We propose a probabilistic sparsity prior formulated as a mixture of Gaussians, capable of modeling sparsity with respect to a generic basis.
We put forth both a supervised and an unsupervised training strategy to estimate the parameters of this network.
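A common way to model sparsity with a Gaussian mixture, not necessarily this paper's exact construction, is a two-component spike-and-slab-style prior on each coefficient: a narrow component near zero and a wide component for the few active coefficients. A minimal sketch, with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sparse_coeffs(n, pi=0.9, sigma_spike=0.01, sigma_slab=1.0):
    """Sample coefficients from a two-component Gaussian mixture:
    with probability pi, a narrow 'spike' near zero (promoting sparsity),
    otherwise a wide 'slab'. All parameters are illustrative."""
    spike = rng.random(n) < pi
    sigma = np.where(spike, sigma_spike, sigma_slab)
    return rng.normal(0.0, sigma)
```

Most draws land near zero while a small fraction are large, mimicking a sparse signal in the chosen basis.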
arXiv Detail & Related papers (2024-01-29T22:52:57Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that exploits the different responses of normal and adversarial samples to universal adversarial perturbations (UAPs).
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Deterministic equivalent and error universality of deep random features learning [4.8461049669050915]
This problem can be seen as a natural generalization of the widely studied random features model to deeper architectures.
First, we prove universality of the test error in a ridge regression setting where the learner and target networks share the same intermediate layers, and we provide a sharp formula for it.
Second, we conjecture the universality of the test error in the more general setting of arbitrary convex losses and generic learner/target architectures.
arXiv Detail & Related papers (2023-02-01T12:37:10Z)
- Differential Secrecy for Distributed Data and Applications to Robust Differentially Secure Vector Summation [32.004283989604154]
We present a protocol for vector summation that verifies that the Euclidean norm of each contribution is approximately bounded.
Unlike SMC algorithms that inevitably cast integers to elements of a large finite field, our algorithms work over integers/reals, which may allow for additional efficiencies.
arXiv Detail & Related papers (2022-02-22T02:06:42Z)
- Differential Privacy of Dirichlet Posterior Sampling [0.0]
We study the inherent privacy of releasing a single draw from a Dirichlet posterior distribution.
With the notion of truncated concentrated differential privacy (tCDP), we are able to derive a simple privacy guarantee of the Dirichlet posterior sampling.
arXiv Detail & Related papers (2021-10-03T07:41:19Z)
- Causally-motivated Shortcut Removal Using Auxiliary Labels [63.686580185674195]
A key challenge in learning risk-invariant predictors is shortcut learning.
We propose a flexible, causally-motivated approach to address this challenge.
We show both theoretically and empirically that this causally-motivated regularization scheme yields robust predictors.
arXiv Detail & Related papers (2021-05-13T16:58:45Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features as a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster and to leverage that distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- Parsimonious Feature Extraction Methods: Extending Robust Probabilistic Projections with Generalized Skew-t [0.8336315962271392]
We propose a novel generalisation to the Student-t Probabilistic Principal Component methodology.
The new framework provides a more flexible approach to modelling groups of marginal tail dependence in the observation data.
The applicability of the new framework is illustrated on a data set consisting of the cryptocurrencies with the highest market capitalisation.
arXiv Detail & Related papers (2020-09-24T05:53:41Z)
- Learning Output Embeddings in Structured Prediction [73.99064151691597]
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension.
A prediction in the original space is computed by solving a pre-image problem.
In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space.
arXiv Detail & Related papers (2020-07-29T09:32:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.