Privacy-Preserving Distributed Learning in the Analog Domain
- URL: http://arxiv.org/abs/2007.08803v1
- Date: Fri, 17 Jul 2020 07:56:39 GMT
- Title: Privacy-Preserving Distributed Learning in the Analog Domain
- Authors: Mahdi Soleymani, Hessam Mahdavifar, A. Salman Avestimehr
- Abstract summary: We consider the problem of distributed learning over data while keeping it private from the computational servers.
We propose a novel algorithm to solve the problem when data is in the analog domain.
We show how the proposed framework can be adapted to perform computation tasks when data is represented using floating-point numbers.
- Score: 23.67685616088422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the critical problem of distributed learning over data while
keeping it private from the computational servers. The state-of-the-art
approaches to this problem rely on quantizing the data into a finite field, so
that the cryptographic approaches for secure multiparty computing can then be
employed. These approaches, however, can result in substantial accuracy losses
due to fixed-point representation of the data and computation overflows. To
address these critical issues, we propose a novel algorithm to solve the
problem when data is in the analog domain, e.g., the field of real/complex
numbers. We characterize the privacy of the data from both
information-theoretic and cryptographic perspectives, while establishing a
connection between the two notions in the analog domain. More specifically, the
well-known connection between the distinguishing security (DS) and the mutual
information security (MIS) metrics is extended from the discrete domain to the
continuous domain. This is then utilized to bound the amount of information
about the data leaked to the servers in our protocol, in terms of the DS
metric, using well-known results on the capacity of a single-input
multiple-output (SIMO) channel with correlated noise. It is shown how the
proposed framework can be adapted to perform computation tasks when data is
represented using floating-point numbers. We then show that this leads to a
fundamental trade-off between the privacy level of data and accuracy of the
result. As an application, we also show how to train a machine learning model
while keeping the data as well as the trained model private. Then numerical
results are shown for experiments on the MNIST dataset. Furthermore,
experimental advantages are shown compared to fixed-point implementations over
finite fields.
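The two sketches below are illustrative additions, not material from the paper. First, a generic Pinsker-style bound that conveys the flavor of a DS-MIS connection in the continuous domain; the paper's exact DS/MIS definitions, constants, and theorem may differ.

```latex
% X: the private data, Z: a server's view of it.
% (i)  Pinsker's inequality, applied for each realization of X;
% (ii) Jensen's inequality (the square root is concave), together with
%      E_X[ D_KL(P_{Z|X} || P_Z) ] = I(X;Z).
\mathbb{E}_{X}\!\left[ d_{\mathrm{TV}}\!\left(P_{Z\mid X},\, P_{Z}\right) \right]
\;\overset{(i)}{\le}\;
\mathbb{E}_{X}\!\left[ \sqrt{\tfrac{1}{2}\, D_{\mathrm{KL}}\!\left(P_{Z\mid X} \,\Vert\, P_{Z}\right)} \right]
\;\overset{(ii)}{\le}\;
\sqrt{\tfrac{1}{2}\, I(X;Z)}
```

Once I(X;Z) is bounded, e.g. via capacity results for a SIMO channel with correlated noise as the abstract describes, a distinguishing-type leakage bound follows.

Second, a minimal Python sketch of one common flavor of noise-based secret sharing over the reals. The function names, the polynomial-evaluation construction, and all parameter values are assumptions for illustration; the paper's exact scheme may differ.

```python
import numpy as np

def share(x, n_servers=5, t_priv=2, noise_std=10.0, rng=None):
    """Hypothetical analog secret sharing: each share is an evaluation of
    f(a) = x + z_1*a + ... + z_t*a^t at a distinct real point, where the
    z_k are i.i.d. Gaussian noise terms masking x from small collusions
    of servers (the residual leakage is what such schemes must quantify)."""
    rng = np.random.default_rng(rng)
    alphas = np.linspace(-1.0, 1.0, n_servers)          # distinct evaluation points
    noise = noise_std * rng.standard_normal((t_priv,) + np.shape(x))
    shares = []
    for a in alphas:
        s = np.array(x, dtype=float)
        for k in range(1, t_priv + 1):
            s = s + noise[k - 1] * (a ** k)
        shares.append(s)
    return alphas, shares

def reconstruct(alphas, shares, t_priv=2):
    """Recover x (the degree-0 coefficient) by solving the Vandermonde system."""
    V = np.vander(np.asarray(alphas), N=t_priv + 1, increasing=True)
    S = np.stack([np.ravel(s) for s in shares])         # one row per server's share
    coeffs, *_ = np.linalg.lstsq(V, S, rcond=None)
    return coeffs[0].reshape(np.shape(shares[0]))

x = np.array([[0.3, -1.2], [2.5, 0.7]])
alphas, shares = share(x, rng=0)
print(np.allclose(reconstruct(alphas, shares), x))      # True, up to floating-point error
```

Since everything stays in floating point, there is no finite-field overflow to manage; instead, the noise variance trades off directly against the numerical accuracy of reconstruction and of any computation on the shares, consistent with the privacy-accuracy trade-off mentioned in the abstract.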
Related papers
- Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z) - LAVA: Data Valuation without Pre-Specified Learning Algorithms [20.578106028270607]
We introduce a new framework that can value training data in a way that is oblivious to the downstream learning algorithm.
We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets.
We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions.
arXiv Detail & Related papers (2023-04-28T19:05:16Z) - Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several Federated Learning algorithms, such as FedAvg, FedProx, and Federated Curvature (FedCurv), have already been proposed.
As a side product of this work, we release the non-IID version of the datasets we used, so as to facilitate further comparisons within the FL community.
arXiv Detail & Related papers (2023-03-31T10:13:01Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for
Federated Learning on Non-IID Data [69.0785021613868]
Federated learning is a distributed machine learning approach which enables a shared server model to learn by aggregating the locally-computed parameter updates with the training data from spatially-distributed client silos.
We propose the Federated Invariant Learning Consistency (FedILC) approach, which leverages the gradient covariance and the geometric mean of Hessians to capture both inter-silo and intra-silo consistencies.
This is relevant to various fields such as healthcare, computer vision, and the Internet of Things (IoT).
arXiv Detail & Related papers (2022-05-19T03:32:03Z) - Data-SUITE: Data-centric identification of in-distribution incongruous
examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z) - A communication efficient distributed learning framework for smart
environments [0.4898659895355355]
This paper proposes a distributed learning framework to move data analytics closer to where data is generated.
Using distributed machine learning techniques, it is possible to drastically reduce the network overhead, while obtaining performance comparable to the cloud solution.
The analysis also shows when each distributed learning approach is preferable, based on the specific distribution of the data on the nodes.
arXiv Detail & Related papers (2021-09-27T13:44:34Z) - Comprehensive Graph-conditional Similarity Preserving Network for
Unsupervised Cross-modal Hashing [97.44152794234405]
Unsupervised cross-modal hashing (UCMH) has become a hot topic recently.
In this paper, we devise a deep graph-neighbor coherence preserving network (DGCPN).
DGCPN regulates comprehensive similarity preserving losses by exploiting three types of data similarities.
arXiv Detail & Related papers (2020-12-25T07:40:59Z) - Anonymizing Sensor Data on the Edge: A Representation Learning and
Transformation Approach [4.920145245773581]
In this paper, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation.
We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data.
We show that it can anonymize data in real time on resource-constrained edge devices.
arXiv Detail & Related papers (2020-11-16T22:32:30Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - User-Level Privacy-Preserving Federated Learning: Analysis and
Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
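A minimal hedged sketch of the noise-addition step described in the UDP entry above (clip the local update to bound one user's influence, then add Gaussian noise before upload). The function and parameter names are illustrative assumptions, not that paper's actual algorithm or API.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Illustrative user-level noise addition: clip the locally computed model
    update, then add Gaussian noise calibrated to the clipping norm before the
    update is uploaded to the server."""
    rng = np.random.default_rng(rng)
    update = np.asarray(update, dtype=float)
    scale = min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
    noise = noise_multiplier * clip_norm * rng.standard_normal(update.shape)
    return update * scale + noise
```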
This list is automatically generated from the titles and abstracts of the papers on this site.