Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive
Privacy Analysis and Beyond
- URL: http://arxiv.org/abs/2207.09087v1
- Date: Tue, 19 Jul 2022 05:47:30 GMT
- Title: Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive
Privacy Analysis and Beyond
- Authors: Yuzheng Hu, Tianle Cai, Jinyong Shan, Shange Tang, Chaochao Cai, Ethan
Song, Bo Li, Dawn Song
- Abstract summary: We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
- Score: 57.10914865054868
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider vertical logistic regression (VLR) trained with mini-batch
gradient descent -- a setting which has attracted growing interest among
industries and proven to be useful in a wide range of applications including
finance and medical research. We provide a comprehensive and rigorous privacy
analysis of VLR in a class of open-source Federated Learning frameworks, where
the protocols may differ from one another, yet all implicitly share the same
procedure for obtaining local gradients. We first consider the
honest-but-curious threat model, in which the detailed implementation of the
protocol is set aside and only the shared procedure, abstracted as an oracle,
is assumed. We find that even in this general setting, a single-dimensional
feature and the labels can still be recovered from the other party under
suitable constraints on the batch size, demonstrating the potential
vulnerability of all frameworks
following the same philosophy. Then we look into a popular instantiation of the
protocol based on Homomorphic Encryption (HE). We propose an active attack that
significantly weakens the batch-size constraints of the previous analysis by
generating and compressing auxiliary ciphertexts. To address the privacy leakage
within the HE-based protocol, we develop a simple-yet-effective countermeasure
based on Differential Privacy (DP), and provide both utility and privacy
guarantees for the updated algorithm. Finally, we empirically verify the
effectiveness of our attack and defense on benchmark datasets. Altogether, our
findings suggest that all vertical federated learning frameworks that solely
depend on HE might contain severe privacy risks, and DP, which has already
demonstrated its power in horizontal federated learning, can also play a
crucial role in the vertical setting, especially when coupled with HE or secure
multi-party computation (MPC) techniques.
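To make the implicitly shared procedure concrete, the sketch below trains a
two-party VLR model with mini-batch gradient descent in plaintext and applies a
Gaussian-noise defense in the spirit of the paper's DP countermeasure. The
party layout, the dp_noisy_gradient helper, and the clip/sigma values are
illustrative assumptions, not the paper's reference implementation; in the
HE-based protocols the residual would be exchanged as ciphertext rather than in
the clear. The final lines also illustrate why small batches are risky: at
batch size 1, the sign of the residual alone reveals the label.

```python
# Minimal plaintext sketch of the local-gradient procedure shared across VLR
# frameworks, plus a Gaussian-mechanism defense. All names and constants here
# are illustrative assumptions, not the paper's reference implementation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Party A holds features X_a and the labels y; party B holds features X_b.
n, d_a, d_b = 256, 3, 4
X_a = rng.normal(size=(n, d_a))
X_b = rng.normal(size=(n, d_b))
y = rng.integers(0, 2, size=n).astype(float)

def dp_noisy_gradient(grad, clip=1.0, sigma=0.2):
    """Clip to L2 norm `clip`, then add Gaussian noise. Calibrating sigma to
    a target (epsilon, delta) via the Gaussian mechanism is omitted here."""
    grad = grad / max(1.0, np.linalg.norm(grad) / clip)
    return grad + rng.normal(scale=sigma * clip, size=grad.shape)

w_a, w_b = np.zeros(d_a), np.zeros(d_b)
lr, batch = 0.5, 32

for step in range(200):
    idx = rng.choice(n, size=batch, replace=False)
    z = X_a[idx] @ w_a + X_b[idx] @ w_b      # joint logit for the mini-batch
    residual = sigmoid(z) - y[idx]           # the quantity both parties need
    # The implicitly shared procedure: each party forms its local gradient
    # from its own features and the jointly derived residual.
    g_a = X_a[idx].T @ residual / batch
    g_b = X_b[idx].T @ residual / batch
    # Countermeasure: perturb gradients before anything leaves a party.
    w_a -= lr * dp_noisy_gradient(g_a)
    w_b -= lr * dp_noisy_gradient(g_b)

# Why small batches leak: with batch size 1 the residual sigmoid(z) - y lies
# in (-1, 0) iff y = 1 and in (0, 1) iff y = 0, so any party that observes it
# in the clear can read the label off its sign.
r = sigmoid(X_a[:1] @ w_a + X_b[:1] @ w_b) - y[:1]
print("recovered label:", int(r[0] < 0), "true label:", int(y[0]))
```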
Related papers
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach for quantifying and guaranteeing privacy protection.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data.
The transmitted model updates can potentially leak sensitive user information, and the lack of central control over the local training process leaves the global model susceptible to malicious manipulations of model updates.
We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
- TernaryVote: Differentially Private, Communication Efficient, and Byzantine Resilient Distributed Optimization on Heterogeneous Data [50.797729676285876]
We propose TernaryVote, which combines a ternary compressor and the majority vote mechanism to realize differential privacy, gradient compression, and Byzantine resilience simultaneously.
We theoretically quantify the privacy guarantee through the lens of the emerging f-differential privacy (DP) and the Byzantine resilience of the proposed algorithm.
arXiv Detail & Related papers (2024-02-16T16:41:14Z)
- Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting [11.185176107646956]
Power system load data can inadvertently reveal the daily routines of residential users, posing a risk to their property security.
We introduce a Markovian Switching-based distributed training framework, the convergence of which is substantiated through rigorous theoretical analysis.
Case studies employing real-world power system load data validate the efficacy of our proposed algorithm.
arXiv Detail & Related papers (2024-02-02T16:39:08Z)
- Towards Vertical Privacy-Preserving Symbolic Regression via Secure Multiparty Computation [3.9103337761169947]
Genetic Programming is the standard search technique for Symbolic Regression.
Privacy-preserving techniques have advanced recently and might offer a solution to this problem, but their application to Symbolic Regression remains largely unexplored.
We propose an approach that employs a privacy-preserving technique called Secure Multiparty Computation to enable parties to jointly build Symbolic Regression models.
arXiv Detail & Related papers (2023-07-22T07:48:42Z)
- Practical Privacy-Preserving Gaussian Process Regression via Secret Sharing [23.80837224347696]
This paper proposes a privacy-preserving GPR method based on secret sharing (SS).
We derive a new SS-based exponentiation operation through the idea of 'confusion-correction' and construct an SS-based matrix inversion algorithm based on Cholesky decomposition.
Empirical results show that our proposed method can achieve reasonable accuracy and efficiency under the premise of preserving data privacy.
arXiv Detail & Related papers (2023-06-26T08:17:51Z)
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for protection mechanisms that protect privacy by distorting model parameters.
It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
- Learning, compression, and leakage: Minimising classification error via meta-universal compression principles [87.054014983402]
A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding.
Here we consider an NML-based decision strategy for supervised classification problems, and show that it attains PAC learning when applied to a wide variety of models.
We show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
arXiv Detail & Related papers (2020-10-14T20:03:58Z)
- FedBoosting: Federated Learning with Gradient Protected Boosting for Text Recognition [7.988454173034258]
The Federated Learning (FL) framework allows learning a shared model collaboratively without data being centralized or shared among data owners.
We show in this paper that the generalization ability of the joint model is poor on Non-Independent and Non-Identically Distributed (Non-IID) data.
We propose a novel boosting algorithm for FL to address both the generalization and gradient leakage issues.
arXiv Detail & Related papers (2020-07-14T18:47:23Z)
- SPEED: Secure, PrivatE, and Efficient Deep learning [2.283665431721732]
We introduce a deep learning framework able to deal with strong privacy constraints.
Based on collaborative learning, differential privacy, and homomorphic encryption, the proposed approach advances the state of the art.
arXiv Detail & Related papers (2020-06-16T19:31:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.