SoK: Enhancing Cryptographic Collaborative Learning with Differential Privacy
- URL: http://arxiv.org/abs/2601.09460v1
- Date: Wed, 14 Jan 2026 13:09:29 GMT
- Title: SoK: Enhancing Cryptographic Collaborative Learning with Differential Privacy
- Authors: Francesco Capano, Jonas Böhler, Benjamin Weggenmann
- Abstract summary: In collaborative learning (CL), multiple parties jointly train a machine learning model on their private datasets. Even securely trained models are vulnerable to inference attacks aiming to extract memorized data from model outputs. To ensure output privacy and mitigate inference attacks, differential privacy (DP) injects calibrated noise during training.
- Score: 2.6457630272965074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In collaborative learning (CL), multiple parties jointly train a machine learning model on their private datasets. However, the data cannot be shared directly due to privacy concerns. To ensure input confidentiality, cryptographic techniques, e.g., multi-party computation (MPC), enable training on encrypted data. Yet even securely trained models are vulnerable to inference attacks that aim to extract memorized data from model outputs. To ensure output privacy and mitigate inference attacks, differential privacy (DP) injects calibrated noise during training. While cryptography and DP offer complementary guarantees, combining them efficiently for cryptographic and differentially private CL (CPCL) is challenging. Cryptography incurs performance overhead, while DP degrades accuracy, creating a privacy-accuracy-performance trade-off that requires careful design. This work systematizes the CPCL landscape. We introduce a unified framework that generalizes common phases across CPCL paradigms and identify secure noise sampling as the foundational phase for achieving CPCL. We analyze the trade-offs of different secure noise sampling techniques, noise types, and DP mechanisms, discussing their implementation challenges and evaluating their accuracy and cryptographic overhead across CPCL paradigms. Additionally, we implement the identified secure noise sampling options in MPC and evaluate their computation and communication costs in WAN and LAN settings. Finally, we propose future research directions based on key observations, gaps, and possible enhancements identified in the literature.
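A common pattern for the secure noise sampling phase highlighted above (one of several options such a survey can cover; names and parameters here are illustrative) is distributed noise generation: each party contributes a partial Gaussian sample so that the aggregate has the calibrated variance while no single party learns the total noise. A plaintext sketch of the statistics, outside MPC:

```python
import numpy as np

def distributed_gaussian_noise(num_parties: int, sigma: float, size: int,
                               rng: np.random.Generator) -> np.ndarray:
    """Each party samples N(0, sigma^2 / num_parties); the sum of the
    partial samples is N(0, sigma^2). In a real CPCL system the partial
    samples would be secret-shared and summed inside the protocol, so
    no party ever sees the aggregate noise in the clear."""
    partial_sigma = sigma / np.sqrt(num_parties)
    partials = [rng.normal(0.0, partial_sigma, size) for _ in range(num_parties)]
    return np.sum(partials, axis=0)

rng = np.random.default_rng(0)
noise = distributed_gaussian_noise(num_parties=3, sigma=1.0, size=100_000, rng=rng)
print(round(float(noise.std()), 3))  # ~1.0: the aggregate matches the target scale
```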
Related papers
- A Privacy-Preserving Federated Learning Method with Homomorphic Encryption in Omics Data [19.04813252998036]
Homomorphic Encryption (HE) allows computations on encrypted data and enables aggregation of encrypted gradients without DP-induced noise. We propose a Privacy-Preserving Machine Learning (PPML)-Hybrid method by introducing HE. Our proposed method achieves comparable predictive accuracy while significantly reducing computation time relative to an HE-only approach.
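As a rough illustration of the HE-aggregation idea (not the paper's omics pipeline), the sketch below sums encrypted gradients with the python-paillier (`phe`) package; gradients, party count, and key size are illustrative.

```python
from phe import paillier  # pip install phe

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each party encrypts its local gradient entry-wise.
local_gradients = [[0.12, -0.40], [0.05, 0.33], [-0.21, 0.08]]
encrypted = [[public_key.encrypt(g) for g in grad] for grad in local_gradients]

# The aggregator adds ciphertexts coordinate-wise without decrypting,
# so it never sees any individual party's gradient.
aggregate = [sum(column) for column in zip(*encrypted)]

# Only the key holder can decrypt, and only the aggregate is revealed.
average = [private_key.decrypt(c) / len(local_gradients) for c in aggregate]
print(average)
```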
arXiv Detail & Related papers (2025-11-08T16:18:42Z)
- Conformal Prediction for Privacy-Preserving Machine Learning [83.88591755871734]
Using AES-encrypted variants of the MNIST dataset, we demonstrate that conformal prediction methods remain effective even when applied directly in the encrypted domain. Our work sets a foundation for principled uncertainty quantification in secure, privacy-aware learning systems.
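For orientation, plain split conformal prediction (the paper's encrypted-domain specifics aside) reduces to calibrating a score threshold and thresholding test scores; function and variable names below are my own.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction for classification: calibrate a
    nonconformity threshold on held-out data, then return prediction
    sets with >= 1 - alpha marginal coverage."""
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    # Keep every class whose score clears the calibrated threshold.
    return [np.where(1.0 - p <= q)[0] for p in test_probs]
```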
arXiv Detail & Related papers (2025-07-13T15:29:14Z)
- Differential Privacy in Machine Learning: From Symbolic AI to LLMs [49.1574468325115]
Differential privacy provides a formal framework to mitigate privacy risks. It ensures that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm.
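For reference, the formal guarantee this describes is (ε, δ)-differential privacy:

```latex
% For all neighboring datasets D, D' (differing in a single record)
% and all measurable output sets S:
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \,\Pr[\mathcal{M}(D') \in S] + \delta
```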
arXiv Detail & Related papers (2025-06-13T11:30:35Z)
- Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning [54.20871516148981]
We introduce the Communication-Efficient and Privacy-Adaptable Mechanism (CEPAM), which achieves communication efficiency and privacy protection simultaneously. We theoretically analyze CEPAM's privacy guarantee and investigate the trade-off between user privacy and accuracy.
arXiv Detail & Related papers (2025-01-21T11:16:05Z)
- Privacy Side Channels in Machine Learning Systems [87.53240071195168]
We introduce privacy side channels: attacks that exploit system-level components to extract private information.
For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees.
We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set.
arXiv Detail & Related papers (2023-09-11T16:49:05Z)
- Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training [15.928338716118697]
We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model.
Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets.
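Dp-CLIP's exact training recipe is in the paper; as a generic reference point, a DP-SGD-style step (per-example clipping plus Gaussian noise, with all names and defaults illustrative) looks like this:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0,
                rng=np.random.default_rng()):
    """Clip each example's gradient to bound its sensitivity, then add
    Gaussian noise calibrated to the clipping bound before averaging."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noisy = total + rng.normal(0.0, noise_multiplier * clip_norm, total.shape)
    return noisy / len(per_example_grads)
```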
arXiv Detail & Related papers (2023-06-13T23:32:09Z)
- When approximate design for fast homomorphic computation provides differential privacy guarantees [0.08399688944263842]
Differential privacy (DP) and cryptographic primitives are popular countermeasures against privacy attacks.
In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator.
While SHIELD could have other applications, we focus here on one setting and seamlessly integrate it into the SPEED collaborative training framework.
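SHIELD's construction is specific to the paper; for intuition about DP-flavored probabilistic argmax operators, a classical relative is report-noisy-max, sketched here (this is not SHIELD itself):

```python
import numpy as np

def report_noisy_max(scores, epsilon, sensitivity=1.0,
                     rng=np.random.default_rng()):
    """Add Laplace noise to each score and release only the argmax index.
    With scale 2 * sensitivity / epsilon this satisfies epsilon-DP under
    the standard report-noisy-max analysis."""
    noise = rng.laplace(0.0, 2.0 * sensitivity / epsilon, len(scores))
    return int(np.argmax(np.asarray(scores) + noise))
```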
arXiv Detail & Related papers (2023-04-06T09:38:01Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy ($f$-DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
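For reference, the trade-off-function formalism behind $f$-DP:

```latex
% Trade-off function of distributions P and Q: the minimal type-II error
% of any test \phi whose type-I error is at most \alpha.
T(P, Q)(\alpha) \;=\; \inf_{\phi} \bigl\{\, 1 - \mathbb{E}_{Q}[\phi] \;:\; \mathbb{E}_{P}[\phi] \le \alpha \,\bigr\}
% A mechanism M is f-DP if, for all neighboring datasets D and D',
T\bigl(M(D),\, M(D')\bigr) \;\ge\; f.
```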
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition [51.232523987916636]
Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on private data.
In this work, we extend PATE learning to work with dynamic patterns, namely speech, and perform a first experimental study on ASR to avoid acoustic data leakage.
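Setting the ASR pipeline aside, the core PATE aggregation step (Laplace-noised teacher vote counts, releasing only the winning label) reduces to the following sketch; the noise parameter name is illustrative.

```python
import numpy as np

def pate_noisy_vote(teacher_labels, num_classes, gamma,
                    rng=np.random.default_rng()):
    """Count teacher votes per class, add Laplace(1/gamma) noise to each
    count, and release only the noisy winner; individual teacher votes
    (and hence their private training data) stay hidden."""
    counts = np.bincount(teacher_labels, minlength=num_classes)
    noisy_counts = counts + rng.laplace(0.0, 1.0 / gamma, num_classes)
    return int(np.argmax(noisy_counts))
```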
arXiv Detail & Related papers (2022-10-11T16:55:54Z)
- Protecting Data from all Parties: Combining FHE and DP in Federated Learning [0.09176056742068812]
We propose a secure framework addressing an extended threat model with respect to privacy of the training data.
The proposed framework protects the privacy of the training data from all participants, namely the training data owners and an aggregating server.
By means of a novel quantization operator, we prove differential privacy guarantees in a context where the noise is quantized and bounded due to the use of homomorphic encryption.
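The paper's quantization operator is its own construction; the generic ingredient, unbiased stochastic rounding onto the kind of fixed-point grid HE arithmetic can represent, can be sketched as follows (grid spacing illustrative):

```python
import numpy as np

def stochastic_quantize(x, step, rng=np.random.default_rng()):
    """Round each value down or up to the grid of spacing `step`, with
    probabilities chosen so that E[q(x)] = x (unbiased quantization)."""
    scaled = np.asarray(x, dtype=float) / step
    low = np.floor(scaled)
    round_up = rng.random(scaled.shape) < (scaled - low)
    return (low + round_up) * step
```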
arXiv Detail & Related papers (2022-05-09T14:33:44Z)
- User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training useful models on that data.
From an information-theoretic viewpoint, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
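UDP's calibration and convergence analysis are the paper's contribution; the basic operation, clipping and perturbing the shared model update before upload, reduces to a sketch like this (bound and scale illustrative):

```python
import numpy as np

def perturb_update(model_update, clip_norm, sigma,
                   rng=np.random.default_rng()):
    """Clip the local update to bound user-level sensitivity, then add
    Gaussian noise before sending it to the server."""
    update = np.asarray(model_update, dtype=float)
    scale = min(1.0, clip_norm / max(np.linalg.norm(update), 1e-12))
    return update * scale + rng.normal(0.0, sigma * clip_norm, update.shape)
```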
arXiv Detail & Related papers (2020-02-29T10:13:39Z)