Embedding-Based Federated Data Sharing via Differentially Private Conditional VAEs
- URL: http://arxiv.org/abs/2507.02671v1
- Date: Thu, 03 Jul 2025 14:36:15 GMT
- Title: Embedding-Based Federated Data Sharing via Differentially Private Conditional VAEs
- Authors: Francesco Di Salvo, Hanh Huyen My Nguyen, Christian Ledig,
- Abstract summary: Federated Learning (FL) enables decentralized training but suffers from high communication costs.<n>We propose a data-sharing method via Differentially Private (DP) generative models.<n>Clients collaboratively train a Differentially Private Conditional Variational Autoencoder (DP-CVAE) to model a global, privacy-aware data distribution.
- Score: 0.13108652488669734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) has revolutionized medical imaging, yet its adoption is constrained by data scarcity and privacy regulations, limiting access to diverse datasets. Federated Learning (FL) enables decentralized training but suffers from high communication costs and is often restricted to a single downstream task, reducing flexibility. We propose a data-sharing method via Differentially Private (DP) generative models. By adopting foundation models, we extract compact, informative embeddings, reducing redundancy and lowering computational overhead. Clients collaboratively train a Differentially Private Conditional Variational Autoencoder (DP-CVAE) to model a global, privacy-aware data distribution, supporting diverse downstream tasks. Our approach, validated across multiple feature extractors, enhances privacy, scalability, and efficiency, outperforming traditional FL classifiers while ensuring differential privacy. Additionally, DP-CVAE produces higher-fidelity embeddings than DP-CGAN while requiring $5{\times}$ fewer parameters.
Related papers
- CorBin-FL: A Differentially Private Federated Learning Mechanism using Common Randomness [6.881974834597426]
Federated learning (FL) has emerged as a promising framework for distributed machine learning.
We introduce CorBin-FL, a privacy mechanism that uses correlated binary quantization to achieve differential privacy.
We also propose AugCorBin-FL, an extension that, in addition to PLDP, user-level and sample-level central differential privacy guarantees.
arXiv Detail & Related papers (2024-09-20T00:23:44Z) - Privacy-preserving datasets by capturing feature distributions with Conditional VAEs [0.11999555634662634]
Conditional Variational Autoencoders (CVAEs) trained on feature vectors extracted from large pre-trained vision foundation models.
Our method notably outperforms traditional approaches in both medical and natural image domains.
Results underscore the potential of generative models to significantly impact deep learning applications in data-scarce and privacy-sensitive environments.
arXiv Detail & Related papers (2024-08-01T15:26:24Z) - DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation [15.023077875990614]
Federated learning (FL) allows clients to collaboratively train a global model without sharing their local data with a server.<n> Differential privacy (DP) addresses such leakage by providing formal privacy guarantees, with mechanisms that add randomness to the clients' contributions.<n>We propose an adaptation method that can be combined with differential privacy and call it DP-DyLoRA.
arXiv Detail & Related papers (2024-05-10T10:10:37Z) - Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z) - ALI-DPFL: Differentially Private Federated Learning with Adaptive Local Iterations [26.310416723272184]
Federated Learning (FL) is a distributed machine learning technique that allows model training among multiple devices or organizations by sharing training parameters instead of raw data.
adversaries can still infer individual information through inference attacks on these training parameters. Differential Privacy (DP) has been widely used in FL to prevent such attacks.
We consider differentially private federated learning in a resource-constrained scenario, where both privacy budget and communication rounds are constrained.
arXiv Detail & Related papers (2023-08-21T04:09:59Z) - FedLAP-DP: Federated Learning by Sharing Differentially Private Loss Approximations [53.268801169075836]
We propose FedLAP-DP, a novel privacy-preserving approach for federated learning.
A formal privacy analysis demonstrates that FedLAP-DP incurs the same privacy costs as typical gradient-sharing schemes.
Our approach presents a faster convergence speed compared to typical gradient-sharing methods.
arXiv Detail & Related papers (2023-02-02T12:56:46Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models are struggling with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - Compression Boosts Differentially Private Federated Learning [0.7742297876120562]
Federated learning allows distributed entities to train a common model collaboratively without sharing their own data.
It remains vulnerable to various inference and reconstruction attacks where a malicious entity can learn private information about the participants' training data from the captured gradients.
We show experimentally, using 2 datasets, that our privacy-preserving proposal can reduce the communication costs by up to 95% with only a negligible performance penalty compared to traditional non-private federated learning schemes.
arXiv Detail & Related papers (2020-11-10T13:11:03Z) - GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially
Private Generators [74.16405337436213]
We propose Gradient-sanitized Wasserstein Generative Adrial Networks (GS-WGAN)
GS-WGAN allows releasing a sanitized form of sensitive data with rigorous privacy guarantees.
We find our approach consistently outperforms state-of-the-art approaches across multiple metrics.
arXiv Detail & Related papers (2020-06-15T10:01:01Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.