Differentially Private Relational Learning with Entity-level Privacy Guarantees
- URL: http://arxiv.org/abs/2506.08347v2
- Date: Thu, 12 Jun 2025 19:17:36 GMT
- Title: Differentially Private Relational Learning with Entity-level Privacy Guarantees
- Authors: Yinan Huang, Haoteng Yin, Eli Chien, Rongzhe Wei, Pan Li
- Abstract summary: This work presents a principled framework for relational learning with formal entity-level DP guarantees. We introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees.
- Score: 17.567309430451616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend the privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate the strong utility-privacy trade-offs of our approach. Our code is available at https://github.com/Graph-COM/Node_DP.
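To make the adaptive clipping idea concrete, here is a minimal NumPy sketch of a DP-SGD step in which each example's clipping threshold shrinks with the occurrence frequency of the entities it touches, so that a frequently occurring entity's total contribution stays bounded. The threshold rule, function names, and noise scale are illustrative assumptions, not the paper's implementation (for that, see https://github.com/Graph-COM/Node_DP):

```python
import numpy as np

def entity_threshold(entities, entity_freq, clip_base=1.0):
    # Hypothetical rule: an example touching an entity that occurs in f
    # relations gets threshold clip_base / f, so that entity's summed
    # contribution across all of its relations stays <= clip_base.
    max_freq = max(entity_freq[e] for e in entities)
    return clip_base / max_freq

def dp_sgd_step(params, per_example_grads, example_entities, entity_freq,
                clip_base=1.0, noise_multiplier=1.0, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    clipped = []
    for g, ents in zip(per_example_grads, example_entities):
        c = entity_threshold(ents, entity_freq, clip_base)
        clipped.append(g * min(1.0, c / (np.linalg.norm(g) + 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the base clip; the paper's sensitivity
    # analysis is what justifies the actual noise scale under its sampling.
    noise = rng.normal(0.0, noise_multiplier * clip_base, size=total.shape)
    return params - lr * (total + noise) / len(per_example_grads)

# Toy usage: entities 0 and 1 each appear in two relations.
grads = [np.ones(4), -np.ones(4), 0.5 * np.ones(4)]
ents = [(0, 1), (0,), (1,)]
params = dp_sgd_step(np.zeros(4), grads, ents, entity_freq={0: 2, 1: 2})
```

Calibrating noise to the base clip only makes sense if per-entity contributions are provably bounded by it, which is what the paper's sensitivity analysis is for.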
Related papers
- DP-DocLDM: Differentially Private Document Image Generation using Latent Diffusion Models [5.247930659596986]
We aim to address the challenges within the context of document image classification by substituting real private data with a synthetic counterpart. In particular, we propose to use conditional latent diffusion models (LDMs) in combination with differential privacy (DP) to generate class-specific synthetic document images. We show that our approach achieves substantial performance improvements in downstream evaluations on small-scale datasets.
arXiv Detail & Related papers (2025-08-06T08:43:08Z)
- Machine Learning with Privacy for Protected Attributes [56.44253915927481]
We refine the definition of differential privacy (DP) to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary separation of protected and non-protected features. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available.
arXiv Detail & Related papers (2025-06-24T17:53:28Z)
- Privacy-preserving Prompt Personalization in Federated Learning for Multimodal Large Language Models [12.406403248205285]
Federated prompt personalization (FPP) is developed to address data heterogeneity and local overfitting. We propose SecFPP, a secure FPP protocol harmonizing personalization and privacy guarantees. We show SecFPP significantly outperforms both non-private and privacy-preserving baselines.
arXiv Detail & Related papers (2025-05-28T15:09:56Z)
- Privately Learning from Graphs with Applications in Fine-tuning Large Language Models [16.972086279204174]
Relational data in sensitive domains such as finance and healthcare often contain private information.
Existing privacy-preserving methods, such as DP-SGD, are not well-suited for relational learning.
We propose a privacy-preserving relational learning pipeline that decouples dependencies in sampled relations during training.
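One way to picture such decoupling (a hedged sketch, not necessarily this paper's procedure) is to cap how many sampled relations any single entity joins in a batch:

```python
import random

def sample_with_entity_cap(relations, batch_size, max_per_entity=1, seed=0):
    """Greedily pick relations so no entity appears in more than
    max_per_entity relations of the batch, bounding each entity's
    contribution to a training step. (Illustrative only.)"""
    rnd = random.Random(seed)
    pool = list(relations)
    rnd.shuffle(pool)
    counts, batch = {}, []
    for rel in pool:
        if all(counts.get(e, 0) < max_per_entity for e in rel):
            batch.append(rel)
            for e in rel:
                counts[e] = counts.get(e, 0) + 1
        if len(batch) == batch_size:
            break
    return batch

print(sample_with_entity_cap([(1, 2), (2, 3), (3, 4), (1, 4)], batch_size=2))
```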
arXiv Detail & Related papers (2024-10-10T18:38:38Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
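A minimal sketch of the standard user-level recipe, assuming per-example gradients are available as arrays: each user's examples are merged into one gradient that is clipped once per user, making the user the privacy unit. Names and defaults are illustrative:

```python
import numpy as np
from collections import defaultdict

def user_level_dp_step(per_example_grads, user_ids, clip=1.0,
                       noise_multiplier=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Merge each user's example gradients into a single per-user gradient,
    # so clipping (and hence the privacy unit) is per user.
    by_user = defaultdict(list)
    for g, u in zip(per_example_grads, user_ids):
        by_user[u].append(g)
    user_grads = [np.mean(gs, axis=0) for gs in by_user.values()]
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
               for g in user_grads]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip, size=total.shape)
    return (total + noise) / len(user_grads)
```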
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Differentially Private Deep Model-Based Reinforcement Learning [47.651861502104715]
We introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL learns an ensemble of trajectory-level DP models of the environment from offline data.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
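Since the bound is governed by the expected squared gradient norm, that quantity can be monitored during training; a toy sketch, assuming gradients are available as NumPy arrays:

```python
import numpy as np

def squared_grad_norm_trace(grads_per_step):
    # Running average of ||g_t||^2 over steps; by the paper's analysis a
    # smaller value suggests a tighter KL privacy bound.
    sq = [float(np.dot(g.ravel(), g.ravel())) for g in grads_per_step]
    return np.cumsum(sq) / np.arange(1, len(sq) + 1)
```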
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
- On Privacy and Personalization in Cross-Silo Federated Learning [39.031422430404405]
In this work, we consider the application of differential privacy (DP) in cross-silo federated learning (FL).
We show that mean-regularized multi-task learning (MR-MTL) is a strong baseline for cross-silo FL.
We provide a thorough empirical study of competing methods as well as a theoretical characterization of MR-MTL for a mean estimation problem.
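MR-MTL has a compact form: each silo k minimizes its local loss plus a proximal penalty (lam / 2) * ||w_k - w_bar||^2 pulling it toward the mean model. A minimal one-round sketch (step size and lam are illustrative, and DP noise is omitted):

```python
import numpy as np

def mr_mtl_round(silo_params, silo_grads, lam=0.5, lr=0.1):
    # Each silo k descends its local gradient plus the gradient of the
    # proximal term (lam / 2) * ||w_k - w_bar||^2, i.e. lam * (w_k - w_bar).
    w_bar = np.mean(silo_params, axis=0)
    return [w - lr * (g + lam * (w - w_bar))
            for w, g in zip(silo_params, silo_grads)]
```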
arXiv Detail & Related papers (2022-06-16T03:26:48Z)
- Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy [67.4471689755097]
This paper empirically demonstrates that clipped FedAvg can perform surprisingly well even with substantial data heterogeneity.
We provide a convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients' updates.
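A hedged sketch of client-level DP aggregation in the clipped-FedAvg style (the exact algorithm in the paper may differ): clip each client's update to bound its influence, average, and add Gaussian noise calibrated to the clip norm.

```python
import numpy as np

def dp_fedavg_aggregate(client_updates, clip=1.0, noise_multiplier=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Clip each client's whole model update so one client's influence on the
    # aggregate is bounded (the client-level privacy unit), then add noise.
    clipped = [u * min(1.0, clip / (np.linalg.norm(u) + 1e-12))
               for u in client_updates]
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip / len(client_updates),
                       size=avg.shape)
    return avg + noise
```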
arXiv Detail & Related papers (2021-06-25T14:47:19Z)
- Secure Sum Outperforms Homomorphic Encryption in (Current) Collaborative Deep Learning [7.690774882108066]
We discuss methods for training neural networks on the joint data of different data owners that keep each party's input confidential.
We show that a less complex and computationally less expensive secure sum protocol exhibits superior properties in terms of both collusion-resistance and runtime.
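For intuition, here is a classic additive secret-sharing secure sum, sketched in Python (not necessarily the exact protocol variant the paper benchmarks): each party splits its value into random shares, so only the grand total is ever reconstructed.

```python
import random

def secure_sum(private_values, modulus=2**61 - 1, seed=0):
    rnd = random.Random(seed)
    n = len(private_values)
    # Each party splits its value into n random shares that sum to the value
    # mod p; parties only ever see shares, never another party's raw value.
    shares = [[rnd.randrange(modulus) for _ in range(n - 1)]
              for _ in private_values]
    for parts, v in zip(shares, private_values):
        parts.append((v - sum(parts)) % modulus)
    # Party j sums the j-th share from everyone; combining the partial sums
    # reveals exactly the total and nothing else.
    partials = [sum(shares[i][j] for i in range(n)) % modulus
                for j in range(n)]
    return sum(partials) % modulus

assert secure_sum([3, 5, 7]) == 15
```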
arXiv Detail & Related papers (2020-06-02T23:03:32Z)