Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
- URL: http://arxiv.org/abs/2410.08299v2
- Date: Tue, 16 Sep 2025 18:52:07 GMT
- Title: Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
- Authors: Haoteng Yin, Rongzhe Wei, Eli Chien, Pan Li
- Abstract summary: Learning from graphs often involves handling sensitive relationships in the data. Existing privacy-preserving methods, such as DP-SGD, rely on gradient decoupling assumptions. We propose a privacy-preserving pipeline for relational learning that decouples dependencies in sampled relations for training.
- Score: 20.63522374493782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graphs offer unique insights into relationships between entities, complementing data modalities like text and images and enabling AI models to extend their capabilities beyond traditional tasks. However, learning from graphs often involves handling sensitive relationships in the data, raising significant privacy concerns. Existing privacy-preserving methods, such as DP-SGD, rely on gradient decoupling assumptions and are incompatible with relational learning due to the inherent dependencies between training samples. To address this challenge, we propose a privacy-preserving pipeline for relational learning that decouples dependencies in sampled relations for training, ensuring differential privacy through a tailored application of DP-SGD. We apply this approach to fine-tune large language models (LLMs), such as Llama2, on sensitive graph data while addressing the associated computational complexities. Our method is evaluated on four real-world text-attributed graphs, demonstrating significant improvements in relational learning tasks while maintaining robust privacy guarantees. Additionally, we analyze the trade-offs between privacy, utility, and computational efficiency, offering insights into the practical deployment of our approach for privacy-preserving relational learning. Code is available at https://github.com/Graph-COM/PvGaLM.
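For readers unfamiliar with DP-SGD, which the abstract builds on, the following is a minimal sketch of one generic DP-SGD update: each example's gradient is clipped to a fixed norm, Gaussian noise scaled to that norm is added to the sum, and the noisy average drives the parameter step. It is not the paper's pipeline and does not implement the relation-decoupling step; the function name `dp_sgd_step` and all parameter names are hypothetical, chosen only for illustration.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One generic DP-SGD update (illustrative, not the paper's method).

    per_example_grads has shape (batch, dim): one gradient row per example.
    This layout assumes each example's gradient can be computed on its own,
    which is the decoupling assumption the abstract says relational
    learning violates.
    """
    grads = np.asarray(per_example_grads, dtype=float)
    # Clip each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Add Gaussian noise calibrated to the clipping norm (the sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / grads.shape[0]
    # Plain gradient-descent step on the privatized gradient.
    return params - lr * noisy_mean


rng = np.random.default_rng(0)
params = np.zeros(2)
grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
# With noise disabled, only the first gradient is clipped (to norm 1.0).
updated = dp_sgd_step(params, grads, clip_norm=1.0, noise_multiplier=0.0, lr=1.0, rng=rng)
```

In practice the noise multiplier is chosen by a privacy accountant for a target privacy budget; that accounting is omitted here.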
Related papers
- Differentially Private Relational Learning with Entity-level Privacy Guarantees [17.567309430451616]
This work presents a principled framework for relational learning with formal entity-level DP guarantees. We introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees.
arXiv Detail & Related papers (2025-06-10T02:03:43Z) - FedMSGL: A Self-Expressive Hypergraph Based Federated Multi-View Learning [12.161006152509655]
We propose a Self-expressive Hypergraph Based Federated Multi-view Learning method (FedMSGL).
The proposed method leverages the self-expressive property in local training to learn a uniform-dimension subspace with latent sample relations.
Experiments on multi-view datasets with different feature dimensions validated the effectiveness of the proposed method.
arXiv Detail & Related papers (2025-03-12T05:13:45Z) - Differentially Private Active Learning: Balancing Effective Data Selection and Privacy [11.716423801223776]
We introduce differentially private active learning (DP-AL) for standard learning settings.
We demonstrate that naively integrating DP-SGD training into AL presents substantial challenges in privacy budget allocation and data utilization.
Our experiments on vision and natural language processing tasks show that DP-AL can improve performance for specific datasets and model architectures.
arXiv Detail & Related papers (2024-10-01T09:34:06Z) - Approximate Gradient Coding for Privacy-Flexible Federated Learning with Non-IID Data [9.984630251008868]
This work focuses on the challenges of non-IID data and stragglers/dropouts in federated learning.
We introduce and explore a privacy-flexible paradigm that models parts of the clients' local data as non-private.
arXiv Detail & Related papers (2024-04-04T15:29:50Z) - Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs [53.03085605769093]
We propose to learn Federated Neural Graph DataBase (FedNGDB), a pioneering systematic framework that empowers privacy-preserving reasoning over multi-source graph data. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities, and improving the overall quality of graph data.
arXiv Detail & Related papers (2024-02-22T14:57:44Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z) - Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z) - Free Lunch for Privacy Preserving Distributed Graph Learning [1.8292714902548342]
We present a novel privacy-respecting framework for distributed graph learning and graph-based machine learning.
This framework aims to learn features as well as distances without requiring actual features while preserving the original structural properties of the raw data.
arXiv Detail & Related papers (2023-05-18T10:41:21Z) - Privacy-Preserved Neural Graph Similarity Learning [99.78599103903777]
We propose a novel Privacy-Preserving neural Graph Matching network model, named PPGM, for graph similarity learning.
To prevent reconstruction attacks, the proposed model does not communicate node-level representations between devices.
To mitigate attacks on graph properties, only obfuscated features that contain information from both vectors are communicated.
arXiv Detail & Related papers (2022-10-21T04:38:25Z) - Privatized Graph Federated Learning [57.14673504239551]
We introduce graph federated learning, which consists of multiple units connected by a graph.
We show how graph homomorphic perturbations can be used to ensure the algorithm is differentially private.
arXiv Detail & Related papers (2022-03-14T13:48:23Z) - Federated Knowledge Graphs Embedding [50.35484170815679]
We propose a novel decentralized scalable learning framework, Federated Knowledge Graphs Embedding (FKGE).
FKGE exploits adversarial generation between pairs of knowledge graphs to translate identical entities and relations of different domains into near embedding spaces.
In order to protect the privacy of the training data, FKGE further implements a privacy-preserving neural network structure to guarantee no raw data leakage.
arXiv Detail & Related papers (2021-05-17T05:30:41Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM, called RGM, that seeks the node correspondence between pairwise graphs.
Our method differs from previous deep graph matching models, which focus on front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.