Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
- URL: http://arxiv.org/abs/2410.08299v1
- Date: Thu, 10 Oct 2024 18:38:38 GMT
- Title: Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
- Authors: Haoteng Yin, Rongzhe Wei, Eli Chien, Pan Li
- Abstract summary: Relational data in sensitive domains such as finance and healthcare often contain private information.
Existing privacy-preserving methods, such as DP-SGD, are not well-suited for relational learning.
We propose a privacy-preserving relational learning pipeline that decouples dependencies in sampled relations during training.
- Score: 16.972086279204174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graphs offer unique insights into relationships and interactions between entities, complementing data modalities like text, images, and videos. By incorporating relational information from graph data, AI models can extend their capabilities beyond traditional tasks. However, relational data in sensitive domains such as finance and healthcare often contain private information, making privacy preservation crucial. Existing privacy-preserving methods, such as DP-SGD, which rely on gradient decoupling assumptions, are not well-suited for relational learning due to the inherent dependencies between coupled training samples. To address this challenge, we propose a privacy-preserving relational learning pipeline that decouples dependencies in sampled relations during training, ensuring differential privacy through a tailored application of DP-SGD. We apply this method to fine-tune large language models (LLMs) on sensitive graph data, and tackle the associated computational complexities. Our approach is evaluated on LLMs of varying sizes (e.g., BERT, Llama2) using real-world relational data from four text-attributed graphs. The results demonstrate significant improvements in relational learning tasks, all while maintaining robust privacy guarantees during training. Additionally, we explore the trade-offs between privacy, utility, and computational efficiency, offering insights into the practical deployment of our approach. Code is available at https://github.com/Graph-COM/PvGaLM.
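The abstract describes applying DP-SGD after decoupling the dependencies among sampled relations, so that each relation contributes a bounded per-sample gradient. The paper's decoupling procedure itself is not detailed in this abstract; below is only a minimal, generic sketch of the standard DP-SGD step (per-example clipping, Gaussian noising, averaging) that such a pipeline would build on. The function name `dp_sgd_step` and the toy gradients are illustrative assumptions, not from the paper.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD update direction: clip each per-example gradient to
    L2 norm <= clip_norm, sum the clipped gradients, add Gaussian noise
    with std noise_multiplier * clip_norm, then average over the batch."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # clip, never amplify
        clipped.append(g * scale)
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

# Toy usage: three per-example gradients in R^2 (hypothetical values).
rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.1, 0.2]), np.array([-1.0, 1.0])]
update = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print(update.shape)
```

The key point the abstract makes is that this recipe assumes the per-example gradients are independent, which coupled relational samples violate; the proposed pipeline restores that assumption by decoupling the sampled relations before clipping.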
Related papers
- Differentially Private Active Learning: Balancing Effective Data Selection and Privacy [11.716423801223776]
We introduce differentially private active learning (DP-AL) for standard learning settings.
We demonstrate that naively integrating DP-SGD training into AL presents substantial challenges in privacy budget allocation and data utilization.
Our experiments on vision and natural language processing tasks show that DP-AL can improve performance for specific datasets and model architectures.
arXiv Detail & Related papers (2024-10-01T09:34:06Z)
- Approximate Gradient Coding for Privacy-Flexible Federated Learning with Non-IID Data
This work focuses on the challenges of non-IID data and stragglers/dropouts in federated learning.
We introduce and explore a privacy-flexible paradigm that models parts of the clients' local data as non-private.
arXiv Detail & Related papers (2024-04-04T15:29:50Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Free Lunch for Privacy Preserving Distributed Graph Learning [1.8292714902548342]
We present a novel privacy-respecting framework for distributed graph learning and graph-based machine learning.
This framework aims to learn features as well as distances without requiring actual features while preserving the original structural properties of the raw data.
arXiv Detail & Related papers (2023-05-18T10:41:21Z)
- Privatized Graph Federated Learning [57.14673504239551]
We introduce graph federated learning, which consists of multiple units connected by a graph.
We show how graph homomorphic perturbations can be used to ensure the algorithm is differentially private.
arXiv Detail & Related papers (2022-03-14T13:48:23Z)
- Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM, called RGM, that seeks the node correspondence between a pair of graphs.
Our method differs from previous deep graph matching models, which focus on front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z)
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, which are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)