Federated Graph Representation Learning using Self-Supervision
- URL: http://arxiv.org/abs/2210.15120v1
- Date: Thu, 27 Oct 2022 02:13:42 GMT
- Title: Federated Graph Representation Learning using Self-Supervision
- Authors: Susheel Suresh, Danny Godbout, Arko Mukherjee, Mayank Shrivastava,
Jennifer Neville, Pan Li
- Abstract summary: Federated graph representation learning (FedGRL) brings the benefits of distributed training to graph structured data while simultaneously addressing some privacy and compliance concerns related to data curation.
We consider a realistic and novel problem setting, wherein cross-silo clients have access to vast amounts of unlabeled data with limited or no labeled data and additionally have diverse downstream class label domains.
We propose a novel FedGRL formulation based on model interpolation where we aim to learn a shared global model that is optimized collaboratively using a self-supervised objective and gets downstream task supervision through local client models.
- Score: 18.015793175772835
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Federated graph representation learning (FedGRL) brings the benefits of
distributed training to graph structured data while simultaneously addressing
some privacy and compliance concerns related to data curation. However, several
interesting real-world graph data characteristics viz. label deficiency and
downstream task heterogeneity are not taken into consideration in current
FedGRL setups. In this paper, we consider a realistic and novel problem
setting, wherein cross-silo clients have access to vast amounts of unlabeled
data with limited or no labeled data and additionally have diverse downstream
class label domains. We then propose a novel FedGRL formulation based on model
interpolation where we aim to learn a shared global model that is optimized
collaboratively using a self-supervised objective and gets downstream task
supervision through local client models. We provide a specific instantiation of
our general formulation using BGRL, a SoTA self-supervised graph representation
learning method, and we empirically verify its effectiveness through realistic
cross-silo datasets: (1) we adapt the Twitch Gamer Network, which naturally
simulates a cross-geo scenario, and show that our formulation provides
consistent gains averaging 6.1% over traditional supervised federated learning
objectives and averaging 1.7% over individual client-specific
self-supervised training, and (2) we construct and introduce a new cross-silo
dataset called Amazon Co-purchase Networks that has both characteristics
of the motivated problem setting, on which we observe gains averaging 11.5%
over traditional supervised federated learning and averaging 1.9% over
individually trained self-supervised models. Both experimental results point to
the effectiveness of our proposed formulation. Finally, both our novel problem
setting and dataset contributions provide new avenues for the research in
FedGRL.
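The abstract describes a shared global model trained collaboratively and combined with local client models through model interpolation, but it does not give the exact form. The following is only a minimal sketch under two assumptions that are not taken from the paper: the global parameters are aggregated by a size-weighted average (FedAvg style), and interpolation is a convex combination of the global and local model outputs with a mixing weight `lam`. The names `fed_avg`, `interpolate`, and `lam` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def fed_avg(client_params: Sequence[Vector], client_sizes: Sequence[int]) -> Vector:
    """Size-weighted average of the clients' copies of the shared parameters.

    Assumption: FedAvg-style aggregation; the paper's aggregation rule may differ.
    """
    total = float(sum(client_sizes))
    dim = len(client_params[0])
    return [
        sum(n * params[i] for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def interpolate(global_out: Vector, local_out: Vector, lam: float) -> Vector:
    """Convex combination of global and local model outputs.

    Assumption: lam = 1 uses only the local model, lam = 0 only the global one.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * l + (1.0 - lam) * g for g, l in zip(global_out, local_out)]


if __name__ == "__main__":
    # Two clients holding 1 and 3 samples send back their updated shared parameters.
    new_global = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
    print(new_global)
    # A client mixes the global output with its own local output.
    print(interpolate([0.0, 0.0], [2.0, 4.0], lam=0.25))
```

In this sketch only the shared parameters pass through `fed_avg`; the local model and the mixing weight stay on the client, which is one way to let clients with different label domains keep their own task supervision.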
Related papers
- Personalized federated learning based on feature fusion [2.943623084019036]
Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy.
We propose a personalized federated learning approach called pFedPM.
In our process, we replace traditional gradient uploading with feature uploading, which helps reduce communication costs and allows for heterogeneous client models.
arXiv Detail & Related papers (2024-06-24T12:16:51Z)
- Decoupled Federated Learning on Long-Tailed and Non-IID data with Feature Statistics [20.781607752797445]
We propose a two-stage Decoupled Federated learning framework using Feature Statistics (DFL-FS).
In the first stage, the server estimates the client's class coverage distributions through masked local feature statistics clustering.
In the second stage, DFL-FS employs federated feature regeneration based on global feature statistics to enhance the model's adaptability to long-tailed data distributions.
arXiv Detail & Related papers (2024-03-13T09:24:59Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Analyzing the Effect of Sampling in GNNs on Individual Fairness [79.28449844690566]
Graph neural network (GNN) based methods have saturated the field of recommender systems.
We extend an existing method for promoting individual fairness on graphs to support mini-batch, or sub-sample based, training of a GNN.
We show that mini-batch training facilitates individual fairness promotion by allowing local nuance to guide the process of fairness promotion in representation learning.
arXiv Detail & Related papers (2022-09-08T16:20:25Z)
- FedEgo: Privacy-preserving Personalized Federated Graph Learning with Ego-graphs [22.649780281947837]
In some practical scenarios, graph data are stored separately in multiple distributed parties, which may not be directly shared due to conflicts of interest.
We propose FedEgo, a federated graph learning framework based on ego-graphs to tackle the challenges above.
arXiv Detail & Related papers (2022-08-29T15:47:36Z)
- Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.
We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks.
We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local-updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- CatFedAvg: Optimising Communication-efficiency and Classification Accuracy in Federated Learning [2.2172881631608456]
We introduce a new family of Federated Learning algorithms called CatFedAvg.
It improves not only the communication efficiency but also the quality of learning, using a category coverage maximization strategy.
Our experiments show an increase of 10% absolute points in accuracy on the MNIST dataset, with 70% absolute points lower network transfer than FedAvg.
arXiv Detail & Related papers (2020-11-14T06:52:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.