A Coupled Design of Exploiting Record Similarity for Practical Vertical
Federated Learning
- URL: http://arxiv.org/abs/2106.06312v4
- Date: Thu, 23 Mar 2023 16:19:38 GMT
- Authors: Zhaomin Wu, Qinbin Li, Bingsheng He
- Abstract summary: Federated learning is a learning paradigm to enable collaborative learning across different parties without revealing raw data.
Most existing studies in vertical federated learning disregard the "record linkage" process.
We design a novel coupled training paradigm, FedSim, that integrates one-to-many linkage into the training process.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning is a learning paradigm to enable collaborative learning
across different parties without revealing raw data. Notably, vertical
federated learning (VFL), where parties share the same set of samples but only
hold partial features, has a wide range of real-world applications. However,
most existing studies in VFL disregard the "record linkage" process. They
design algorithms either assuming the data from different parties can be
exactly linked or simply linking each record with its most similar neighboring
record. These approaches may fail to capture the key features from other less
similar records. Moreover, such improper linkage cannot be corrected by
training since existing approaches provide no feedback on linkage during
training. In this paper, we design a novel coupled training paradigm, FedSim,
that integrates one-to-many linkage into the training process. Besides enabling
VFL in many real-world applications with fuzzy identifiers, FedSim also
achieves better performance in traditional VFL tasks. Moreover, we
theoretically analyze the additional privacy risk incurred by sharing
similarities. Our experiments on eight datasets with various similarity metrics
show that FedSim outperforms other state-of-the-art baselines. The code of
FedSim is available at https://github.com/Xtra-Computing/FedSim.
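The one-to-many linkage idea from the abstract can be illustrated with a minimal sketch. This is not the paper's actual FedSim implementation; the function names, the toy character-overlap similarity, and the choice of `k` are all assumptions for illustration. The key point it shows is that each record in one party is linked to its top-k most similar records in the other party, keeping the similarity scores so a downstream model can weight the linked features rather than committing to a single nearest neighbour.

```python
# Illustrative sketch (assumed names, not FedSim's real API): one-to-many
# record linkage on fuzzy string identifiers, retaining similarity scores.

def similarity(a: str, b: str) -> float:
    """Toy Jaccard similarity over character sets of two fuzzy identifiers."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def link_one_to_many(ids_a, ids_b, k=2):
    """For each identifier of party A, return the indices of the top-k most
    similar identifiers of party B together with their similarity scores,
    instead of linking only the single most similar record."""
    links = []
    for ia in ids_a:
        scored = sorted(((similarity(ia, ib), j) for j, ib in enumerate(ids_b)),
                        reverse=True)[:k]
        links.append([(j, s) for s, j in scored])
    return links

# Each A-record keeps two weighted candidate links into B.
links = link_one_to_many(["alice99", "bob_2"], ["alice_99", "bobby2", "carol"])
```

A training procedure in the spirit of FedSim would then consume these (index, similarity) pairs as weighted inputs, so that an improper link can be down-weighted by gradient feedback during training.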
Related papers
- Stalactite: Toolbox for Fast Prototyping of Vertical Federated Learning Systems [37.11550251825938]
We present Stalactite, an open-source framework for Vertical Federated Learning (VFL) systems.
VFL is a type of FL where data samples are divided by features across several data owners.
We demonstrate its use on real-world recommendation datasets.
arXiv Detail & Related papers (2024-09-23T21:29:03Z)
- A Universal Metric of Dataset Similarity for Cross-silo Federated Learning [0.0]
Federated learning is increasingly used in domains such as healthcare to facilitate model training without data-sharing.
In this paper, we propose a novel metric for assessing dataset similarity.
We show that our metric shows a robust and interpretable relationship with model performance and can be calculated in a privacy-preserving manner.
arXiv Detail & Related papers (2024-04-29T15:08:24Z)
- Effective and Efficient Federated Tree Learning on Hybrid Data [80.31870543351918]
We propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data.
We observe the existence of consistent split rules in trees and show that the knowledge of parties can be incorporated into the lower layers of a tree.
Our experiments demonstrate that HybridTree can achieve comparable accuracy to the centralized setting with low computational and communication overhead.
arXiv Detail & Related papers (2023-10-18T10:28:29Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
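The split-learning exchange described above can be sketched minimally. This is an assumed toy setup, not the cited paper's method: a model is cut into a client half and a server half, and only the cut-layer activations ("smashed data"), never the raw inputs, leave the client.

```python
# Minimal split-learning sketch (assumed architecture and dimensions):
# the client releases only cut-layer activations to the server half.
import numpy as np

rng = np.random.default_rng(0)

class ClientHalf:
    def __init__(self, in_dim, cut_dim):
        self.w = rng.standard_normal((in_dim, cut_dim)) * 0.1

    def forward(self, x):
        # Cut-layer activation: the only tensor released to the server.
        return np.tanh(x @ self.w)

class ServerHalf:
    def __init__(self, cut_dim, out_dim):
        self.w = rng.standard_normal((cut_dim, out_dim)) * 0.1

    def forward(self, smashed):
        return smashed @ self.w

client, server = ClientHalf(4, 3), ServerHalf(3, 2)
x = rng.standard_normal((5, 4))   # raw data stays on the client
smashed = client.forward(x)       # released "smashed data"
logits = server.forward(smashed)  # server completes the forward pass
```

During backpropagation the server would return the gradient at the cut layer, which is the round-trip the blurb refers to as waiting for the server's response.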
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings [51.09574369310246]
Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models.
We propose a novel cross-silo dataset suite focused on healthcare, FLamby, to bridge the gap between theory and practice of cross-silo FL.
Our flexible and modular suite allows researchers to easily download datasets, reproduce results and re-use the different components for their research.
arXiv Detail & Related papers (2022-10-10T12:17:30Z)
- Privatized Graph Federated Learning [57.14673504239551]
We introduce graph federated learning, which consists of multiple units connected by a graph.
We show how graph homomorphic perturbations can be used to ensure the algorithm is differentially private.
arXiv Detail & Related papers (2022-03-14T13:48:23Z)
- DVFL: A Vertical Federated Learning Method for Dynamic Data [2.406222636382325]
This paper studies vertical federated learning (VFL), which tackles the scenarios where collaborating organizations share the same set of users but disjoint features.
We propose a new vertical federated learning method, DVFL, which adapts to dynamic data distribution changes through knowledge distillation.
Our extensive experimental results show that DVFL can not only obtain results close to existing VFL methods in static scenarios, but also adapt to changes in data distribution in dynamic scenarios.
arXiv Detail & Related papers (2021-11-05T09:26:09Z)
- Practical One-Shot Federated Learning for Cross-Silo Setting [114.76232507580067]
One-shot federated learning is a promising approach to make federated learning applicable in the cross-silo setting.
We propose a practical one-shot federated learning algorithm named FedKT.
By utilizing the knowledge transfer technique, FedKT can be applied to any classification models and can flexibly achieve differential privacy guarantees.
arXiv Detail & Related papers (2020-10-02T14:09:10Z)
- FedCVT: Semi-supervised Vertical Federated Learning with Cross-view Training [9.638604434238882]
Federated Cross-view Training (FedCVT) is a semi-supervised learning approach that improves the performance of a vertical federated learning model.
FedCVT does not require parties to share their original data and model parameters, thus preserving data privacy.
arXiv Detail & Related papers (2020-08-25T06:20:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.