A Vertical Federated Learning Framework for Horizontally Partitioned
Labels
- URL: http://arxiv.org/abs/2106.10056v1
- Date: Fri, 18 Jun 2021 11:10:11 GMT
- Authors: Wensheng Xia, Ying Li, Lan Zhang, Zhonghai Wu, Xiaoyong Yuan
- Abstract summary: Most existing vertical federated learning methods strongly
assume that at least one party holds the complete set of labels for all data
samples.
When this assumption fails, existing vertical federated learning methods can
only utilize partial labels, which may lead to inadequate model updates in
end-to-end backpropagation.
We propose a novel vertical federated learning framework named Cascade
Vertical Federated Learning (CVFL) to fully utilize all horizontally
partitioned labels to train neural networks with privacy preservation.
- Score: 12.433809611989155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vertical federated learning is a collaborative machine learning
framework for training deep learning models on vertically partitioned data
with privacy preservation. It has attracted much attention from both academia
and industry. Unfortunately, applying most existing vertical federated
learning methods in real-world applications still faces two daunting
challenges. First, most existing vertical federated learning methods strongly
assume that at least one party holds the complete set of labels of all data
samples, while this assumption is not satisfied in many practical scenarios,
where labels are horizontally partitioned and each party only holds partial
labels. Existing vertical federated learning methods can only utilize these
partial labels, which may lead to inadequate model updates in end-to-end
backpropagation. Second, computational and communication resources vary
across parties. Parties with limited computational and communication
resources become stragglers and slow down the convergence of training. This
straggler problem is exacerbated when labels are horizontally partitioned in
vertical federated learning. To address these challenges, we propose a novel
vertical federated learning framework named Cascade Vertical Federated
Learning (CVFL) to fully utilize all horizontally partitioned labels to train
neural networks with privacy preservation. To mitigate the straggler problem,
we design a novel optimization objective that increases the stragglers'
contribution to the trained models. We conduct a series of experiments to
rigorously verify the effectiveness of CVFL. CVFL achieves performance (e.g.,
accuracy for classification tasks) comparable to centralized training, and
the new optimization objective further mitigates the straggler problem
compared with using only the asynchronous aggregation mechanism during
training.
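To make the setting concrete, the following toy sketch (an illustration of
the vertical setup with horizontally partitioned labels, not the authors'
CVFL algorithm) splits a linear model's features across two hypothetical
parties, A and B, each of which knows the labels for only half of the
samples; the plain-text exchange of partial scores and residuals is an
assumption for illustration, since a real system would protect these values.

```python
# Toy sketch: vertical feature split + horizontally partitioned labels.
# Party A holds features 0-1 and labels for samples 0..N/2-1;
# party B holds features 2-3 and labels for the remaining samples.
# NOTE: partial scores and residuals are exchanged in the clear here,
# purely for illustration; this is NOT a privacy-preserving protocol.
import random

random.seed(0)

N, D = 40, 4  # samples, total features (2 per party)
X = [[random.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(N)]
true_w = [0.5, -1.0, 2.0, 0.3]
y = [sum(w * x for w, x in zip(true_w, row)) for row in X]

wA, wB = [0.0, 0.0], [0.0, 0.0]  # each party's local weights

def mse():
    full = wA + wB
    preds = [sum(w * x for w, x in zip(full, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / N

lr = 0.1
loss_before = mse()
for _ in range(300):
    gA, gB = [0.0, 0.0], [0.0, 0.0]
    for i, row in enumerate(X):
        xA, xB = row[:2], row[2:]
        # Each party computes a partial score; their sum is the prediction.
        pred = (sum(w * x for w, x in zip(wA, xA))
                + sum(w * x for w, x in zip(wB, xB)))
        # Only the label owner of sample i (A if i < N//2, else B) can form
        # the residual; it is then shared so both parties can update.
        resid = pred - y[i]
        for j in range(2):
            gA[j] += 2.0 * resid * xA[j] / N
            gB[j] += 2.0 * resid * xB[j] / N
    wA = [w - lr * g for w, g in zip(wA, gA)]
    wB = [w - lr * g for w, g in zip(wB, gB)]
loss_after = mse()
print(loss_before, loss_after)
```

Because every sample's label is held by one of the two parties, all labels
contribute to the update, which is the property the horizontally partitioned
label setting is meant to preserve.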
Related papers
- Vertical Federated Learning with Missing Features During Training and Inference [37.44022318612869]
We propose a vertical federated learning method for efficient training and inference of neural network-based models.
Our approach is simple yet effective, relying on strategic parameter sharing and task-sampling during training and inference.
Numerical experiments show improved performance of LASER-VFL over the baselines.
arXiv Detail & Related papers (2024-10-29T22:09:31Z) - Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation [10.404379188947383]
Two-party split learning has been shown to be vulnerable to label inference attacks.
We propose a novel two-party split learning method to defend against existing label inference attacks.
arXiv Detail & Related papers (2024-10-11T09:25:21Z) - Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC)
arXiv Detail & Related papers (2024-04-24T02:22:50Z) - Communication-Efficient Hybrid Federated Learning for E-health with Horizontal and Vertical Data Partitioning [67.49221252724229]
E-health allows smart devices and medical institutions to collaboratively collect patients' data, on which Artificial Intelligence (AI) technologies are trained to help doctors make diagnoses.
Applying federated learning in e-health faces many challenges.
Medical data is both horizontally and vertically partitioned.
A naive combination of HFL and VFL has limitations including low training efficiency, unsound convergence analysis, and lack of parameter tuning strategies.
arXiv Detail & Related papers (2024-04-15T19:45:07Z) - FedAnchor: Enhancing Federated Semi-Supervised Learning with Label
Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices.
We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server.
Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z) - FedEmb: A Vertical and Hybrid Federated Learning Algorithm using Network
And Feature Embedding Aggregation [24.78757412559944]
Federated learning (FL) is an emerging paradigm for decentralized training of machine learning models on distributed clients.
In this paper, we propose a generalized algorithm FedEmb, for modelling vertical and hybrid-based learning.
The experimental results show that FedEmb is an effective method to tackle both split feature & subject space decentralized problems.
arXiv Detail & Related papers (2023-11-30T16:01:51Z) - Vertical Federated Learning over Cloud-RAN: Convergence Analysis and
System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z) - FedV: Privacy-Preserving Federated Learning over Vertically Partitioned
Data [12.815996963583641]
Federated learning (FL) has been proposed to allow collaborative training of machine learning (ML) models among multiple parties.
We propose FedV, a framework for secure gradient computation in vertical settings for several widely used ML models.
We show a reduction of 10%-70% of training time and 80% to 90% in data transfer with respect to the state-of-the-art approaches.
arXiv Detail & Related papers (2021-03-05T19:59:29Z) - Secure Bilevel Asynchronous Vertical Federated Learning with Backward
Updating [159.48259714642447]
Vertical federated learning (VFL) attracts increasing attention due to the demands of multi-party collaborative modeling and concerns of privacy leakage.
We propose a novel bilevel asynchronous parallel architecture (VF$B^2$), under which three new algorithms are proposed.
arXiv Detail & Related papers (2021-03-01T12:34:53Z) - Privacy-Preserving Asynchronous Federated Learning Algorithms for
Multi-Party Vertically Collaborative Learning [151.47900584193025]
We propose an asynchronous federated SGD (AFSGD-VP) algorithm and its SVRG and SAGA variants on the vertically partitioned data.
To the best of our knowledge, AFSGD-VP and its SVRG and SAGA variants are the first asynchronous federated learning algorithms for vertically partitioned data.
arXiv Detail & Related papers (2020-08-14T08:08:15Z) - Federated Semi-Supervised Learning with Inter-Client Consistency &
Disjoint Learning [78.88007892742438]
We study two essential scenarios of Federated Semi-Supervised Learning (FSSL) based on the location of the labeled data.
We propose a novel method to tackle the problems, which we refer to as Federated Matching (FedMatch)
arXiv Detail & Related papers (2020-06-22T09:43:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.