Deep Latent Variable Model based Vertical Federated Learning with Flexible Alignment and Labeling Scenarios
- URL: http://arxiv.org/abs/2505.11035v1
- Date: Fri, 16 May 2025 09:30:15 GMT
- Title: Deep Latent Variable Model based Vertical Federated Learning with Flexible Alignment and Labeling Scenarios
- Authors: Kihun Hong, Sejun Park, Ganguk Hwang
- Abstract summary: Federated learning (FL) has attracted significant attention for enabling collaborative learning without exposing private data. We propose a unified framework that accommodates both training and inference under arbitrary alignment and labeling scenarios. Our method outperforms all baselines in 160 cases with an average gap of 9.6 percentage points over the next-best competitors.
- Score: 8.971234046933349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) has attracted significant attention for enabling collaborative learning without exposing private data. Among the primary variants of FL, vertical federated learning (VFL) addresses feature-partitioned data held by multiple institutions, each holding complementary information for the same set of users. However, existing VFL methods often impose restrictive assumptions such as a small number of participating parties, fully aligned data, or only using labeled data. In this work, we reinterpret alignment gaps in VFL as missing data problems and propose a unified framework that accommodates both training and inference under arbitrary alignment and labeling scenarios, while supporting diverse missingness mechanisms. In the experiments on 168 configurations spanning four benchmark datasets, six training-time missingness patterns, and seven testing-time missingness patterns, our method outperforms all baselines in 160 cases with an average gap of 9.6 percentage points over the next-best competitors. To the best of our knowledge, this is the first VFL framework to jointly handle arbitrary data alignment, unlabeled data, and multi-party collaboration all at once.
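The abstract's idea of treating alignment gaps as missing data can be illustrated with a small sketch. This is not the authors' implementation: the helper `build_alignment_mask`, the party lists, and the user IDs are hypothetical, chosen only to show how partially overlapping parties give rise to an observation mask over feature blocks.

```python
import numpy as np

def build_alignment_mask(party_ids, all_ids):
    """Return a (num_samples, num_parties) boolean mask.

    mask[i, k] is True when party k holds features for sample all_ids[i].
    (Hypothetical helper for illustration only.)
    """
    index = {uid: row for row, uid in enumerate(all_ids)}
    mask = np.zeros((len(all_ids), len(party_ids)), dtype=bool)
    for k, ids in enumerate(party_ids):
        for uid in ids:
            mask[index[uid], k] = True
    return mask

# Hypothetical toy setup: three parties, five users, partial overlap.
party_ids = [
    ["u1", "u2", "u3"],  # party 0
    ["u2", "u3", "u4"],  # party 1
    ["u1", "u3", "u5"],  # party 2
]
all_ids = sorted(set().union(*party_ids))
mask = build_alignment_mask(party_ids, all_ids)

# Samples held by every party: the only rows usable under a
# "fully aligned data" assumption.
fully_aligned = mask.all(axis=1)
print(mask.astype(int))
print("fully aligned samples:", [u for u, a in zip(all_ids, fully_aligned) if a])
```

In this toy case only one of the five users is held by all three parties, so a method restricted to fully aligned data would discard the other four. Viewing the `False` entries of the mask as missing feature blocks keeps those rows available for training and inference.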
Related papers
- Vertical Federated Learning in Practice: The Good, the Bad, and the Ugly [42.31182713177944]
This survey analyzes the real-world data distributions in potential Vertical Federated Learning (VFL) applications. We propose a novel data-oriented taxonomy of VFL algorithms based on real VFL data distributions. Based on these observations, we outline key research directions aimed at bridging the gap between current VFL research and real-world applications.
arXiv Detail & Related papers (2025-02-12T07:03:32Z)
- FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models [48.484485609995986]
Federated learning has enabled multiple parties to collaboratively train large language models without directly sharing their data (FedLLM).
There are currently no realistic datasets and benchmarks for FedLLM.
We propose FedLLM-Bench, which involves 8 training methods, 4 training datasets, and 6 evaluation metrics.
arXiv Detail & Related papers (2024-06-07T11:19:30Z)
- Vertical Federated Learning Hybrid Local Pre-training [4.31644387824845]
We propose a novel VFL Hybrid Local Pre-training (VFLHLP) approach for Vertical Federated Learning (VFL).
VFLHLP first pre-trains local networks on the local data of participating parties.
Then it utilizes these pre-trained networks to adjust the sub-model for the labeled party or enhance representation learning for other parties during downstream federated learning on aligned data.
arXiv Detail & Related papers (2024-05-20T08:57:39Z)
- Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning [49.72857433721424]
Vision Transformers (ViT) and Visual Prompt Tuning (VPT) achieve state-of-the-art performance with improved efficiency in various computer vision tasks.
We present a novel algorithm, SGPT, that integrates Generalized FL (GFL) and Personalized FL (PFL) approaches by employing a unique combination of both shared and group-specific prompts.
arXiv Detail & Related papers (2023-10-27T17:22:09Z)
- VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks [31.08004805380727]
This paper introduces two key factors affecting VFL performance - feature importance and feature correlation.
We also introduce a real VFL dataset to address the deficit in image-image VFL scenarios.
arXiv Detail & Related papers (2023-07-05T05:55:08Z)
- Vertical Semi-Federated Learning for Efficient Online Advertising [50.18284051956359]
Semi-VFL (Vertical Semi-Federated Learning) is proposed to make VFL practical for industrial applications.
We build an inference-efficient single-party student model applicable to the whole sample space.
New representation distillation methods are designed to extract cross-party feature correlations for both the overlapped and non-overlapped data.
arXiv Detail & Related papers (2022-09-30T17:59:27Z)
- UniFed: All-In-One Federated Learning Platform to Unify Open-Source Frameworks [53.20176108643942]
We present UniFed, the first unified platform for standardizing open-source Federated Learning (FL) frameworks.
UniFed streamlines the end-to-end workflow for distributed experimentation and deployment, encompassing 11 popular open-source FL frameworks.
We evaluate and compare 11 popular FL frameworks from the perspectives of functionality, privacy protection, and performance.
arXiv Detail & Related papers (2022-07-21T05:03:04Z)
- Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose FedGSP, an FL approach that is robust to data heterogeneity, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
- Multi-Center Federated Learning [62.32725938999433]
Federated learning (FL) can protect data privacy in distributed learning.
It merely collects local gradients from users without access to their data.
We propose a novel multi-center aggregation mechanism.
arXiv Detail & Related papers (2021-08-19T12:20:31Z)
- Multi-VFL: A Vertical Federated Learning System for Multiple Data and Label Owners [10.507522234243021]
We propose a novel method, Multi Vertical Federated Learning (Multi-VFL), to train VFL models when there are multiple data and label owners.
Our results show that using adaptive datasets for model aggregation speeds up convergence and improves accuracy.
arXiv Detail & Related papers (2021-06-10T03:00:57Z)
- FedSemi: An Adaptive Federated Semi-Supervised Learning Framework [23.90642104477983]
Federated learning (FL) has emerged as an effective technique for co-training machine learning models without sharing data or leaking private information.
Most existing FL methods focus on the supervised setting and ignore the utilization of unlabeled data.
We propose FedSemi, a novel, adaptive, and general framework, which firstly introduces the consistency regularization into FL using a teacher-student model.
arXiv Detail & Related papers (2020-12-06T15:46:04Z)
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.