FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated
Learning
- URL: http://arxiv.org/abs/2302.10417v1
- Date: Tue, 21 Feb 2023 03:09:45 GMT
- Title: FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated
Learning
- Authors: Anran Li, Hongyi Peng, Lan Zhang, Jiahui Huang, Qing Guo, Han Yu, Yang
Liu
- Abstract summary: Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data sample(s), to jointly train a useful global model.
Feature selection (FS) is important to VFL. It is still an open research problem, as existing FS works designed for VFL either assume prior knowledge on the number of noisy features or prior knowledge on the post-training threshold of useful features.
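We propose the Federated Stochastic Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian stochastic dual-gate to efficiently approximate the probability of a feature being selected, with privacy protection through Partially Homomorphic Encryption without a trusted third-party.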
- Score: 21.79965380400454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vertical Federated Learning (VFL) enables multiple data owners, each holding
a different subset of features about largely overlapping sets of data
sample(s), to jointly train a useful global model. Feature selection (FS) is
important to VFL. It is still an open research problem as existing FS works
designed for VFL either assume prior knowledge on the number of noisy features
or prior knowledge on the post-training threshold of useful features to be
selected, making them unsuitable for practical applications. To bridge this
gap, we propose the Federated Stochastic Dual-Gate based Feature Selection
(FedSDG-FS) approach. It consists of a Gaussian stochastic dual-gate to
efficiently approximate the probability of a feature being selected, with
privacy protection through Partially Homomorphic Encryption without a trusted
third-party. To reduce overhead, we propose a feature importance initialization
method based on Gini impurity, which can accomplish its goals with only two
parameter transmissions between the server and the clients. Extensive
experiments on both synthetic and real-world datasets show that FedSDG-FS
significantly outperforms existing approaches in terms of achieving accurate
selection of high-quality features as well as building global models with
improved performance.
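The two ingredients named in the abstract, a Gini-impurity-based importance score and a Gaussian stochastic gate, can be illustrated with a minimal plaintext sketch. This is not the authors' protocol: it runs on a single machine with no encryption and no server/client message exchange. The single-threshold split, the function names, and the clipped-Gaussian gate form are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_importance(feature, labels, threshold):
    """Impurity decrease from splitting samples on `feature <= threshold`.

    Hypothetical helper: a larger value suggests a more informative feature.
    """
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


def gate_open_probability(mu, sigma=0.5):
    """P(mu + sigma * eps > 0) for eps ~ N(0, 1): the Gaussian CDF at mu / sigma.

    Assumed gate form; `mu` is a learnable per-feature parameter.
    """
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))


def sample_gate(mu, sigma=0.5, rng=random):
    """Draw one noisy gate value and clip it to [0, 1]."""
    return min(1.0, max(0.0, mu + sigma * rng.gauss(0.0, 1.0)))


# Toy data: one feature separates the classes, the other does not.
labels = [0, 0, 1, 1]
informative = [0.1, 0.2, 0.8, 0.9]
noisy = [0.1, 0.9, 0.2, 0.8]

score_informative = gini_importance(informative, labels, threshold=0.5)
score_noisy = gini_importance(noisy, labels, threshold=0.5)
```

On this toy data the informative feature gets a strictly higher score than the noisy one. Such scores could then seed the gate parameter `mu`, so that promising features start with a higher probability of being kept. How the paper maps scores to gate parameters is not stated in the abstract, so that step is left out here.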
Related papers
- Sequential Federated Learning in Hierarchical Architecture on Non-IID Datasets [25.010661914466354]
In a real federated learning (FL) system, communication overhead for passing model parameters between the clients and the parameter server (PS) is often a bottleneck.
We propose combining sequential FL (SFL) with hierarchical FL (HFL) for the first time, which removes the central PS and enables the model to be completed only through passing data between two adjacent ESs for each server.
arXiv Detail & Related papers (2024-08-19T07:43:35Z)
- Tackling Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation [12.19025665853089]
In traditional Federated Learning approaches, the global model underperforms when faced with data heterogeneity.
We propose a new PFL framework called FedPFT to address the mismatch problem while enhancing the quality of the feature extractor.
Our experiments demonstrate that FedPFT outperforms state-of-the-art methods by up to 7.08%.
arXiv Detail & Related papers (2024-07-23T02:52:52Z)
- Decoupled Federated Learning on Long-Tailed and Non-IID data with
Feature Statistics [20.781607752797445]
We propose a two-stage Decoupled Federated learning framework using Feature Statistics (DFL-FS).
In the first stage, the server estimates the client's class coverage distributions through masked local feature statistics clustering.
In the second stage, DFL-FS employs federated feature regeneration based on global feature statistics to enhance the model's adaptability to long-tailed data distributions.
arXiv Detail & Related papers (2024-03-13T09:24:59Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity.
It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing area, where the model is trained over clients under the orchestration of a parameter server (PS).
In this paper, we propose a novel primal-dual algorithm with model sparsification for non-convex and non-smooth FL problems.
Its unique insightful properties and its analyses are also presented.
arXiv Detail & Related papers (2023-10-30T14:15:47Z)
- Unlocking the Potential of Prompt-Tuning in Bridging Generalized and
Personalized Federated Learning [49.72857433721424]
Vision Transformers (ViT) and Visual Prompt Tuning (VPT) achieve state-of-the-art performance with improved efficiency in various computer vision tasks.
We present a novel algorithm, SGPT, that integrates Generalized FL (GFL) and Personalized FL (PFL) approaches by employing a unique combination of both shared and group-specific prompts.
arXiv Detail & Related papers (2023-10-27T17:22:09Z)
- FedFM: Anchor-based Feature Matching for Data Heterogeneity in Federated
Learning [91.74206675452888]
We propose a novel method FedFM, which guides each client's features to match shared category-wise anchors.
To achieve higher efficiency and flexibility, we propose a FedFM variant, called FedFM-Lite, where clients communicate with the server with fewer synchronizations and lower communication bandwidth costs.
arXiv Detail & Related papers (2022-10-14T08:11:34Z)
- BlindFL: Vertical Federated Machine Learning without Peeking into Your
Data [20.048695060411774]
Vertical federated learning (VFL) describes a case where ML models are built upon the private data of different participating parties.
We introduce BlindFL, a novel framework for VFL training and inference.
We show that BlindFL supports diverse datasets and models efficiently whilst achieving robust privacy guarantees.
arXiv Detail & Related papers (2022-06-16T07:26:50Z)
- Disentangled Federated Learning for Tackling Attributes Skew via
Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew hinders current federated learning (FL) frameworks from maintaining consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z)
- Federated Doubly Stochastic Kernel Learning for Vertically Partitioned
Data [93.76907759950608]
We propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
arXiv Detail & Related papers (2020-08-14T05:46:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.