Causal Multi-Label Feature Selection in Federated Setting
- URL: http://arxiv.org/abs/2403.06419v2
- Date: Mon, 26 Aug 2024 21:09:57 GMT
- Title: Causal Multi-Label Feature Selection in Federated Setting
- Authors: Yukun Song, Dayuan Cao, Jiali Miao, Shuai Yang, Kui Yu
- Abstract summary: We propose a Federated Causal Multi-label Feature Selection (FedCMFS) algorithm with three novel subroutines.
FedCMFS first uses the FedCFL subroutine that considers the correlations among label-label, label-feature, and feature-feature to learn the relevant features.
Second, FedCMFS employs the FedCFR subroutine to selectively recover the missed true relevant features.
- Score: 11.713722451298171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label feature selection serves as an effective means for dealing with high-dimensional multi-label data. To achieve satisfactory performance, existing methods for multi-label feature selection often require the centralization of substantial data from multiple sources. However, in a federated setting, centralizing data from all sources and merging them into a single dataset is not feasible. To tackle this issue, in this paper, we study the challenging problem of causal multi-label feature selection in a federated setting and propose a Federated Causal Multi-label Feature Selection (FedCMFS) algorithm with three novel subroutines. Specifically, FedCMFS first uses the FedCFL subroutine, which considers the label-label, label-feature, and feature-feature correlations, to learn the relevant features (candidate parents and children) of each class label while preserving data privacy without centralizing data. Second, FedCMFS employs the FedCFR subroutine to selectively recover the missed true relevant features. Finally, FedCMFS utilizes the FedCFC subroutine to remove false relevant features. Extensive experiments on 8 datasets show that FedCMFS is effective for causal multi-label feature selection in a federated setting.
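The abstract describes learning each label's relevant features without centralizing data, but it does not say how per-client results are combined. The following is only a minimal sketch of that general idea, not the FedCMFS, FedCFL, FedCFR, or FedCFC procedures. It assumes a plain Pearson-correlation threshold as a stand-in for a relevance test and simple majority voting on the server; the function names `local_relevant_features` and `aggregate_by_vote` and the parameters `threshold` and `min_fraction` are hypothetical.

```python
import numpy as np


def local_relevant_features(X, Y, threshold=0.3):
    """Hypothetical client-side step (an assumption, not the paper's method).

    For each label column of Y, return the set of feature indices whose
    absolute Pearson correlation with that label exceeds `threshold`.
    Only these index sets would leave the client, never the raw data.
    """
    n_features, n_labels = X.shape[1], Y.shape[1]
    selected = []
    for j in range(n_labels):
        relevant = set()
        for i in range(n_features):
            r = np.corrcoef(X[:, i], Y[:, j])[0, 1]
            # A constant column yields NaN; treat it as not relevant.
            if np.isfinite(r) and abs(r) > threshold:
                relevant.add(i)
        selected.append(relevant)
    return selected


def aggregate_by_vote(client_sets, n_labels, min_fraction=0.5):
    """Hypothetical server-side step (an assumption, not the paper's method).

    Keep a feature for a label when more than `min_fraction` of the
    clients selected it for that label.
    """
    n_clients = len(client_sets)
    result = []
    for j in range(n_labels):
        counts = {}
        for sets in client_sets:
            for f in sets[j]:
                counts[f] = counts.get(f, 0) + 1
        kept = sorted(f for f, c in counts.items() if c / n_clients > min_fraction)
        result.append(kept)
    return result
```

In this sketch each client would call `local_relevant_features` on its own data and send only the resulting index sets, and the server would call `aggregate_by_vote` on the collected sets. Recovering missed features or removing false ones, as the abstract's second and third subroutines do, is outside what this sketch covers.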
Related papers
- FMLFS: A Federated Multi-Label Feature Selection Based on Information Theory in IoT Environment [1.749521391198341]
Internet-of-Things (IoT) devices generate or collect a huge amount of multi-label datasets.
The presence of noisy, redundant, or irrelevant features in these datasets, along with the curse of dimensionality, poses challenges for multi-label classifiers.
This paper introduces FMLFS, the first federated multi-label feature selection method.
arXiv Detail & Related papers (2024-05-01T13:58:28Z)
- Embedded Multi-label Feature Selection via Orthogonal Regression [45.55795914923279]
State-of-the-art embedded multi-label feature selection algorithms based on least square regression cannot preserve sufficient discriminative information in multi-label data.
A novel embedded multi-label feature selection method is proposed to facilitate the multi-label feature selection.
Extensive experimental results on ten multi-label data sets demonstrate the effectiveness of GRROOR.
arXiv Detail & Related papers (2024-03-01T06:18:40Z)
- MvFS: Multi-view Feature Selection for Recommender System [7.0190343591422115]
We propose Multi-view Feature Selection (MvFS), which selects informative features for each instance more effectively.
MvFS employs a multi-view network consisting of multiple sub-networks, each of which learns to measure the feature importance of a part of data.
MvFS adopts an effective importance score modeling strategy which is applied independently to each field.
arXiv Detail & Related papers (2023-09-05T09:06:34Z)
- FedWon: Triumphing Multi-domain Federated Learning Without Normalization [50.49210227068574]
Federated learning (FL) enhances data privacy with collaborative in-situ training on decentralized clients.
However, Federated learning (FL) encounters challenges due to non-independent and identically distributed (non-i.i.d) data.
We propose a novel method called Federated learning Without normalizations (FedWon) to address the multi-domain problem in FL.
arXiv Detail & Related papers (2023-06-09T13:18:50Z)
- FLAG: Fast Label-Adaptive Aggregation for Multi-label Classification in Federated Learning [1.4280238304844592]
This study proposes a new multi-label federated learning framework with Clustering-based Multi-label Data Allocation (CMDA) and a novel aggregation method, Fast Label-Adaptive Aggregation (FLAG).
The experimental results demonstrate that our methods only need less than 50% of training epochs and communication rounds to surpass the performance of state-of-the-art federated learning methods.
arXiv Detail & Related papers (2023-02-27T08:16:39Z)
- FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning [21.79965380400454]
Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data sample(s), to jointly train a useful global model.
Feature selection (FS) is important to VFL. It is still an open research problem as existing FS works designed for VFL either assume prior knowledge on the number of noisy features or prior knowledge on the post-training threshold of useful features.
We propose the Federated Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian dual-gate to efficiently approximate the probability of a feature being selected, with privacy
arXiv Detail & Related papers (2023-02-21T03:09:45Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm using differential privacy convergence technique, referred to as DP-Fed, in which partial participation and multiple clients are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- FedFM: Anchor-based Feature Matching for Data Heterogeneity in Federated Learning [91.74206675452888]
We propose a novel method FedFM, which guides each client's features to match shared category-wise anchors.
To achieve higher efficiency and flexibility, we propose a FedFM variant, called FedFM-Lite, where clients communicate with the server with fewer synchronization times and lower communication bandwidth costs.
arXiv Detail & Related papers (2022-10-14T08:11:34Z)
- A Supervised Feature Selection Method For Mixed-Type Data using Density-based Feature Clustering [1.3048920509133808]
This paper proposes a supervised feature selection method using density-based feature clustering (SFSDFC).
SFSDFC decomposes the feature space into a set of disjoint feature clusters using a novel density-based clustering method.
Then, an effective feature selection strategy is employed to obtain a subset of important features with minimal redundancy from those feature clusters.
arXiv Detail & Related papers (2021-11-10T15:05:15Z)
- Federated Mixture of Experts [94.25278695272874]
FedMix is a framework that allows us to train an ensemble of specialized models.
We show that users with similar data characteristics select the same members and therefore share statistical strength.
arXiv Detail & Related papers (2021-07-14T14:15:24Z)
- Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data [93.76907759950608]
We propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
arXiv Detail & Related papers (2020-08-14T05:46:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.