Semi-Supervised Federated Multi-Label Feature Selection with Fuzzy Information Measures
- URL: http://arxiv.org/abs/2511.17796v1
- Date: Fri, 21 Nov 2025 21:41:15 GMT
- Title: Semi-Supervised Federated Multi-Label Feature Selection with Fuzzy Information Measures
- Authors: Afsaneh Mahanipour, Hana Khamfroush
- Abstract summary: Multi-label feature selection (FS) reduces the dimensionality of multi-label data by removing irrelevant, noisy, and redundant features. Existing methods typically require centralized data, which makes them unsuitable for distributed and federated environments. We propose a Semi-Supervised Federated Multi-Label Feature Selection method, called SSFMLFS, where clients hold only unlabeled data, while the server has limited labeled data.
- Score: 1.5870512175595712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label feature selection (FS) reduces the dimensionality of multi-label data by removing irrelevant, noisy, and redundant features, thereby boosting the performance of multi-label learning models. However, existing methods typically require centralized data, which makes them unsuitable for distributed and federated environments where each device/client holds its own local dataset. Additionally, federated methods often assume that clients have labeled data, which is unrealistic in cases where clients lack the expertise or resources to label task-specific data. To address these challenges, we propose a Semi-Supervised Federated Multi-Label Feature Selection method, called SSFMLFS, where clients hold only unlabeled data, while the server has limited labeled data. SSFMLFS adapts fuzzy information theory to a federated setting, where clients compute fuzzy similarity matrices and transmit them to the server, which then calculates feature redundancy and feature-label relevancy degrees. A feature graph is constructed by modeling features as vertices, assigning relevancy and redundancy degrees as vertex weights and edge weights, respectively. PageRank is then applied to rank the features by importance. Extensive experiments on five real-world datasets from various domains, including biology, images, music, and text, demonstrate that SSFMLFS outperforms other federated and centralized supervised and semi-supervised approaches on three evaluation metrics under a non-IID data distribution.
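The pipeline described in the abstract — client-side fuzzy similarity matrices, server-side relevancy/redundancy degrees, and PageRank over a feature graph — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the specific fuzzy relation, the entropy-based measure, and all function names here are assumptions.

```python
import numpy as np

def fuzzy_similarity(X):
    """Client side: an n x n fuzzy similarity matrix per feature.
    Uses 1 - |x_i - x_j| on min-max scaled values; the paper's
    exact fuzzy relation may differ."""
    Xn = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)
    d = X.shape[1]
    return np.array([1.0 - np.abs(Xn[:, f, None] - Xn[None, :, f]) for f in range(d)])

def fuzzy_entropy(S):
    """Fuzzy entropy of one similarity matrix (a common definition)."""
    n = S.shape[0]
    return float(-np.mean(np.log2(S.sum(axis=1) / n)))

def rank_features(relevancy, redundancy, damping=0.85, iters=100):
    """Server side: personalized PageRank on a feature graph with
    relevancy degrees as vertex weights (the teleport vector) and
    redundancy degrees as edge weights."""
    d = len(relevancy)
    W = redundancy.astype(float).copy()
    np.fill_diagonal(W, 0.0)
    W /= W.sum(axis=0, keepdims=True) + 1e-12      # column-stochastic transitions
    v = relevancy / relevancy.sum()                 # teleport toward relevant features
    r = np.full(d, 1.0 / d)
    for _ in range(iters):
        r = (1 - damping) * v + damping * (W @ r)
    return np.argsort(-r)                           # best features first
```

Only the per-feature similarity matrices leave the client in this scheme, which is what makes the graph construction and ranking possible entirely on the server.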
Related papers
- Neural Collapse-Inspired Multi-Label Federated Learning under Label-Distribution Skew [13.59179877525954]
Federated Learning (FL) enables collaborative model training across distributed clients. Many real-world applications, particularly in domains such as medical imaging, often involve multi-label settings. We propose a method to align feature distributions across clients and to learn high-quality, well-clustered representations.
arXiv Detail & Related papers (2025-09-16T00:53:11Z)
- GPMFS: Global Foundation and Personalized Optimization for Multi-Label Feature Selection [0.820217860574125]
The curse of dimensionality remains one of the major bottlenecks in high-dimensional multi-label learning. We propose a novel method called GPMFS (Global Foundation and Personalized Optimization for Multi-Label Feature Selection). Experiments on multiple real-world datasets demonstrate that GPMFS achieves superior performance while maintaining strong interpretability and robustness.
arXiv Detail & Related papers (2025-04-17T08:29:14Z)
- Boosting Federated Learning with FedEntOpt: Mitigating Label Skew by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance. FedEntOpt is designed to mitigate performance issues caused by label distribution skew. It exhibits robust and superior performance in scenarios with low participation rates and client dropout.
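Entropy-based client selection can be sketched as a greedy procedure: at each step, add the client whose label counts push the entropy of the pooled label distribution highest. This is an illustrative assumption about the approach, not necessarily FedEntOpt's exact selection rule; all names here are hypothetical.

```python
import numpy as np

def label_entropy(counts):
    """Shannon entropy (bits) of a label-count vector."""
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def select_clients(client_label_counts, k):
    """Greedily pick k clients so the pooled label distribution is
    as close to uniform (maximum entropy) as possible, countering
    label-distribution skew across clients."""
    pooled = np.zeros_like(client_label_counts[0], dtype=float)
    remaining = list(range(len(client_label_counts)))
    chosen = []
    for _ in range(k):
        best = max(remaining,
                   key=lambda c: label_entropy(pooled + client_label_counts[c]))
        chosen.append(best)
        pooled = pooled + client_label_counts[best]
        remaining.remove(best)
    return chosen
```

With highly skewed clients (each holding mostly one class), this greedy rule naturally picks complementary clients so the aggregated update sees a balanced label mix.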
arXiv Detail & Related papers (2024-11-02T13:31:36Z)
- FMLFS: A Federated Multi-Label Feature Selection Based on Information Theory in IoT Environment [1.749521391198341]
Internet-of-Things (IoT) devices generate or collect huge amounts of multi-label data.
The presence of noisy, redundant, or irrelevant features in these datasets, along with the curse of dimensionality, poses challenges for multi-label classifiers.
This paper introduces FMLFS, the first federated multi-label feature selection method.
arXiv Detail & Related papers (2024-05-01T13:58:28Z)
- Causal Multi-Label Feature Selection in Federated Setting [11.713722451298171]
We propose a Federated Causal Multi-label Feature Selection (FedCMFS) algorithm with three novel subroutines.
FedCMFS first uses the FedCFL subroutine, which considers label-label, label-feature, and feature-feature correlations to learn the relevant features.
Second, FedCMFS employs the FedCFR subroutine to selectively recover the missed true relevant features.
arXiv Detail & Related papers (2024-03-11T04:11:48Z)
- Federated Deep Multi-View Clustering with Global Self-Supervision [51.639891178519136]
Federated multi-view clustering has the potential to learn a global clustering model from data distributed across multiple devices.
In this setting, label information is unknown and data privacy must be preserved.
We propose a novel federated deep multi-view clustering method that can mine complementary cluster structures from multiple clients.
arXiv Detail & Related papers (2023-09-24T17:07:01Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
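The alternating scheme this entry describes — many cheap local updates on each client's low-dimensional head while the shared representation is frozen, then a few updates on the representation, which the server averages — can be sketched on linear models. This is an illustrative sketch with plain gradient descent under assumed shapes and names, not the paper's exact algorithm.

```python
import numpy as np

def fedrep_round(phi, heads, client_data, head_steps=10, repr_steps=1, lr=0.05):
    """One round on linear models y ~ X @ phi @ h: phi (d x k) is the
    shared representation, h (k,) is each client's local head.
    Clients fit their heads with many steps, take a few steps on phi,
    and the server averages the clients' phi updates."""
    new_phis = []
    for i, (X, y) in enumerate(client_data):
        n = len(X)
        h = heads[i]
        for _ in range(head_steps):                      # local head: many updates
            resid = (X @ phi) @ h - y
            h = h - lr * (X @ phi).T @ resid / n
        heads[i] = h
        p = phi
        for _ in range(repr_steps):                      # shared repr: few updates
            resid = (X @ p) @ h - y
            p = p - lr * X.T @ np.outer(resid, h) / n
        new_phis.append(p)
    return np.mean(new_phis, axis=0), heads
```

The asymmetry in step counts is the point: the head is low-dimensional, so clients can afford many updates per round, while the shared representation only needs occasional refinement.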
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Federated Unsupervised Representation Learning [56.715917111878106]
We formulate a new problem in federated learning called Federated Unsupervised Representation Learning (FURL) to learn a common representation model without supervision.
FedCA is composed of two key modules: a dictionary module that aggregates sample representations from each client and shares them with all clients for a consistent representation space, and an alignment module that aligns each client's representations with a base model trained on public data.
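The two modules described above can be sketched minimally: a server-side buffer of pooled client representations, and a simple alignment objective toward a base model's outputs on public data. Buffer semantics, capacity, and the loss choice are assumptions for illustration, not FedCA's actual design.

```python
import numpy as np

def update_dictionary(dictionary, client_reprs, capacity=512):
    """Dictionary module (sketch): pool the sample representations
    uploaded by clients and keep only the most recent `capacity` rows
    (FIFO), so every client shares one representation space."""
    merged = np.concatenate([dictionary] + list(client_reprs), axis=0)
    return merged[-capacity:]

def alignment_loss(client_repr, base_repr):
    """Alignment module (sketch): pull a client's representations of
    public samples toward those of a base model trained on public data."""
    return float(np.mean((client_repr - base_repr) ** 2))
```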
arXiv Detail & Related papers (2020-10-18T13:28:30Z)
- Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning [78.88007892742438]
We study two essential scenarios of Federated Semi-Supervised Learning (FSSL) based on the location of the labeled data.
We propose a novel method to tackle the problems, which we refer to as Federated Matching (FedMatch).
arXiv Detail & Related papers (2020-06-22T09:43:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.