FMLFS: A federated multi-label feature selection based on information theory in IoT environment
- URL: http://arxiv.org/abs/2405.00524v1
- Date: Wed, 1 May 2024 13:58:28 GMT
- Title: FMLFS: A federated multi-label feature selection based on information theory in IoT environment
- Authors: Afsaneh Mahanipour, Hana Khamfroush
- Abstract summary: Internet-of-Things (IoT) devices generate or collect a huge amount of multi-label datasets.
The presence of noisy, redundant, or irrelevant features in these datasets, along with the curse of dimensionality, poses challenges for multi-label classifiers.
This paper introduces FMLFS, the first federated multi-label feature selection method.
- Score: 1.749521391198341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In certain emerging applications such as health monitoring wearable and traffic monitoring systems, Internet-of-Things (IoT) devices generate or collect a huge amount of multi-label datasets. Within these datasets, each instance is linked to a set of labels. The presence of noisy, redundant, or irrelevant features in these datasets, along with the curse of dimensionality, poses challenges for multi-label classifiers. Feature selection (FS) proves to be an effective strategy in enhancing classifier performance and addressing these challenges. Yet, there is currently no existing distributed multi-label FS method documented in the literature that is suitable for distributed multi-label datasets within IoT environments. This paper introduces FMLFS, the first federated multi-label feature selection method. Here, mutual information between features and labels serves as the relevancy metric, while the correlation distance between features, derived from mutual information and joint entropy, is utilized as the redundancy measure. Following aggregation of these metrics on the edge server and employing Pareto-based bi-objective and crowding distance strategies, the sorted features are subsequently sent back to the IoT devices. The proposed method is evaluated through two scenarios: 1) transmitting reduced-size datasets to the edge server for centralized classifier usage, and 2) employing federated learning with reduced-size datasets. Evaluation across three metrics - performance, time complexity, and communication cost - demonstrates that FMLFS outperforms five other comparable methods in the literature and provides a good trade-off on three real-world datasets.
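The pipeline described in the abstract (per-device relevance and redundancy scores, aggregation on the edge server, then Pareto-based ranking with crowding distance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the normalisation 1 - I(X;Y)/H(X,Y) for the correlation distance, the sample-size-weighted averaging on the server, the assumption of discrete features, and all function names (`local_scores`, `aggregate`, `rank_features`, and so on) are hypothetical choices made here for illustration.

```python
import numpy as np

def entropy(*cols):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(x, y):
    return entropy(x) + entropy(y) - entropy(x, y)

def correlation_distance(x, y):
    """Assumed form 1 - I(x;y)/H(x,y): ~0 for identical columns, 1 for independent ones."""
    h_xy = entropy(x, y)
    if h_xy == 0:  # both columns constant: treat as fully redundant (assumption)
        return 0.0
    return 1.0 - mutual_information(x, y) / h_xy

def local_scores(X, Y):
    """Per-feature [relevance, mean distance to other features] on one device."""
    n = X.shape[1]
    rel = [sum(mutual_information(X[:, j], Y[:, l]) for l in range(Y.shape[1]))
           for j in range(n)]
    dist = [np.mean([correlation_distance(X[:, j], X[:, k])
                     for k in range(n) if k != j]) for j in range(n)]
    return np.column_stack([rel, dist])

def aggregate(client_scores, client_sizes):
    """Server-side aggregation; sample-weighted averaging is an assumption."""
    return np.average(np.stack(client_scores), axis=0, weights=client_sizes)

def pareto_fronts(scores):
    """Non-dominated sorting where both objectives are maximised."""
    remaining, fronts = list(range(len(scores))), []
    while remaining:
        front = [i for i in remaining
                 if not any(np.all(scores[j] >= scores[i]) and
                            np.any(scores[j] > scores[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def crowding_distance(scores, front):
    """Crowding distance of each member of one front (extremes get infinity)."""
    pts, dist = scores[front], np.zeros(len(front))
    for m in range(pts.shape[1]):
        order = np.argsort(pts[:, m])
        span = pts[order[-1], m] - pts[order[0], m]
        dist[order[0]] = dist[order[-1]] = np.inf
        if span > 0:
            dist[order[1:-1]] += (pts[order[2:], m] - pts[order[:-2], m]) / span
    return dist

def rank_features(scores):
    """Order features front by front, less crowded features first within a front."""
    ranked = []
    for front in pareto_fronts(scores):
        d = crowding_distance(scores, front)
        ranked += [front[i] for i in np.argsort(-d, kind="stable")]
    return ranked
```

In this sketch each device would call `local_scores` on its own data and send only the resulting score matrix, the server would combine the matrices with `aggregate` and return `rank_features(...)`, and each device would then keep the top-ranked features. Continuous sensor readings would need to be discretised before the entropy estimates are meaningful.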
Related papers
- Causal Multi-Label Feature Selection in Federated Setting [12.567019739483886]
We propose a Federated Causal Multi-label Feature Selection (FedCMFS) algorithm with three novel subroutines.
FedCMFS first uses the FedCFL subroutine that considers the correlations among label-label, label-feature, and feature-feature to learn the relevant features.
Second, FedCMFS employs the FedCFR subroutine to selectively recover the missed true relevant features.
arXiv Detail & Related papers (2024-03-11T04:11:48Z) - Empowering HWNs with Efficient Data Labeling: A Clustered Federated Semi-Supervised Learning Approach [2.046985601687158]
Clustered Federated Multitask Learning (CFL) has gained considerable attention as an effective strategy for overcoming statistical challenges.
We introduce a novel framework, Clustered Federated Semi-Supervised Learning (CFSL), designed for more realistic HWN scenarios.
Our results demonstrate that CFSL significantly improves upon key metrics such as testing accuracy, labeling accuracy, and labeling latency under varying proportions of labeled and unlabeled data.
arXiv Detail & Related papers (2024-01-19T11:47:49Z) - Multi-label affordance mapping from egocentric vision [3.683202928838613]
We present a new approach to affordance perception which enables accurate multi-label segmentation.
Our approach can be used to automatically extract grounded affordances from first person videos.
We show how our metric representation can be exploited to build a map of interaction hotspots.
arXiv Detail & Related papers (2023-09-05T10:56:23Z) - A Unified One-Step Solution for Aspect Sentiment Quad Prediction [3.428123050377681]
Aspect sentiment quad prediction (ASQP) is a challenging yet significant subtask in aspect-based sentiment analysis.
We release two new datasets for ASQP, which contain the following characteristics: larger size, more words per sample, and higher density.
We propose a unified one-step solution for ASQP, namely One-ASQP, to detect the aspect categories and to identify the aspect-opinion-sentiment triplets simultaneously.
arXiv Detail & Related papers (2023-06-07T05:00:01Z) - infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z) - Multi-view Multi-label Anomaly Network Traffic Classification based on MLP-Mixer Neural Network [55.21501819988941]
Existing network traffic classification based on convolutional neural networks (CNNs) often emphasizes local patterns of traffic data while ignoring global information associations.
We propose an end-to-end network traffic classification method.
arXiv Detail & Related papers (2022-10-30T01:52:05Z) - Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z) - Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z) - Dual-Refinement: Joint Label and Feature Refinement for Unsupervised Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the lack of labels for the target domain data.
We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase.
Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z) - Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to un-ordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.