An Online Sparse Streaming Feature Selection Algorithm
- URL: http://arxiv.org/abs/2208.01562v2
- Date: Wed, 3 Aug 2022 14:34:11 GMT
- Title: An Online Sparse Streaming Feature Selection Algorithm
- Authors: Feilong Chen, Di Wu, Jie Yang, Yi He
- Abstract summary: We propose an online sparse streaming feature selection algorithm with uncertainty (OS2FSU).
OS2FSU consists of two main parts: 1) latent factor analysis is utilized to pre-estimate the missing data in sparse streaming features before conducting feature selection, and 2) fuzzy logic and neighborhood rough set are employed to alleviate the uncertainty between estimated streaming features and labels during feature selection.
Results demonstrate that OS2FSU outperforms its competitors when missing data are encountered in OSFS.
- Score: 14.414813893419506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online streaming feature selection (OSFS), which conducts feature selection
in an online manner, plays an important role in dealing with high-dimensional
data. In many real applications, such as intelligent healthcare platforms,
streaming features always have some missing data, which raises a crucial
challenge in conducting OSFS, i.e., how to establish the uncertain relationship
between sparse streaming features and labels. Unfortunately, existing OSFS
algorithms never consider such an uncertain relationship. To fill this gap,
this paper proposes an online sparse streaming feature selection with
uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent
factor analysis is utilized to pre-estimate the missing data in sparse
streaming features before conducting feature selection, and 2) fuzzy logic and
neighborhood rough set are employed to alleviate the uncertainty between
estimated streaming features and labels during feature selection. In
the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms
on six real datasets. The results demonstrate that OS2FSU outperforms its
competitors when missing data are encountered in OSFS.
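The two parts described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `lfa_impute` and `neighborhood_dependency`, the hyperparameters (`rank`, `lr`, `reg`, `epochs`, `radius`), and the use of plain SGD are all assumptions made for illustration, and the fuzzy-logic component of OS2FSU is omitted. The first function fills missing entries with a low-rank latent factor model fitted only to observed entries; the second computes a standard neighborhood rough set dependency, i.e. the fraction of samples whose neighbors all share their label.

```python
import numpy as np


def lfa_impute(X, rank=2, lr=0.01, reg=0.05, epochs=500, seed=0):
    """Fill NaN entries of X with a latent factor model (illustrative sketch).

    X is approximated as P @ Q.T, with P and Q trained by SGD on the
    observed entries only. Observed entries are returned unchanged.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = 0.1 * rng.standard_normal((n, rank))  # row (sample) factors
    Q = 0.1 * rng.standard_normal((m, rank))  # column (feature) factors
    rows, cols = np.where(~np.isnan(X))       # indices of observed entries
    for _ in range(epochs):
        for k in rng.permutation(len(rows)):
            i, j = rows[k], cols[k]
            err = X[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()               # use pre-update value for Q's step
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    X_hat = P @ Q.T
    return np.where(np.isnan(X), X_hat, X)


def neighborhood_dependency(X, y, radius=0.2):
    """Neighborhood rough set dependency of labels y on the features in X.

    Features are min-max scaled; a sample is in the positive region when
    every sample within `radius` (Euclidean distance) has the same label.
    Returns the fraction of samples in the positive region.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                     # avoid division by zero
    Z = (X - X.min(axis=0)) / span
    dist = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    neighbors = dist <= radius
    same_label = y[:, None] == y[None, :]
    consistent = np.all(~neighbors | same_label, axis=1)
    return float(consistent.mean())


if __name__ == "__main__":
    # Hypothetical toy data: a rank-1 matrix with two missing entries.
    X = np.outer(np.linspace(0.5, 1.5, 6), np.linspace(1.0, 2.0, 4))
    X[1, 2] = np.nan
    X[4, 0] = np.nan
    y = np.array([0, 0, 0, 1, 1, 1])
    X_filled = lfa_impute(X, rank=1)
    print("dependency of first feature:", neighborhood_dependency(X_filled[:, :1], y))
```

In a streaming setting, one would impute each arriving feature first and then keep it only if it raises the dependency of the selected subset; that selection rule is likewise an assumption here, not a detail taken from the abstract.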
Related papers
- Online Sparse Feature Selection in Data Streams via Differential Evolution [2.03725086642376]
This paper introduces a novel Online Differential Evolution for Sparse Feature Selection (ODESFS) in data streams. Experiments conducted on six real-world datasets demonstrate that ODESFS consistently outperforms state-of-the-art OSFS and OS2FS methods.
arXiv Detail & Related papers (2025-11-24T14:19:51Z) - Particle swarm optimization for online sparse streaming feature selection under uncertainty [2.03725086642376]
In real-world applications involving high-dimensional streaming data, online streaming feature selection (OSFS) is widely adopted. This work proposes POS2FS, an uncertainty-aware online sparse streaming feature selection framework enhanced by particle swarm optimization (PSO). The approach introduces: 1) PSO-driven supervision to reduce uncertainty in feature-label relationships; 2) three-way decision theory to manage feature fuzziness in supervised learning.
arXiv Detail & Related papers (2025-08-24T07:56:41Z) - Online Decision-Focused Learning [74.3205104323777]
Decision-focused learning (DFL) is an increasingly popular paradigm for training models whose predictive outputs are used in decision-making tasks. In this paper, we regularize the objective function to make it different and investigate how to overcome nonoptimality function. We also showcase the effectiveness of our algorithms on a knapsack experiment, where they outperform two standard benchmarks.
arXiv Detail & Related papers (2025-05-19T10:40:30Z) - ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams [57.080448177724264]
Video-text data presents challenges in storage and computation during training.
We propose a Relevance and Specificity-based online filtering framework (ReSpec).
By establishing reference points from target task data, ReSpec filters incoming data in real-time, eliminating the need for extensive storage and compute.
arXiv Detail & Related papers (2025-04-21T06:02:03Z) - GFSNetwork: Differentiable Feature Selection via Gumbel-Sigmoid Relaxation [0.0]
We present GFSNetwork, a novel neural architecture that performs differentiable feature selection through Gumbel-Sigmoid sampling.
We evaluate GFSNetwork on a series of classification and regression benchmarks, where it consistently outperforms recent methods.
We validate our approach on real-world metagenomic datasets, demonstrating its effectiveness in high-dimensional biological data.
arXiv Detail & Related papers (2025-03-17T15:47:26Z) - Fairness-Aware Streaming Feature Selection with Causal Graphs [10.644488289941021]
Streaming Feature Selection with Causal Fairness builds causal graphs egocentric to prediction label and protected feature.
We benchmark SFCF on five datasets widely used in streaming feature research.
arXiv Detail & Related papers (2024-08-17T00:41:02Z) - Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
Our findings show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Fair Streaming Feature Selection [9.327911386140109]
We propose FairSFS, a novel algorithm for fair streaming feature selection.
We show that FairSFS not only maintains accuracy that is on par with leading streaming feature selection methods but also significantly improves fairness metrics.
arXiv Detail & Related papers (2024-06-20T15:22:44Z) - FSDR: A Novel Deep Learning-based Feature Selection Algorithm for Pseudo Time-Series Data using Discrete Relaxation [9.769546018094665]
We introduce a Deep Learning-based feature selection algorithm: Feature Selection through Discrete Relaxation (FSDR).
FSDR is capable of accommodating a high number of feature dimensions, a capability beyond the reach of existing DL-based or traditional methods.
arXiv Detail & Related papers (2024-03-13T10:37:52Z) - Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing area, where the model is trained over clients under the orchestration of a parameter server (PS).
In this paper, we propose a novel primal-dual algorithm with model sparsification for non-convex and non-smooth FL problems.
Its unique insightful properties and its analyses are also presented.
arXiv Detail & Related papers (2023-10-30T14:15:47Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Online Sparse Streaming Feature Selection Using Adapted Classification [5.587715545506331]
Existing methods classify features as relevant or irrelevant under the assumption that no data are missing.
We propose online sparse streaming feature selection based on adapted classification (OS2FS-AC).
Experimental results on ten real-world data sets demonstrate that OS2FS-AC performs better than state-of-the-art algorithms.
arXiv Detail & Related papers (2023-02-25T03:03:53Z) - NP-Match: When Neural Processes meet Semi-Supervised Learning [133.009621275051]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.
In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.
arXiv Detail & Related papers (2022-07-03T15:24:31Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Online Feature Selection for Efficient Learning in Networked Systems [3.13468877208035]
Current AI/ML methods for data-driven engineering use models that are mostly trained offline.
We present an online algorithm called Online Stable Feature Set Algorithm (OSFS), which selects a small feature set from a large number of available data sources.
OSFS achieves a massive reduction in the size of the feature set by 1-3 orders of magnitude on all investigated datasets.
arXiv Detail & Related papers (2021-12-15T16:31:59Z) - Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data [93.76907759950608]
We propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
arXiv Detail & Related papers (2020-08-14T05:46:56Z)