Graph-based Extreme Feature Selection for Multi-class Classification
Tasks
- URL: http://arxiv.org/abs/2303.01792v1
- Date: Fri, 3 Mar 2023 09:06:35 GMT
- Title: Graph-based Extreme Feature Selection for Multi-class Classification
Tasks
- Authors: Shir Friedman, Gonen Singer, Neta Rabin
- Abstract summary: This work focuses on a graph-based, filter feature selection method that is suited for multi-class classification tasks.
We aim to drastically reduce the number of selected features, in order to create a sketch of the original data that codes valuable information for the classification task.
- Score: 7.863638253070439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When processing high-dimensional datasets, a common pre-processing step is
feature selection. Filter-based feature selection algorithms are not tailored
to a specific classification method, but rather rank the relevance of each
feature with respect to the target and the task. This work focuses on a
graph-based, filter feature selection method that is suited for multi-class
classification tasks. We aim to drastically reduce the number of selected
features, in order to create a sketch of the original data that codes valuable
information for the classification task. The proposed graph-based algorithm is
constructed by combining the Jeffries-Matusita distance with a non-linear
dimension reduction method, diffusion maps. Feature elimination is performed
based on the distribution of the features in the low-dimensional space. Then, a
very small number of features with complementary separation strengths is
selected. Moreover, the low-dimensional embedding enables visualization of the
feature space. Experimental results are provided for public datasets and
compared with known filter-based feature selection techniques.
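To make the pipeline described in the abstract concrete, below is a minimal, illustrative Python sketch: per-feature Jeffries-Matusita distances are computed for every pair of classes (assuming Gaussian class-conditional distributions for each feature), the resulting separability profiles are embedded with a basic diffusion-maps construction, and a simple greedy rule keeps a handful of features with complementary separation strengths. The function names, the greedy selection rule, and all parameter choices are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np
from itertools import combinations

def jm_profile(X, y):
    """Per-feature Jeffries-Matusita (JM) distance for every pair of classes.

    Returns an (n_features, n_class_pairs) matrix, assuming Gaussian
    class-conditional distributions for each individual feature.
    """
    classes = np.unique(y)
    pairs = list(combinations(classes, 2))
    prof = np.zeros((X.shape[1], len(pairs)))
    for j in range(X.shape[1]):
        for k, (a, b) in enumerate(pairs):
            xa, xb = X[y == a, j], X[y == b, j]
            m1, m2 = xa.mean(), xb.mean()
            v1, v2 = xa.var() + 1e-12, xb.var() + 1e-12
            # Bhattacharyya distance between two univariate Gaussians
            bhat = (0.25 * (m1 - m2) ** 2 / (v1 + v2)
                    + 0.5 * np.log((v1 + v2) / (2.0 * np.sqrt(v1 * v2))))
            prof[j, k] = 2.0 * (1.0 - np.exp(-bhat))  # JM distance, in [0, 2]
    return prof

def diffusion_map(points, n_components=2, eps=None):
    """Basic diffusion-maps embedding of the rows of `points`."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    if eps is None:
        eps = np.median(d2[d2 > 0])            # common bandwidth heuristic
    W = np.exp(-d2 / eps)                      # Gaussian affinity matrix
    P = W / W.sum(axis=1, keepdims=True)       # row-stochastic Markov matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)[1:n_components + 1]  # skip trivial eigenvalue 1
    return vecs.real[:, order] * vals.real[order]       # eigenvalue-scaled coordinates

def select_features(X, y, n_select=5):
    """Greedy toy rule: strong separability, spread out in the embedding."""
    prof = jm_profile(X, y)
    emb = diffusion_map(prof)
    strength = prof.mean(axis=1)               # overall separation strength per feature
    chosen = [int(np.argmax(strength))]
    while len(chosen) < n_select:
        # Distance from each feature to its nearest already-chosen feature
        dist = np.min(np.linalg.norm(emb[:, None, :] - emb[chosen][None, :, :],
                                     axis=-1), axis=1)
        scores = dist * strength
        scores[chosen] = -np.inf               # never re-pick a selected feature
        chosen.append(int(np.argmax(scores)))
    return chosen
```

For example, `select_features(X, y, n_select=5)` returns the column indices of five selected features, and the 2-D coordinates from `diffusion_map(jm_profile(X, y))` can be scatter-plotted to visualize the feature space, mirroring the visualization mentioned in the abstract.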
Related papers
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model trained over a combination of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z) - Feature Selection Based on Orthogonal Constraints and Polygon Area [10.587608254638667]
The goal of feature selection is to choose the optimal subset of features for a recognition task by evaluating the importance of each feature.
This paper introduces a non-monotone linear search method that enhances the dependencies between features and labels.
Experimental results demonstrate that our approach not only effectively captures discriminative dependencies but also surpasses traditional methods in both dimensionality reduction and classification performance.
arXiv Detail & Related papers (2024-02-25T08:20:05Z) - A Contrast Based Feature Selection Algorithm for High-dimensional Data
set in Machine Learning [9.596923373834093]
We propose a novel filter feature selection method, ContrastFS, which selects discriminative features based on the discrepancies that features show between different classes.
We validate the effectiveness and efficiency of our approach on several widely studied benchmark datasets; the results show that the new method performs favorably at negligible computational cost.
arXiv Detail & Related papers (2024-01-15T05:32:35Z) - A Performance-Driven Benchmark for Feature Selection in Tabular Deep
Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones.
Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance.
We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers.
We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems.
arXiv Detail & Related papers (2023-11-10T05:26:10Z) - Graph-Based Automatic Feature Selection for Multi-Class Classification
via Mean Simplified Silhouette [4.786337974720721]
This paper introduces a novel graph-based filter method for automatic feature selection (abbreviated as GB-AFS).
The method determines the minimum combination of features required to sustain prediction performance.
It does not require any user-defined parameters such as the number of features to select.
arXiv Detail & Related papers (2023-09-05T14:37:31Z) - Parallel feature selection based on the trace ratio criterion [4.30274561163157]
This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST).
Our method uses the trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness.
The experiments show that our method can produce a small set of features in a fraction of the time required by the other methods under comparison.
arXiv Detail & Related papers (2022-03-03T10:50:33Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Multivariate feature ranking of gene expression data [62.997667081978825]
We propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency.
We statistically prove that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance.
arXiv Detail & Related papers (2021-11-03T17:19:53Z) - Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z) - Adaptive Graph-based Generalized Regression Model for Unsupervised
Feature Selection [11.214334712819396]
How to select the uncorrelated and discriminative features is the key problem of unsupervised feature selection.
We present a novel generalized regression model imposed by an uncorrelated constraint and the $\ell_{2,1}$-norm regularization.
It can simultaneously select the uncorrelated and discriminative features as well as reduce the variance of these data points belonging to the same neighborhood.
arXiv Detail & Related papers (2020-12-27T09:07:26Z) - Deep Learning feature selection to unhide demographic recommender
systems factors [63.732639864601914]
The matrix factorization model generates factors which do not incorporate semantic knowledge.
DeepUnHide is able to extract demographic information from the users and items factors in collaborative filtering recommender systems.
arXiv Detail & Related papers (2020-06-17T17:36:48Z)