Graph-Based Automatic Feature Selection for Multi-Class Classification
via Mean Simplified Silhouette
- URL: http://arxiv.org/abs/2309.02272v1
- Date: Tue, 5 Sep 2023 14:37:31 GMT
- Title: Graph-Based Automatic Feature Selection for Multi-Class Classification
via Mean Simplified Silhouette
- Authors: David Levin, Gonen Singer
- Abstract summary: This paper introduces a novel graph-based filter method for automatic feature selection (abbreviated as GB-AFS).
The method determines the minimum combination of features required to sustain prediction performance.
It does not require any user-defined parameters such as the number of features to select.
- Score: 4.786337974720721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel graph-based filter method for automatic feature
selection (abbreviated as GB-AFS) for multi-class classification tasks. The
method determines the minimum combination of features required to sustain
prediction performance while maintaining complementary discriminating abilities
between different classes. It does not require any user-defined parameters such
as the number of features to select. The methodology employs the
Jeffries-Matusita (JM) distance in conjunction with t-distributed Stochastic
Neighbor Embedding (t-SNE) to generate a low-dimensional space reflecting how
effectively each feature can differentiate between each pair of classes. The
minimum number of features is selected using our newly developed Mean
Simplified Silhouette (abbreviated as MSS) index, designed to evaluate the
clustering results for the feature selection task. Experimental results on
public data sets demonstrate the superior performance of the proposed GB-AFS
over other filter-based techniques and automatic feature selection approaches.
Moreover, the proposed algorithm maintained the accuracy achieved when
utilizing all features while using only $7\%$ to $30\%$ of the features,
reducing the time needed for classification by $15\%$ to $70\%$.
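To make the pipeline concrete, the following is a minimal sketch of one way to realize it in Python. The Gaussian closed form used for the Bhattacharyya term inside the JM distance, the choice of k-means as the clustering step, and the classic simplified-silhouette formula standing in for the paper's MSS variant are all assumptions, and every function name is illustrative rather than the authors' reference implementation.

```python
# A minimal sketch of a GB-AFS-style pipeline (assumptions noted above).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.metrics import pairwise_distances


def jm_distance(a, b):
    """Jeffries-Matusita distance between two 1-D class samples:
    JM = 2 * (1 - exp(-B)), with B the Bhattacharyya distance under a
    Gaussian assumption for each class-conditional density."""
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var() + 1e-12, b.var() + 1e-12
    bhat = (0.25 * (mu_a - mu_b) ** 2 / (va + vb)
            + 0.5 * np.log(0.5 * (va + vb) / np.sqrt(va * vb)))
    return 2.0 * (1.0 - np.exp(-bhat))


def mean_simplified_silhouette(points, labels, centers):
    """Classic simplified silhouette: centroid distances replace the
    all-pairs distances of the full silhouette (the paper's MSS is a
    task-specific variant of this idea)."""
    d = pairwise_distances(points, centers)      # (n_points, k)
    idx = np.arange(len(points))
    a = d[idx, labels]                           # own-centroid distance
    d[idx, labels] = np.inf
    b = d.min(axis=1)                            # nearest other centroid
    return float(np.mean((b - a) / np.maximum(a, b)))


def gb_afs_sketch(X, y, seed=0):
    classes = np.unique(y)
    pairs = [(c1, c2) for i, c1 in enumerate(classes) for c2 in classes[i + 1:]]
    # Row f scores how well feature f separates every pair of classes.
    J = np.array([[jm_distance(X[y == c1, f], X[y == c2, f])
                   for c1, c2 in pairs] for f in range(X.shape[1])])
    # 2-D t-SNE map of *features* (not samples), as in the abstract.
    emb = TSNE(n_components=2, perplexity=min(30.0, len(J) - 1.0),
               random_state=seed).fit_transform(J)
    # Sweep k and keep the clustering with the best silhouette score,
    # so the number of selected features is chosen automatically.
    best = max((KMeans(n_clusters=k, n_init=10, random_state=seed).fit(emb)
                for k in range(2, len(J))),
               key=lambda km: mean_simplified_silhouette(
                   emb, km.labels_, km.cluster_centers_))
    # Keep one representative feature per cluster: nearest to each centroid.
    nearest = pairwise_distances(best.cluster_centers_, emb).argmin(axis=1)
    return np.unique(nearest)
```

Given a feature matrix `X` and labels `y`, `gb_afs_sketch(X, y)` returns the indices of one representative feature per cluster; the silhouette sweep, not the user, fixes how many features survive, which matches the parameter-free behavior claimed above.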
Related papers
- FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts.
Existing methods select a subset of unlabeled examples for annotation.
We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
arXiv Detail & Related papers (2024-06-06T04:05:54Z)
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones.
Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance.
We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers.
We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems.
arXiv Detail & Related papers (2023-11-10T05:26:10Z)
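The input-gradient-based analogue of Lasso mentioned in the entry above can be read as penalizing, per input feature, how strongly the loss depends on that feature. The PyTorch sketch below implements that one plausible reading as a group-lasso penalty on input gradients; the function name and the weight `lam` are illustrative assumptions, not the benchmark paper's reference code.

```python
import torch


def input_gradient_lasso_loss(model, X, y, lam=1e-2):
    """Task loss plus a group-lasso penalty on input gradients:
    an L2 norm over the batch per feature, summed (L1) across features."""
    X = X.clone().requires_grad_(True)
    task_loss = torch.nn.functional.cross_entropy(model(X), y)
    # Gradient of the loss w.r.t. the inputs, kept in the graph so the
    # penalty itself is differentiable and trainable.
    (grads,) = torch.autograd.grad(task_loss, X, create_graph=True)
    penalty = grads.pow(2).sum(dim=0).sqrt().sum()
    return task_loss + lam * penalty
```

Training against this combined loss pushes the gradient group of an uninformative feature toward zero, mirroring how Lasso zeroes linear coefficients; features can then be ranked, and pruned, by their per-feature gradient norms.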
- Graph-based Extreme Feature Selection for Multi-class Classification Tasks [7.863638253070439]
This work focuses on a graph-based filter feature selection method suited for multi-class classification tasks.
We aim to drastically reduce the number of selected features in order to create a sketch of the original data that encodes valuable information for the classification task.
arXiv Detail & Related papers (2023-03-03T09:06:35Z)
- Model-free feature selection to facilitate automatic discovery of divergent subgroups in tabular data [4.551615447454768]
We propose a model-free and sparsity-based automatic feature selection (SAFS) framework to facilitate automatic discovery of divergent subgroups.
We validated SAFS across two publicly available datasets (MIMIC-III and Allstate Claims) and compared it with six existing feature selection methods.
arXiv Detail & Related papers (2022-03-08T20:42:56Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
The proposed algorithm is shown to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- On the utility of power spectral techniques with feature selection techniques for effective mental task classification in noninvasive BCI [19.19039983741124]
This paper proposes an approach to select relevant and non-redundant spectral features for mental task classification.
The findings demonstrate substantial improvements in the performance of the learning model for mental task classification.
arXiv Detail & Related papers (2021-11-16T00:27:53Z)
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
An optimal subset of features is selected in groups, and the number of selected features is determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
- Self-Supervised Tuning for Few-Shot Segmentation [82.32143982269892]
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.
Existing meta-learning methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from support images are marginalized in the embedding space.
This paper presents an adaptive tuning framework in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme.
arXiv Detail & Related papers (2020-04-12T03:53:53Z)
- Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.