Graph-Based Automatic Feature Selection for Multi-Class Classification
via Mean Simplified Silhouette
- URL: http://arxiv.org/abs/2309.02272v1
- Date: Tue, 5 Sep 2023 14:37:31 GMT
- Title: Graph-Based Automatic Feature Selection for Multi-Class Classification
via Mean Simplified Silhouette
- Authors: David Levin, Gonen Singer
- Abstract summary: This paper introduces a novel graph-based filter method for automatic feature selection (abbreviated as GB-AFS).
The method determines the minimum combination of features required to sustain prediction performance.
It does not require any user-defined parameters such as the number of features to select.
- Score: 4.786337974720721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel graph-based filter method for automatic feature
selection (abbreviated as GB-AFS) for multi-class classification tasks. The
method determines the minimum combination of features required to sustain
prediction performance while maintaining complementary discriminating abilities
between different classes. It does not require any user-defined parameters such
as the number of features to select. The methodology employs the
Jeffries-Matusita (JM) distance in conjunction with t-distributed Stochastic
Neighbor Embedding (t-SNE) to generate a low-dimensional space reflecting how
effectively each feature can differentiate between each pair of classes. The
minimum number of features is selected using our newly developed Mean
Simplified Silhouette (abbreviated as MSS) index, designed to evaluate the
clustering results for the feature selection task. Experimental results on
public data sets demonstrate the superior performance of the proposed GB-AFS
over other filter-based techniques and automatic feature selection approaches.
Moreover, the proposed algorithm maintained the accuracy achieved when
utilizing all features while using only $7\%$ to $30\%$ of the features,
reducing the time needed for classification by $15\%$ to $70\%$.
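To make the pipeline concrete, the following is a minimal sketch of one way to realize it in Python. The Gaussian closed form used for the Bhattacharyya term inside the JM distance, the choice of k-means as the clustering step, and the classic simplified-silhouette formula standing in for the paper's MSS variant are all assumptions, and every function name is illustrative rather than the authors' reference implementation.

```python
# A minimal sketch of a GB-AFS-style pipeline (assumptions noted above).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.metrics import pairwise_distances


def jm_distance(a, b):
    """Jeffries-Matusita distance between two 1-D class samples:
    JM = 2 * (1 - exp(-B)), with B the Bhattacharyya distance under a
    Gaussian assumption for each class-conditional density."""
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var() + 1e-12, b.var() + 1e-12
    bhat = (0.25 * (mu_a - mu_b) ** 2 / (va + vb)
            + 0.5 * np.log(0.5 * (va + vb) / np.sqrt(va * vb)))
    return 2.0 * (1.0 - np.exp(-bhat))


def mean_simplified_silhouette(points, labels, centers):
    """Classic simplified silhouette: centroid distances replace the
    all-pairs distances of the full silhouette (the paper's MSS is a
    task-specific variant of this idea)."""
    d = pairwise_distances(points, centers)      # (n_points, k)
    idx = np.arange(len(points))
    a = d[idx, labels]                           # own-centroid distance
    d[idx, labels] = np.inf
    b = d.min(axis=1)                            # nearest other centroid
    return float(np.mean((b - a) / np.maximum(a, b)))


def gb_afs_sketch(X, y, seed=0):
    classes = np.unique(y)
    pairs = [(c1, c2) for i, c1 in enumerate(classes) for c2 in classes[i + 1:]]
    # Row f scores how well feature f separates every pair of classes.
    J = np.array([[jm_distance(X[y == c1, f], X[y == c2, f])
                   for c1, c2 in pairs] for f in range(X.shape[1])])
    # 2-D t-SNE map of *features* (not samples), as in the abstract.
    emb = TSNE(n_components=2, perplexity=min(30.0, len(J) - 1.0),
               random_state=seed).fit_transform(J)
    # Sweep k and keep the clustering with the best silhouette score,
    # so the number of selected features is chosen automatically.
    best = max((KMeans(n_clusters=k, n_init=10, random_state=seed).fit(emb)
                for k in range(2, len(J))),
               key=lambda km: mean_simplified_silhouette(
                   emb, km.labels_, km.cluster_centers_))
    # Keep one representative feature per cluster: nearest to each centroid.
    nearest = pairwise_distances(best.cluster_centers_, emb).argmin(axis=1)
    return np.unique(nearest)
```

Given a feature matrix `X` and labels `y`, `gb_afs_sketch(X, y)` returns the indices of one representative feature per cluster; the silhouette sweep, not the user, fixes how many features survive, which matches the parameter-free behavior claimed above.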
Related papers
- FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts.
Existing methods select a subset of unlabeled examples for annotation.
We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
arXiv Detail & Related papers (2024-06-06T04:05:54Z)
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones.
Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance.
We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers.
We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems.
arXiv Detail & Related papers (2023-11-10T05:26:10Z)
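The input-gradient-based analogue of Lasso mentioned in the entry above can be read as penalizing, per input feature, how strongly the loss depends on that feature. The PyTorch sketch below implements that one plausible reading as a group-lasso penalty on input gradients; the function name and the weight `lam` are illustrative assumptions, not the benchmark paper's reference code.

```python
import torch


def input_gradient_lasso_loss(model, X, y, lam=1e-2):
    """Task loss plus a group-lasso penalty on input gradients:
    an L2 norm over the batch per feature, summed (L1) across features."""
    X = X.clone().requires_grad_(True)
    task_loss = torch.nn.functional.cross_entropy(model(X), y)
    # Gradient of the loss w.r.t. the inputs, kept in the graph so the
    # penalty itself is differentiable and trainable.
    (grads,) = torch.autograd.grad(task_loss, X, create_graph=True)
    penalty = grads.pow(2).sum(dim=0).sqrt().sum()
    return task_loss + lam * penalty
```

Training against this combined loss pushes the gradient group of an uninformative feature toward zero, mirroring how Lasso zeroes linear coefficients; features can then be ranked, and pruned, by their per-feature gradient norms.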
- Graph-based Extreme Feature Selection for Multi-class Classification Tasks [7.863638253070439]
This work focuses on a graph-based filter feature selection method suited for multi-class classification tasks.
We aim to drastically reduce the number of selected features in order to create a sketch of the original data that encodes valuable information for the classification task.
arXiv Detail & Related papers (2023-03-03T09:06:35Z)
- Model-free feature selection to facilitate automatic discovery of divergent subgroups in tabular data [4.551615447454768]
We propose a model-free and sparsity-based automatic feature selection (SAFS) framework to facilitate automatic discovery of divergent subgroups.
We validated SAFS across two publicly available datasets (MIMIC-III and Allstate Claims) and compared it with six existing feature selection methods.
arXiv Detail & Related papers (2022-03-08T20:42:56Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
The proposed algorithm is shown to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- On the utility of power spectral techniques with feature selection techniques for effective mental task classification in noninvasive BCI [19.19039983741124]
This paper proposes an approach to select relevant and non-redundant spectral features for mental task classification.
The findings demonstrate substantial improvements in the performance of the learning model for mental task classification.
arXiv Detail & Related papers (2021-11-16T00:27:53Z)
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
An optimal subset of features is selected in groups, and the number of selected features is determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
- Self-Supervised Tuning for Few-Shot Segmentation [82.32143982269892]
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.
Existing meta-learning methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from support images are marginalized in the embedding space.
This paper presents an adaptive tuning framework in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme.
arXiv Detail & Related papers (2020-04-12T03:53:53Z)
- Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.