Sparse Centroid-Encoder: A Nonlinear Model for Feature Selection
- URL: http://arxiv.org/abs/2201.12910v1
- Date: Sun, 30 Jan 2022 20:46:24 GMT
- Title: Sparse Centroid-Encoder: A Nonlinear Model for Feature Selection
- Authors: Tomojit Ghosh and Michael Kirby
- Abstract summary: We develop a sparse implementation of the centroid-encoder for nonlinear data reduction and visualization called Sparse Centroid-Encoder (SCE).
We also provide a feature selection framework that first ranks each feature by its occurrence and then chooses the optimal number of features using a validation set.
The algorithm is applied to a wide variety of data sets including, single-cell biological data, high dimensional infectious disease data, hyperspectral data, image data, and speech data.
- Score: 1.2487990897680423
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We develop a sparse optimization problem for the determination of the total
set of features that discriminate two or more classes. This is a sparse
implementation of the centroid-encoder for nonlinear data reduction and
visualization called Sparse Centroid-Encoder (SCE). We also provide a feature
selection framework that first ranks each feature by its occurrence and then
chooses the optimal number of features using a validation set. The algorithm is
applied to a wide variety of data sets, including single-cell biological data,
high-dimensional infectious disease data, hyperspectral data, image data, and
speech data. We compared our method to various state-of-the-art feature
selection techniques, including two neural network-based models (DFS and
LassoNet), Sparse SVM, and Random Forest. We empirically showed that SCE
features produced better classification accuracy on the unseen test data, often
with fewer features.
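As a concrete illustration of the pipeline the abstract describes, below is a minimal PyTorch sketch. It assumes sparsity is imposed by an elementwise input gate under an L1 penalty in front of a small centroid-mapping network; the paper's actual architecture and training schedule may differ, and all names (`SparseCentroidEncoder`, `train_sce`) are illustrative.

```python
import torch
import torch.nn as nn

class SparseCentroidEncoder(nn.Module):
    """Maps each sample toward its class centroid through an elementwise
    input gate; an L1 penalty on the gate drives irrelevant features to zero."""
    def __init__(self, d_in: int, d_hidden: int = 64):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(d_in))   # sparsity-promoting layer
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.Tanh(),
            nn.Linear(d_hidden, d_in),               # output lives in input space
        )

    def forward(self, x):
        return self.net(x * self.gate)

def class_centroids(X, y):
    """Training target for each sample: the mean of its class."""
    means = {int(c): X[y == c].mean(dim=0) for c in torch.unique(y)}
    return torch.stack([means[int(c)] for c in y])

def train_sce(model, X, y, lam=1e-3, epochs=200, lr=1e-3):
    """Minimize ||f(x) - centroid(x)||^2 + lam * ||gate||_1."""
    targets = class_centroids(X, y)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(X) - targets) ** 2).mean() + lam * model.gate.abs().sum()
        loss.backward()
        opt.step()
    # surviving gate weights mark candidate discriminative features
    return model.gate.detach().abs()
```

The occurrence-based ranking from the abstract would then repeat this training across several runs or data splits, count how often each feature's gate survives a magnitude threshold, and sweep the number of top-ranked features on a validation set to fix the final cutoff.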
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than a 50% reduction in memory and a 55% reduction in FLOPs.
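A minimal sketch of the kind of prune-and-regrow update behind DST algorithms such as SET, applied, for example, to an input layer whose surviving connections mark selected features; the exact prune and regrow criteria vary by algorithm, and this function is illustrative, not the paper's implementation.

```python
import torch

def dst_step(weight, mask, prune_frac=0.3):
    """One prune-and-regrow update: drop the smallest-magnitude active
    weights, then reactivate the same number of inactive positions at
    random (for simplicity, just-pruned entries may be regrown)."""
    w, m = weight.view(-1), mask.view(-1)
    n_prune = int(prune_frac * m.sum().item())
    if n_prune == 0:
        return mask
    # prune: smallest-magnitude weights among active positions
    scores = w.abs().masked_fill(m == 0, float("inf"))
    m[torch.topk(scores, n_prune, largest=False).indices] = 0.0
    # regrow: random inactive positions (gradient-based regrowth is a common alternative)
    idle = (m == 0).nonzero().squeeze(1)
    m[idle[torch.randperm(idle.numel())[:n_prune]]] = 1.0
    return mask
```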
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Compact NSGA-II for Multi-objective Feature Selection [0.24578723416255746]
We define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features.
To select optimal features, we propose a binary Compact NSGA-II (CNSGA-II) algorithm.
To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection.
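To make the bi-objective formulation concrete, here is a minimal sketch of the fitness evaluation and Pareto-dominance test that NSGA-II-style algorithms rank candidate feature masks by; the k-NN classifier and cross-validation setup are assumptions, not the paper's protocol.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def objectives(mask, X, y):
    """Evaluate one 0/1 numpy feature mask: (error, n_features), both minimized.
    The k-NN accuracy proxy is an illustrative choice."""
    if mask.sum() == 0:
        return 1.0, 0                                  # empty mask: worst error
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=3).mean()
    return 1.0 - acc, int(mask.sum())

def dominates(a, b):
    """Pareto dominance (minimization): a is nowhere worse and somewhere better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```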
arXiv Detail & Related papers (2024-02-20T01:10:12Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier training in an alternating optimization manner to shift the biased decision boundary effectively.
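A minimal sketch of such an alternating update, with a single supervised loss standing in for the paper's objective; its class-rebalancing terms and semi-supervised pseudo-labeling are omitted, and the function name is illustrative.

```python
import torch

def decoupled_epoch(backbone, classifier, loader, opt_b, opt_c, loss_fn):
    """Alternate the two updates so classifier bias correction can shift the
    decision boundary without destabilizing representation learning."""
    for x, y in loader:
        # step 1: update the backbone; the classifier receives gradients
        # here but is not stepped
        opt_b.zero_grad(); opt_c.zero_grad()
        loss_fn(classifier(backbone(x)), y).backward()
        opt_b.step()
        # step 2: update the classifier on frozen (detached) features
        opt_c.zero_grad()
        loss_fn(classifier(backbone(x).detach()), y).backward()
        opt_c.step()
```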
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Feature Selection using Sparse Adaptive Bottleneck Centroid-Encoder [1.2487990897680423]
We introduce a novel nonlinear model, Sparse Adaptive Bottleneck Centroid-Encoder (SABCE), for determining the features that discriminate between two or more classes.
The algorithm is applied to various real-world data sets, including high-dimensional biological, image, speech, and accelerometer sensor data.
arXiv Detail & Related papers (2023-06-07T21:37:21Z)
- Graph Convolutional Network-based Feature Selection for High-dimensional and Low-sample Size Data [4.266990593059533]
We present a deep learning-based method - GRAph Convolutional nEtwork feature Selector (GRACES) - to select important features for HDLSS data.
We provide empirical evidence that GRACES outperforms other feature selection methods on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-11-25T14:46:36Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm is more accurate and efficient than existing algorithms.
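A minimal sketch of the filter pattern this describes: score every feature once, then keep the best k. The score below (a feature's average variance within local neighborhoods) is an illustrative stand-in, not the paper's actual Compactness Score.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def filter_select(X, k_features, n_neighbors=10):
    """Rank features by an unsupervised per-feature score and keep the top k."""
    nbrs = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    _, idx = nbrs.kneighbors(X)                 # idx[:, 0] is the point itself
    scores = np.array([
        # small value: the feature varies little within local neighborhoods
        np.mean(np.var(X[:, j][idx[:, 1:]], axis=1))
        for j in range(X.shape[1])
    ])
    return np.argsort(scores)[:k_features]      # locally most "compact" features
```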
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- Optimal Data Selection: An Online Distributed View [61.31708750038692]
We develop algorithms for the online and distributed version of the problem.
In learning tasks on ImageNet and MNIST, we show that our selection methods outperform random selection by 5-20%.
arXiv Detail & Related papers (2022-01-25T18:56:16Z)
- Cervical Cytology Classification Using PCA & GWO Enhanced Deep Features Selection [1.990876596716716]
Cervical cancer is one of the deadliest and most common diseases among women worldwide.
We propose a fully automated framework that utilizes Deep Learning and feature selection.
The framework is evaluated on three publicly available benchmark datasets.
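A minimal sketch of the deep-feature and PCA stages of such a pipeline; the Grey Wolf Optimizer (GWO) selection step is omitted, and the ResNet-18 backbone is an illustrative choice rather than the paper's model.

```python
import torch
import torchvision.models as models
from sklearn.decomposition import PCA

def deep_pca_features(images, n_components=64):
    """images: float tensor (N, 3, 224, 224), ImageNet-normalized;
    N must be at least n_components for PCA to fit."""
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()           # keep the 512-d pooled features
    backbone.eval()
    with torch.no_grad():
        feats = backbone(images).numpy()        # (N, 512) deep features
    return PCA(n_components=n_components).fit_transform(feats)
```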
arXiv Detail & Related papers (2021-06-09T08:57:22Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
A subset of optimal features will be selected in group, and the number of selected features will be determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
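To indicate what "joint adaptive graph and structured sparsity regularization" typically looks like, here is a generic objective from this model family (not necessarily JASFS's exact formulation):

\[
\min_{W,\,S}\ \operatorname{tr}\!\big((XW)^{\top} L_S\,(XW)\big)
  + \alpha \lVert W \rVert_{2,1}
  + \beta \lVert S \rVert_F^2
\quad \text{s.t.}\ \ S\mathbf{1} = \mathbf{1},\ S \ge 0,
\]

where $X \in \mathbb{R}^{n \times d}$ holds the samples, $S$ is the adaptively learned sample-similarity graph with Laplacian $L_S$, and the row-wise $\ell_{2,1}$ penalty zeroes out entire rows of the projection $W$; features are thus discarded in groups, and the number of nonzero rows fixes the number of selected features automatically.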
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.