Sparse Centroid-Encoder: A Nonlinear Model for Feature Selection
- URL: http://arxiv.org/abs/2201.12910v1
- Date: Sun, 30 Jan 2022 20:46:24 GMT
- Title: Sparse Centroid-Encoder: A Nonlinear Model for Feature Selection
- Authors: Tomojit Ghosh and Michael Kirby
- Abstract summary: We develop a sparse implementation of the centroid-encoder for nonlinear data reduction and visualization called Sparse Centroid-Encoder (SCE).
We also provide a feature selection framework that first ranks each feature by its occurrence and then chooses the optimal number of features using a validation set.
The algorithm is applied to a wide variety of data sets including, single-cell biological data, high dimensional infectious disease data, hyperspectral data, image data, and speech data.
- Score: 1.2487990897680423
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We develop a sparse optimization problem for the determination of the total
set of features that discriminate two or more classes. This is a sparse
implementation of the centroid-encoder for nonlinear data reduction and
visualization called Sparse Centroid-Encoder (SCE). We also provide a feature
selection framework that first ranks each feature by its occurrence and then
chooses the optimal number of features using a validation set. The algorithm is
applied to a wide variety of data sets, including single-cell biological data,
high-dimensional infectious disease data, hyperspectral data, image data, and
speech data. We compared our method to various state-of-the-art feature
selection techniques, including two neural network-based models (DFS and
LassoNet), Sparse SVM, and Random Forest. We empirically showed that SCE
features produced better classification accuracy on the unseen test data, often
with fewer features.
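As a concrete illustration of the pipeline the abstract describes, below is a minimal PyTorch sketch. It assumes sparsity is imposed by an elementwise input gate under an L1 penalty in front of a small centroid-mapping network; the paper's actual architecture and training schedule may differ, and all names (`SparseCentroidEncoder`, `train_sce`) are illustrative.

```python
import torch
import torch.nn as nn

class SparseCentroidEncoder(nn.Module):
    """Maps each sample toward its class centroid through an elementwise
    input gate; an L1 penalty on the gate drives irrelevant features to zero."""
    def __init__(self, d_in: int, d_hidden: int = 64):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(d_in))   # sparsity-promoting layer
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.Tanh(),
            nn.Linear(d_hidden, d_in),               # output lives in input space
        )

    def forward(self, x):
        return self.net(x * self.gate)

def class_centroids(X, y):
    """Training target for each sample: the mean of its class."""
    means = {int(c): X[y == c].mean(dim=0) for c in torch.unique(y)}
    return torch.stack([means[int(c)] for c in y])

def train_sce(model, X, y, lam=1e-3, epochs=200, lr=1e-3):
    """Minimize ||f(x) - centroid(x)||^2 + lam * ||gate||_1."""
    targets = class_centroids(X, y)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(X) - targets) ** 2).mean() + lam * model.gate.abs().sum()
        loss.backward()
        opt.step()
    # surviving gate weights mark candidate discriminative features
    return model.gate.detach().abs()
```

The occurrence-based ranking from the abstract would then repeat this training across several runs or data splits, count how often each feature's gate survives a magnitude threshold, and sweep the number of top-ranked features on a validation set to fix the final cutoff.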
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than a 50% reduction in memory and a 55% reduction in FLOPs.
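A minimal sketch of the kind of prune-and-regrow update behind DST algorithms such as SET, applied, for example, to an input layer whose surviving connections mark selected features; the exact prune and regrow criteria vary by algorithm, and this function is illustrative, not the paper's implementation.

```python
import torch

def dst_step(weight, mask, prune_frac=0.3):
    """One prune-and-regrow update: drop the smallest-magnitude active
    weights, then reactivate the same number of inactive positions at
    random (for simplicity, just-pruned entries may be regrown)."""
    w, m = weight.view(-1), mask.view(-1)
    n_prune = int(prune_frac * m.sum().item())
    if n_prune == 0:
        return mask
    # prune: smallest-magnitude weights among active positions
    scores = w.abs().masked_fill(m == 0, float("inf"))
    m[torch.topk(scores, n_prune, largest=False).indices] = 0.0
    # regrow: random inactive positions (gradient-based regrowth is a common alternative)
    idle = (m == 0).nonzero().squeeze(1)
    m[idle[torch.randperm(idle.numel())[:n_prune]]] = 1.0
    return mask
```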
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Compact NSGA-II for Multi-objective Feature Selection [0.24578723416255746]
We define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features.
To select optimal features, we propose a binary Compact NSGA-II (CNSGA-II) algorithm.
To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection.
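To make the bi-objective formulation concrete, here is a minimal sketch of the fitness evaluation and Pareto-dominance test that NSGA-II-style algorithms rank candidate feature masks by; the k-NN classifier and cross-validation setup are assumptions, not the paper's protocol.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def objectives(mask, X, y):
    """Evaluate one 0/1 numpy feature mask: (error, n_features), both minimized.
    The k-NN accuracy proxy is an illustrative choice."""
    if mask.sum() == 0:
        return 1.0, 0                                  # empty mask: worst error
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=3).mean()
    return 1.0 - acc, int(mask.sum())

def dominates(a, b):
    """Pareto dominance (minimization): a is nowhere worse and somewhere better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```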
arXiv Detail & Related papers (2024-02-20T01:10:12Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier training in an alternating optimization manner to shift the biased decision boundary effectively.
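A minimal sketch of such an alternating update, with a single supervised loss standing in for the paper's objective; its class-rebalancing terms and semi-supervised pseudo-labeling are omitted, and the function name is illustrative.

```python
import torch

def decoupled_epoch(backbone, classifier, loader, opt_b, opt_c, loss_fn):
    """Alternate the two updates so classifier bias correction can shift the
    decision boundary without destabilizing representation learning."""
    for x, y in loader:
        # step 1: update the backbone; the classifier receives gradients
        # here but is not stepped
        opt_b.zero_grad(); opt_c.zero_grad()
        loss_fn(classifier(backbone(x)), y).backward()
        opt_b.step()
        # step 2: update the classifier on frozen (detached) features
        opt_c.zero_grad()
        loss_fn(classifier(backbone(x).detach()), y).backward()
        opt_c.step()
```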
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Feature Selection using Sparse Adaptive Bottleneck Centroid-Encoder [1.2487990897680423]
We introduce a novel nonlinear model, Sparse Adaptive Bottleneck Centroid-Encoder (SABCE), for determining the features that discriminate between two or more classes.
The algorithm is applied to various real-world data sets, including high-dimensional biological, image, speech, and accelerometer sensor data.
arXiv Detail & Related papers (2023-06-07T21:37:21Z)
- Graph Convolutional Network-based Feature Selection for High-dimensional and Low-sample Size Data [4.266990593059533]
We present a deep learning-based method - GRAph Convolutional nEtwork feature Selector (GRACES) - to select important features for HDLSS data.
We provide empirical evidence that GRACES outperforms other feature selection methods on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-11-25T14:46:36Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm is more accurate and efficient than existing algorithms.
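A minimal sketch of the filter pattern this describes: score every feature once, then keep the best k. The score below (a feature's average variance within local neighborhoods) is an illustrative stand-in, not the paper's actual Compactness Score.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def filter_select(X, k_features, n_neighbors=10):
    """Rank features by an unsupervised per-feature score and keep the top k."""
    nbrs = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    _, idx = nbrs.kneighbors(X)                 # idx[:, 0] is the point itself
    scores = np.array([
        # small value: the feature varies little within local neighborhoods
        np.mean(np.var(X[:, j][idx[:, 1:]], axis=1))
        for j in range(X.shape[1])
    ])
    return np.argsort(scores)[:k_features]      # locally most "compact" features
```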
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- Optimal Data Selection: An Online Distributed View [61.31708750038692]
We develop algorithms for the online and distributed version of the problem.
In learning tasks on ImageNet and MNIST, we show that our selection methods outperform random selection by 5-20%.
arXiv Detail & Related papers (2022-01-25T18:56:16Z)
- Cervical Cytology Classification Using PCA & GWO Enhanced Deep Features Selection [1.990876596716716]
Cervical cancer is one of the deadliest and most common diseases among women worldwide.
We propose a fully automated framework that utilizes Deep Learning and feature selection.
The framework is evaluated on three publicly available benchmark datasets.
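A minimal sketch of the deep-feature and PCA stages of such a pipeline; the Grey Wolf Optimizer (GWO) selection step is omitted, and the ResNet-18 backbone is an illustrative choice rather than the paper's model.

```python
import torch
import torchvision.models as models
from sklearn.decomposition import PCA

def deep_pca_features(images, n_components=64):
    """images: float tensor (N, 3, 224, 224), ImageNet-normalized;
    N must be at least n_components for PCA to fit."""
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()           # keep the 512-d pooled features
    backbone.eval()
    with torch.no_grad():
        feats = backbone(images).numpy()        # (N, 512) deep features
    return PCA(n_components=n_components).fit_transform(feats)
```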
arXiv Detail & Related papers (2021-06-09T08:57:22Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
A subset of optimal features will be selected in group, and the number of selected features will be determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
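To indicate what "joint adaptive graph and structured sparsity regularization" typically looks like, here is a generic objective from this model family (not necessarily JASFS's exact formulation):

\[
\min_{W,\,S}\ \operatorname{tr}\!\big((XW)^{\top} L_S\,(XW)\big)
  + \alpha \lVert W \rVert_{2,1}
  + \beta \lVert S \rVert_F^2
\quad \text{s.t.}\ \ S\mathbf{1} = \mathbf{1},\ S \ge 0,
\]

where $X \in \mathbb{R}^{n \times d}$ holds the samples, $S$ is the adaptively learned sample-similarity graph with Laplacian $L_S$, and the row-wise $\ell_{2,1}$ penalty zeroes out entire rows of the projection $W$; features are thus discarded in groups, and the number of nonzero rows fixes the number of selected features automatically.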
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.