Sparse Linear Centroid-Encoder: A Convex Method for Feature Selection
- URL: http://arxiv.org/abs/2306.04824v2
- Date: Fri, 9 Jun 2023 04:06:36 GMT
- Title: Sparse Linear Centroid-Encoder: A Convex Method for Feature Selection
- Authors: Tomojit Ghosh, Michael Kirby, Karim Karimov
- Abstract summary: We present Sparse Linear Centroid-Encoder (SLCE), a novel feature selection technique.
The algorithm uses a linear transformation to reconstruct a point as its class centroid and, at the same time, filters out unnecessary features with an $\ell_1$-norm penalty.
- Score: 1.057079240576682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel feature selection technique, Sparse Linear
Centroid-Encoder (SLCE). The algorithm uses a linear transformation to
reconstruct a point as its class centroid and, at the same time, uses the
$\ell_1$-norm penalty to filter out unnecessary features from the input data.
The original formulation of the optimization problem is nonconvex, but we
propose a two-step approach, where each step is convex. In the first step, we
solve the linear Centroid-Encoder, a convex optimization problem over a matrix
$A$. In the second step, we only search for a sparse solution over a diagonal
matrix $B$ while keeping $A$ fixed. Unlike other linear methods, e.g., Sparse
Support Vector Machines and Lasso, Sparse Linear Centroid-Encoder uses a single
model for multi-class data. We present an in-depth empirical analysis of the
proposed model and show that it promotes sparsity on various data sets,
including high-dimensional biological data. Our experimental results show that
SLCE has a performance advantage over some state-of-the-art neural
network-based feature selection techniques.
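The two-step approach in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the idea (least-squares centroid-encoder, then proximal gradient over the diagonal of $B$), not the authors' implementation; the learning rate, iteration count, and soft-thresholding solver are my own assumptions.

```python
import numpy as np

def slce_two_step(X, y, lam=0.05, n_iter=300, lr=1e-3):
    """Two-step SLCE sketch. X: (n, d) data, y: (n,) integer class labels.
    Returns the diagonal of B (per-feature weights) and the linear map A."""
    # Target matrix C: row i is the centroid of x_i's class.
    centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    C = np.vstack([centroids[c] for c in y])

    # Step 1 (convex): linear centroid-encoder, A = argmin ||X A - C||_F^2.
    A, *_ = np.linalg.lstsq(X, C, rcond=None)

    # Step 2 (convex): keep A fixed and search over b = diag(B), minimising
    # ||(X * b) A - C||_F^2 + lam * ||b||_1 by proximal gradient descent.
    b = np.ones(X.shape[1])
    for _ in range(n_iter):
        M = (X * b) @ A - C                            # residual
        grad = 2 * np.einsum('nj,nk,jk->j', X, M, A)   # d/db of the smooth part
        z = b - lr * grad                              # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lr * lam, 0.0)  # soft-threshold
    return b, A
```

Features whose entry in `b` is driven to zero by the $\ell_1$ penalty are discarded; the surviving entries give a feature ranking.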
Related papers
- Feature Selection using Sparse Adaptive Bottleneck Centroid-Encoder [1.2487990897680423]
We introduce a novel nonlinear model, Sparse Adaptive Bottleneck Centroid-Encoder (SABCE), for determining the features that discriminate between two or more classes.
The algorithm is applied to various real-world data sets, including high-dimensional biological, image, speech, and accelerometer sensor data.
arXiv Detail & Related papers (2023-06-07T21:37:21Z) - Leverage Score Sampling for Tensor Product Matrices in Input Sparsity
Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be simultaneously applied to a dataset $X$ in total time.
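The $q$-fold column-wise tensor product referenced above is the Khatri–Rao product: column $j$ of the result is the Kronecker product of column $j$ of every input matrix. A minimal NumPy sketch (function name and shapes are my own, not the paper's):

```python
import numpy as np

def columnwise_tensor(mats):
    """q-fold column-wise tensor (Khatri-Rao) product of matrices sharing a
    column count: column j of the output is kron of every input's column j."""
    out = mats[0]
    for M in mats[1:]:
        # out[i, j, r] = out[i, r] * M[j, r], then flatten the row axes.
        out = np.einsum('ir,jr->ijr', out, M).reshape(-1, M.shape[1])
    return out
```

For inputs of heights $m_1,\dots,m_q$ the result has $\prod_i m_i$ rows, which is why input-sparsity-time sampling of the associated Gram matrix is nontrivial.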
arXiv Detail & Related papers (2022-02-09T15:26:03Z) - Sparse Centroid-Encoder: A Nonlinear Model for Feature Selection [1.2487990897680423]
We develop a sparse implementation of the centroid-encoder for nonlinear data reduction and visualization called Sparse Centroid-Encoder.
We also provide a feature selection framework that first ranks each feature by its occurrence, and the optimal number of features is chosen using a validation set.
The algorithm is applied to a wide variety of data sets including, single-cell biological data, high dimensional infectious disease data, hyperspectral data, image data, and speech data.
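The occurrence-based ranking step described above can be sketched as follows; the survival tolerance and the validation hook are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def rank_by_occurrence(weight_runs, tol=1e-6):
    """weight_runs: list of (d,) sparse weight vectors from repeated fits.
    A feature's score is the number of runs in which it survives (|w| > tol)."""
    occurrence = sum((np.abs(w) > tol).astype(int) for w in weight_runs)
    return np.argsort(-occurrence, kind='stable'), occurrence

def choose_num_features(ranking, validate, k_grid):
    """Pick the feature count maximising a user-supplied validation score,
    evaluated on the top-k ranked features."""
    return max(k_grid, key=lambda k: validate(ranking[:k]))
```

Ranking by occurrence across runs makes the selection robust to the randomness of any single fit, and the validation sweep replaces a hand-tuned sparsity threshold.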
arXiv Detail & Related papers (2022-01-30T20:46:24Z) - Nearly Optimal Linear Convergence of Stochastic Primal-Dual Methods for
Linear Programming [5.126924253766052]
We show that the proposed method exhibits a linear convergence rate for solving sharp instances with a high probability.
We also propose an efficient coordinate-based oracle for unconstrained bilinear problems.
arXiv Detail & Related papers (2021-11-10T04:56:38Z) - Sparse Quadratic Optimisation over the Stiefel Manifold with Application
to Permutation Synchronisation [71.27989298860481]
We address the non-convex optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
arXiv Detail & Related papers (2021-09-30T19:17:35Z) - Hybrid Trilinear and Bilinear Programming for Aligning Partially
Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms that can align partially overlapping point sets and are invariant to the corresponding transformations, as in the robust point matching (RPM) algorithm.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomial transformations to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
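The penalised reconstruction objective can be written down in a few lines. The exact loss form below is an assumption based on standard sparse-PCA formulations (reconstruction error plus a row-sparsity penalty), not code from the paper:

```python
import numpy as np

def l2p_norm_p(W, p=0.5):
    """||W||_{2,p}^p: sum of p-th powers of the row-wise l2 norms.
    Rows of W correspond to features; a zero row drops that feature."""
    return float((np.linalg.norm(W, axis=1) ** p).sum())

def objective(X, W, lam=0.1, p=0.5):
    """Reconstruction error ||X - X W W^T||_F^2 plus the l_{2,p} penalty."""
    recon = X - X @ W @ W.T
    return float((recon ** 2).sum()) + lam * l2p_norm_p(W, p)
```

With $0 < p < 1$ the penalty pushes entire rows of $W$ to zero, so unselected features contribute nothing to the projection.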
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - On the Adversarial Robustness of LASSO Based Feature Selection [72.54211869067979]
In the considered model, there is a malicious adversary who can observe the whole dataset, and then will carefully modify the response values or the feature matrix.
We formulate the modification strategy of the adversary as a bi-level optimization problem.
Numerical examples with synthetic and real data illustrate that our method is efficient and effective.
arXiv Detail & Related papers (2020-10-20T05:51:26Z) - Robust Multi-class Feature Selection via $l_{2,0}$-Norm Regularization
Minimization [6.41804410246642]
Feature selection is an important computational process in data mining and machine learning.
In this paper, a novel method based on homotopy iterative hard threshold (HIHT) is proposed to solve the least square problem for multi-class feature selection.
arXiv Detail & Related papers (2020-10-08T02:06:06Z) - Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_{t+1}=\sigma(\Theta^\star x_t)+\varepsilon_t$.
We give an algorithm that recovers the weight matrix $\Theta^\star$ from a single trajectory with optimal sample complexity and linear running time.
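A toy sketch of the system class in this summary: simulate $x_{t+1}=\sigma(\Theta^\star x_t)+\varepsilon_t$ with $\sigma=\mathrm{ReLU}$, then fit $\Theta$ by gradient descent on the one-step squared loss. This is an illustrative estimator with an assumed least-squares warm start, not the paper's algorithm.

```python
import numpy as np

def simulate(theta_star, T, noise=0.01, seed=0):
    """Roll out x_{t+1} = relu(theta_star @ x_t) + eps_t for T steps."""
    rng = np.random.default_rng(seed)
    xs = [rng.normal(size=theta_star.shape[0])]
    for _ in range(T):
        xs.append(np.maximum(theta_star @ xs[-1], 0.0)
                  + noise * rng.normal(size=theta_star.shape[0]))
    return np.array(xs)

def fit_theta(xs, n_iter=500, lr=0.05):
    """Gradient descent on mean_t ||relu(theta @ x_t) - x_{t+1}||^2,
    initialised at the ordinary least-squares (linear) fit."""
    X, Y = xs[:-1], xs[1:]
    theta = np.linalg.lstsq(X, Y, rcond=None)[0].T   # linear warm start
    for _ in range(n_iter):
        pred = np.maximum(X @ theta.T, 0.0)
        # ReLU subgradient: rows only contribute where the unit is active.
        grad = 2 * ((pred - Y) * (pred > 0)).T @ X / len(X)
        theta -= lr * grad
    return theta
```

The single-trajectory setting is what makes the problem hard: the $x_t$ are dependent samples, whereas this sketch treats them as a plain regression dataset.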
arXiv Detail & Related papers (2020-04-30T10:42:48Z) - Efficient Algorithms for Multidimensional Segmented Regression [42.046881924063044]
We study the fundamental problem of fixed-design multidimensional segmented regression.
We provide the first sample and computationally efficient algorithm for this problem in any fixed dimension.
Our algorithm relies on a simple merging iterative approach, which is novel in the multidimensional setting.
arXiv Detail & Related papers (2020-03-24T19:39:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.