Neural Greedy Pursuit for Feature Selection
- URL: http://arxiv.org/abs/2207.09390v1
- Date: Tue, 19 Jul 2022 16:39:16 GMT
- Title: Neural Greedy Pursuit for Feature Selection
- Authors: Sandipan Das, Alireza M. Javid, Prakash Borpatra Gohain, Yonina C. Eldar, Saikat Chatterjee
- Abstract summary: We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem.
We use neural networks as predictors in the algorithm to compute the loss.
- Score: 72.4121881681861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a greedy algorithm to select $N$ important features among $P$
input features for a non-linear prediction problem. The features are selected
one by one sequentially, in an iterative loss minimization procedure. We use
neural networks as predictors in the algorithm to compute the loss and hence,
we refer to our method as neural greedy pursuit (NGP). NGP is efficient in
selecting $N$ features when $N \ll P$, and it provides a notion of feature
importance in a descending order following the sequential selection procedure.
We experimentally show that NGP provides better performance than several
feature selection methods such as DeepLIFT and Drop-one-out loss. In addition,
we experimentally show a phase transition behavior in which perfect selection
of all $N$ features without false positives is possible when the training data
size exceeds a threshold.
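To make the sequential selection procedure concrete, here is a minimal sketch of greedy, loss-driven feature selection with a neural-network predictor, in the spirit of the abstract's description; the scikit-learn MLP, hidden-layer size, validation split, and mean-squared-error loss are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

def neural_greedy_selection(X, y, n_select):
    """Select n_select feature indices one at a time by loss minimization."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        best_j, best_loss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            # Train a small NN predictor on the candidate feature subset.
            net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
            net.fit(X_tr[:, cols], y_tr)
            loss = mean_squared_error(y_va, net.predict(X_va[:, cols]))
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)       # features accumulate in order of importance
        remaining.remove(best_j)
    return selected

# Toy usage: recover 3 informative features among P = 20 (synthetic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = np.sin(X[:, 2]) + X[:, 7] ** 2 + 0.5 * X[:, 11] + 0.1 * rng.normal(size=400)
print(neural_greedy_selection(X, y, n_select=3))
```

Each outer iteration adds the single feature whose inclusion most reduces the validation loss, so the returned list is ordered by decreasing importance, matching the notion of sequential feature importance described above.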
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than $50\%$ memory and $55\%$ FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
- Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks [17.12834153477201]
We propose a novel resource-efficient supervised feature selection method using sparse neural networks.
By gradually pruning the uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS derives an informative subset of features efficiently.
NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models.
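As a rough illustration of the pruning idea in this entry, the sketch below repeatedly trains a predictor and drops the input features with the smallest first-layer weight magnitudes; the dense scikit-learn MLP, the magnitude-based score, and the halving schedule are stand-in assumptions, not NeuroFS's neuron-evolution procedure.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def prune_input_features(X, y, n_keep, n_rounds=5):
    """Gradually discard the least informative input features over several rounds."""
    active = list(range(X.shape[1]))
    for _ in range(n_rounds):
        if len(active) <= n_keep:
            break
        net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
        net.fit(X[:, active], y)
        # Score each active feature by the l1 norm of its first-layer weights.
        scores = np.abs(net.coefs_[0]).sum(axis=1)
        n_next = max(n_keep, len(active) // 2)   # halve the set until n_keep remain
        keep = np.argsort(scores)[-n_next:]
        active = sorted(active[i] for i in keep)
    return active
```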
arXiv Detail & Related papers (2023-03-10T17:09:55Z)
- Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z)
- Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
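The "train an NN, then maximize it" step mentioned above can be sketched as follows; the surrogate architecture, candidate grid, and greedy argmax over the grid are illustrative assumptions, and the paper's Thompson-sampling and batching machinery is omitted.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def propose_next_query(X_obs, y_obs, candidates):
    """Fit a surrogate NN to past observations and pick the maximizing candidate."""
    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
    net.fit(X_obs, y_obs)
    return candidates[np.argmax(net.predict(candidates))]

# Usage: black-box maximization of an unknown f over a 1-D candidate grid.
rng = np.random.default_rng(0)
f = lambda x: -(x - 0.3) ** 2
X_obs = rng.uniform(0, 1, size=(10, 1))
y_obs = f(X_obs).ravel()
grid = np.linspace(0, 1, 201).reshape(-1, 1)
print(propose_next_query(X_obs, y_obs, grid))
```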
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
- Sequential Attention for Feature Selection [12.89764845700709]
We propose a feature selection algorithm called Sequential Attention that achieves state-of-the-art empirical results for neural networks.
We give theoretical insights into our algorithm for linear regression by showing that an adaptation to this setting is equivalent to the classical Orthogonal Matching Pursuit (OMP) algorithm.
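For reference, this is the classical Orthogonal Matching Pursuit routine for linear regression that the entry relates Sequential Attention to; it is the textbook algorithm, not the attention-based method itself.

```python
import numpy as np

def omp(X, y, n_select):
    """Greedily pick columns of X most correlated with the current residual."""
    selected, residual = [], y.copy()
    for _ in range(n_select):
        corr = np.abs(X.T @ residual)
        corr[selected] = -np.inf              # do not re-pick chosen columns
        j = int(np.argmax(corr))
        selected.append(j)
        # Refit least squares on the selected columns and update the residual.
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ beta
    return selected
```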
arXiv Detail & Related papers (2022-09-29T15:49:06Z)
- Lockout: Sparse Regularization of Neural Networks [0.0]
Regularization is applied to improve accuracy by placing a constraint $P(w) \leq t$ on the values of the parameters $w$.
We present a fast algorithm that provides all such solutions for any differentiable function $f$ and loss $L$, and any constraint $P$ that is an increasing monotone function of the absolute value of each parameter.
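As a worked instance of the constraint $P(w) \leq t$ above, the sketch below takes $P$ to be the $\ell_1$ norm and solves a constrained least-squares problem by projected gradient descent, sweeping a few values of $t$ to trace a path of solutions; this is a generic illustration, not the Lockout algorithm.

```python
import numpy as np

def project_l1(w, t):
    """Euclidean projection onto the l1 ball {w : ||w||_1 <= t}."""
    if np.abs(w).sum() <= t:
        return w
    u = np.sort(np.abs(w))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(w) + 1) > css - t)[0][-1]
    theta = (css[rho] - t) / (rho + 1.0)
    return np.sign(w) * np.maximum(np.abs(w) - theta, 0.0)

def constrained_least_squares(X, y, t, lr=1e-3, n_steps=2000):
    """Projected gradient descent for ||y - Xw||^2 subject to ||w||_1 <= t."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad = 2.0 * X.T @ (X @ w - y)
        w = project_l1(w - lr * grad, t)
    return w

# Sweep the constraint level t to obtain a path of increasingly dense solutions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ np.array([3.0, 0, 0, -2.0, 0, 0, 0, 1.0, 0, 0]) + 0.1 * rng.normal(size=200)
for t in (0.5, 2.0, 6.0):
    print(t, np.round(constrained_least_squares(X, y, t), 2))
```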
arXiv Detail & Related papers (2021-07-15T07:17:20Z)
- Nearly Optimal Regret for Learning Adversarial MDPs with Linear Function Approximation [92.3161051419884]
We study the reinforcement learning for finite-horizon episodic Markov decision processes with adversarial reward and full information feedback.
We show that it can achieve $\tilde{O}(dH\sqrt{T})$ regret, where $H$ is the length of the episode.
We also prove a matching lower bound of $\tilde{\Omega}(dH\sqrt{T})$ up to logarithmic factors.
arXiv Detail & Related papers (2021-02-17T18:54:08Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
- Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined a priori to training using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
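A minimal sketch of the gradient-only line search idea from the last entry: step along the descent direction until the directional derivative changes sign from negative to non-negative, and use that step size as the learning rate. The doubling schedule and tolerances below are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def gradient_only_step(grad_fn, w, direction, alpha0=1e-3, grow=2.0, max_iter=30):
    """Return a step size where the directional derivative becomes non-negative."""
    alpha = alpha0
    for _ in range(max_iter):
        dd = np.dot(grad_fn(w + alpha * direction), direction)
        if dd >= 0:                    # sign change located: accept this step size
            return alpha
        alpha *= grow                  # still descending: enlarge the step
    return alpha

# Usage on a quadratic, where the directional derivative vanishes at alpha = 0.5.
grad = lambda w: 2.0 * w               # gradient of f(w) = ||w||^2
w0 = np.array([3.0, -2.0])
d = -grad(w0)
print(gradient_only_step(grad, w0, d))
```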
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.