Neural Greedy Pursuit for Feature Selection
- URL: http://arxiv.org/abs/2207.09390v1
- Date: Tue, 19 Jul 2022 16:39:16 GMT
- Title: Neural Greedy Pursuit for Feature Selection
- Authors: Sandipan Das, Alireza M. Javid, Prakash Borpatra Gohain, Yonina C. Eldar, Saikat Chatterjee
- Abstract summary: We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem.
We use neural networks as predictors in the algorithm to compute the loss.
- Score: 72.4121881681861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a greedy algorithm to select $N$ important features among $P$
input features for a non-linear prediction problem. The features are selected
one by one sequentially, in an iterative loss minimization procedure. We use
neural networks as predictors in the algorithm to compute the loss and hence,
we refer to our method as neural greedy pursuit (NGP). NGP is efficient in
selecting $N$ features when $N \ll P$, and it provides a notion of feature
importance in a descending order following the sequential selection procedure.
We experimentally show that NGP provides better performance than several
feature selection methods such as DeepLIFT and Drop-one-out loss. In addition,
we experimentally show a phase transition behavior in which perfect selection
of all $N$ features without false positives is possible when the training data
size exceeds a threshold.
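To make the sequential selection procedure concrete, here is a minimal sketch of greedy, loss-driven feature selection with a neural-network predictor, in the spirit of the abstract's description; the scikit-learn MLP, hidden-layer size, validation split, and mean-squared-error loss are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

def neural_greedy_selection(X, y, n_select):
    """Select n_select feature indices one at a time by loss minimization."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        best_j, best_loss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            # Train a small NN predictor on the candidate feature subset.
            net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
            net.fit(X_tr[:, cols], y_tr)
            loss = mean_squared_error(y_va, net.predict(X_va[:, cols]))
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)       # features accumulate in order of importance
        remaining.remove(best_j)
    return selected

# Toy usage: recover 3 informative features among P = 20 (synthetic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = np.sin(X[:, 2]) + X[:, 7] ** 2 + 0.5 * X[:, 11] + 0.1 * rng.normal(size=400)
print(neural_greedy_selection(X, y, n_select=3))
```

Each outer iteration adds the single feature whose inclusion most reduces the validation loss, so the returned list is ordered by decreasing importance, matching the notion of sequential feature importance described above.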
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than $50\%$ memory and $55\%$ FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
- Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks [17.12834153477201]
We propose a novel resource-efficient supervised feature selection method using sparse neural networks.
By gradually pruning the uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS derives an informative subset of features efficiently.
NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models.
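As a rough illustration of the pruning idea in this entry, the sketch below repeatedly trains a predictor and drops the input features with the smallest first-layer weight magnitudes; the dense scikit-learn MLP, the magnitude-based score, and the halving schedule are stand-in assumptions, not NeuroFS's neuron-evolution procedure.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def prune_input_features(X, y, n_keep, n_rounds=5):
    """Gradually discard the least informative input features over several rounds."""
    active = list(range(X.shape[1]))
    for _ in range(n_rounds):
        if len(active) <= n_keep:
            break
        net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
        net.fit(X[:, active], y)
        # Score each active feature by the l1 norm of its first-layer weights.
        scores = np.abs(net.coefs_[0]).sum(axis=1)
        n_next = max(n_keep, len(active) // 2)   # halve the set until n_keep remain
        keep = np.argsort(scores)[-n_next:]
        active = sorted(active[i] for i in keep)
    return active
```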
arXiv Detail & Related papers (2023-03-10T17:09:55Z)
- Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z)
- Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
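The "train an NN, then maximize it" step mentioned above can be sketched as follows; the surrogate architecture, candidate grid, and greedy argmax over the grid are illustrative assumptions, and the paper's Thompson-sampling and batching machinery is omitted.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def propose_next_query(X_obs, y_obs, candidates):
    """Fit a surrogate NN to past observations and pick the maximizing candidate."""
    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
    net.fit(X_obs, y_obs)
    return candidates[np.argmax(net.predict(candidates))]

# Usage: black-box maximization of an unknown f over a 1-D candidate grid.
rng = np.random.default_rng(0)
f = lambda x: -(x - 0.3) ** 2
X_obs = rng.uniform(0, 1, size=(10, 1))
y_obs = f(X_obs).ravel()
grid = np.linspace(0, 1, 201).reshape(-1, 1)
print(propose_next_query(X_obs, y_obs, grid))
```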
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
- Sequential Attention for Feature Selection [12.89764845700709]
We propose a feature selection algorithm called Sequential Attention that achieves state-of-the-art empirical results for neural networks.
We give theoretical insights into our algorithm for linear regression by showing that an adaptation to this setting is equivalent to the classical Orthogonal Matching Pursuit (OMP) algorithm.
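For reference, this is the classical Orthogonal Matching Pursuit routine for linear regression that the entry relates Sequential Attention to; it is the textbook algorithm, not the attention-based method itself.

```python
import numpy as np

def omp(X, y, n_select):
    """Greedily pick columns of X most correlated with the current residual."""
    selected, residual = [], y.copy()
    for _ in range(n_select):
        corr = np.abs(X.T @ residual)
        corr[selected] = -np.inf              # do not re-pick chosen columns
        j = int(np.argmax(corr))
        selected.append(j)
        # Refit least squares on the selected columns and update the residual.
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ beta
    return selected
```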
arXiv Detail & Related papers (2022-09-29T15:49:06Z)
- Lockout: Sparse Regularization of Neural Networks [0.0]
Regularization is applied to improve accuracy by placing a constraint $P(w) \leq t$ on the values of the parameters $w$.
We present a fast algorithm that provides all such solutions for any differentiable function $f$ and loss $L$, and any constraint $P$ that is an increasing monotone function of the absolute value of each parameter.
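As a worked instance of the constraint $P(w) \leq t$ above, the sketch below takes $P$ to be the $\ell_1$ norm and solves a constrained least-squares problem by projected gradient descent, sweeping a few values of $t$ to trace a path of solutions; this is a generic illustration, not the Lockout algorithm.

```python
import numpy as np

def project_l1(w, t):
    """Euclidean projection onto the l1 ball {w : ||w||_1 <= t}."""
    if np.abs(w).sum() <= t:
        return w
    u = np.sort(np.abs(w))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(w) + 1) > css - t)[0][-1]
    theta = (css[rho] - t) / (rho + 1.0)
    return np.sign(w) * np.maximum(np.abs(w) - theta, 0.0)

def constrained_least_squares(X, y, t, lr=1e-3, n_steps=2000):
    """Projected gradient descent for ||y - Xw||^2 subject to ||w||_1 <= t."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad = 2.0 * X.T @ (X @ w - y)
        w = project_l1(w - lr * grad, t)
    return w

# Sweep the constraint level t to obtain a path of increasingly dense solutions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ np.array([3.0, 0, 0, -2.0, 0, 0, 0, 1.0, 0, 0]) + 0.1 * rng.normal(size=200)
for t in (0.5, 2.0, 6.0):
    print(t, np.round(constrained_least_squares(X, y, t), 2))
```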
arXiv Detail & Related papers (2021-07-15T07:17:20Z)
- Nearly Optimal Regret for Learning Adversarial MDPs with Linear Function Approximation [92.3161051419884]
We study the reinforcement learning for finite-horizon episodic Markov decision processes with adversarial reward and full information feedback.
We show that it can achieve $\tilde{O}(dH\sqrt{T})$ regret, where $H$ is the length of the episode.
We also prove a matching lower bound of $\tilde{\Omega}(dH\sqrt{T})$ up to logarithmic factors.
arXiv Detail & Related papers (2021-02-17T18:54:08Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
- Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined a priori to training using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
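A minimal sketch of the gradient-only line search idea from the last entry: step along the descent direction until the directional derivative changes sign from negative to non-negative, and use that step size as the learning rate. The doubling schedule and tolerances below are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def gradient_only_step(grad_fn, w, direction, alpha0=1e-3, grow=2.0, max_iter=30):
    """Return a step size where the directional derivative becomes non-negative."""
    alpha = alpha0
    for _ in range(max_iter):
        dd = np.dot(grad_fn(w + alpha * direction), direction)
        if dd >= 0:                    # sign change located: accept this step size
            return alpha
        alpha *= grow                  # still descending: enlarge the step
    return alpha

# Usage on a quadratic, where the directional derivative vanishes at alpha = 0.5.
grad = lambda w: 2.0 * w               # gradient of f(w) = ||w||^2
w0 = np.array([3.0, -2.0])
d = -grad(w0)
print(gradient_only_step(grad, w0, d))
```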
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.