Related papers: EntryPrune: Neural Network Feature Selection using First Impressions

URL: http://arxiv.org/abs/2410.02344v3
Date: Tue, 20 May 2025 14:48:57 GMT
Title: EntryPrune: Neural Network Feature Selection using First Impressions
Authors: Felix Zimmer, Patrik Okanovic, Torsten Hoefler
Abstract summary: EntryPrune is a novel supervised feature selection algorithm using a dense neural network with a dynamic sparse input layer. It employs entry-based pruning, a novel approach that compares neurons based on their relative change induced when they have entered the network.
Score: 19.217750941193472
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: There is an ongoing effort to develop feature selection algorithms to improve interpretability, reduce computational resources, and minimize overfitting in predictive models. Neural networks stand out as architectures on which to build feature selection methods, and recently, neuron pruning and regrowth have emerged from the sparse neural network literature as promising new tools. We introduce EntryPrune, a novel supervised feature selection algorithm using a dense neural network with a dynamic sparse input layer. It employs entry-based pruning, a novel approach that compares neurons based on their relative change induced when they have entered the network. Extensive experiments on 13 different datasets show that our approach generally outperforms the current state-of-the-art methods, and in particular improves the average accuracy on low-dimensional datasets. Furthermore, we show that EntryPruning surpasses traditional techniques such as magnitude pruning within the EntryPrune framework and that EntryPrune achieves lower runtime than competing approaches. Our code is available at https://github.com/flxzimmer/entryprune.

Related papers

Adaptive Pruning of Deep Neural Networks for Resource-Aware Embedded Intrusion Detection on the Edge [43.03813603637526]
We analyze the ability of a selection of artificial neural network pruning methods to generalize to a new cybersecurity dataset. We have found that many of them do not generalize to the problem well, leaving only a few algorithms working to an acceptable degree.
arXiv Detail & Related papers (2025-05-20T16:45:54Z)
Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection. Our findings show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than $50\%$ memory and $55\%$ FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Vecchia Gaussian Process Ensembles on Internal Representations of Deep Neural Networks [2.186901738997927]
For regression tasks, standard Gaussian processes (GPs) and deep neural networks (DNNs) provide natural uncertainty quantification (UQ). We propose an alternative solution, the deep Vecchia ensemble (DVE), which allows deterministic UQ to work in the presence of feature collapse. DVE is compatible with pretrained networks and incurs low computational overhead.
arXiv Detail & Related papers (2023-05-26T16:19:26Z)
Optimal rates of approximation by shallow ReLU$^k$ neural networks and applications to nonparametric regression [12.21422686958087]
We study the approximation capacity of some variation spaces corresponding to shallow ReLU$^k$ neural networks. For functions with less smoothness, the approximation rates in terms of the variation norm are established. We show that shallow neural networks can achieve the minimax optimal rates for learning Hölder functions.
arXiv Detail & Related papers (2023-04-04T06:35:02Z)
The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) optimization. Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require generating additional negative samples. In our framework, each block can be trained independently, so it can be easily deployed on parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks [17.12834153477201]
We propose a novel resource-efficient supervised feature selection method using sparse neural networks. By gradually pruning the uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS derives an informative subset of features efficiently. NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models.
arXiv Detail & Related papers (2023-03-10T17:09:55Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks. We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order. In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters. We find that our approach successfully generates parameters for a wide range of loss prompts. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x. We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
Optimal Learning Rates of Deep Convolutional Neural Networks: Additive Ridge Functions [19.762318115851617]
We consider the mean squared error analysis for deep convolutional neural networks. We show that, for additive ridge functions, convolutional neural networks followed by one fully connected layer with ReLU activation functions can reach optimal mini-max rates.
arXiv Detail & Related papers (2022-02-24T14:22:32Z)
Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation [18.224344440110862]
The proposed algorithm directs the parameters of the compressed model toward a flatter solution by exploring the spectral radius of the Hessian. Our results show that it improves on the state-of-the-art results for neuron compression. The method is able to achieve very small networks with small accuracy degradation across different neural network models.
arXiv Detail & Related papers (2021-11-16T15:55:59Z)
Neural Network Pruning Through Constrained Reinforcement Learning [3.2880869992413246]
We propose a general methodology for pruning neural networks. Our proposed methodology can prune neural networks to respect pre-defined computational budgets. We prove the effectiveness of our approach via comparison with state-of-the-art methods on standard image classification datasets.
arXiv Detail & Related papers (2021-10-16T11:57:38Z)
Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy. We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.