Embedded methods for feature selection in neural networks
- URL: http://arxiv.org/abs/2010.05834v1
- Date: Mon, 12 Oct 2020 16:33:46 GMT
- Title: Embedded methods for feature selection in neural networks
- Authors: Vinay Varma K
- Abstract summary: High-dimensional, potentially noisy features combined with black-box models like neural networks negatively affect the interpretability, generalizability, and training time of these models.
I propose two integrated approaches for feature selection that can be incorporated directly into the parameter learning.
I benchmarked both methods against Permutation Feature Importance (PFI), a general-purpose feature ranking method, and against a random baseline.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The representational capacity of modern neural network architectures has made
them a default choice in various applications with high-dimensional feature
sets. But these high-dimensional and potentially noisy features, combined with
black-box models like neural networks, negatively affect the interpretability,
generalizability, and training time of these models. Here, I propose two
integrated approaches for feature selection that can be incorporated directly
into the parameter learning. One of them involves adding a drop-in layer and
performing sequential weight pruning. The other is a sensitivity-based
approach. I benchmarked both methods against Permutation Feature Importance
(PFI), a general-purpose feature ranking method, and against a random baseline.
The suggested approaches turn out to be viable methods for feature selection,
consistently outperforming the baselines on the tested datasets: MNIST, ISOLET,
and HAR. They can be added to any existing model with only a few lines of code.
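The "drop-in layer with sequential weight pruning" idea lends itself to a short illustration. The following is a minimal PyTorch sketch written under my own assumptions (the class FeatureGate, the method prune_smallest, and all hyperparameters are hypothetical, not the author's released code): a per-feature gate is prepended to an existing model, and the smallest-magnitude gates are zeroed out in rounds so that the surviving gates mark the selected features.
```python
# Minimal sketch of a drop-in feature-gating layer with sequential pruning.
# Names and hyperparameters are hypothetical, not the paper's exact code.
import torch
import torch.nn as nn


class FeatureGate(nn.Module):
    """Element-wise gate over input features; |weight| serves as an importance score."""

    def __init__(self, n_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(n_features))
        # The mask is a buffer so pruned features stay off during later training.
        self.register_buffer("mask", torch.ones(n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * (self.weight * self.mask)

    @torch.no_grad()
    def prune_smallest(self, n_prune: int) -> None:
        """Zero out the n_prune active gates with the smallest magnitude."""
        scores = self.weight.abs() * self.mask
        scores[self.mask == 0] = float("inf")   # ignore already-pruned features
        idx = torch.topk(scores, n_prune, largest=False).indices
        self.mask[idx] = 0.0


# Usage: wrap any existing model, train, prune, and repeat.
n_features, n_classes = 784, 10                 # e.g. MNIST-sized inputs
base_model = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, n_classes))
gate = FeatureGate(n_features)
model = nn.Sequential(gate, base_model)

# ... train `model` for a few epochs, then prune a fraction of the gates,
# retrain, and repeat until the desired number of features remains.
gate.prune_smallest(n_prune=78)                 # drop ~10% of features per round
selected = gate.mask.nonzero(as_tuple=True)[0]  # indices of surviving features
```
In practice one would alternate a few epochs of training with a pruning step until the desired feature budget is reached; the number of gates pruned per round is a free choice here.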
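For the PFI baseline, a general-purpose implementation already exists in scikit-learn. The snippet below is a self-contained sketch on synthetic data (with a random forest standing in for whichever estimator is actually benchmarked) that ranks features by the drop in validation score when each feature column is permuted.
```python
# Hypothetical PFI baseline on synthetic data: rank features by the drop in
# validation score when each feature column is shuffled.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(clf, X_val, y_val, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]   # most important features first
print("Top 5 features by PFI:", ranking[:5])
```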
Related papers
- A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones.
Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance.
We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers.
We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems.
arXiv Detail & Related papers (2023-11-10T05:26:10Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Model-based feature selection for neural networks: A mixed-integer programming approach [0.9281671380673306]
We develop a novel input feature selection framework for ReLU-based deep neural networks (DNNs).
We focus on finding input features for image classification for clarity of presentation.
We show that the proposed input feature selection allows us to drastically reduce the size of the input to $\sim$15% while maintaining a good classification accuracy.
arXiv Detail & Related papers (2023-02-20T22:19:50Z)
- The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features.
An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
arXiv Detail & Related papers (2023-02-02T05:00:29Z)
- Automated Algorithm Selection: from Feature-Based to Feature-Free Approaches [0.5801044612920815]
We propose a novel technique for algorithm selection, applicable to optimisation problems in which there is implicit sequential information encapsulated in the data.
We train two types of recurrent neural networks to predict a packing in online bin-packing, selecting from four well-known domains.
arXiv Detail & Related papers (2022-03-24T23:59:50Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for feature extraction of two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Binary Stochastic Filtering: feature selection and beyond [0.0]
This work aims at extending neural networks with the ability to automatically select features by rethinking how sparsity regularization can be used.
The proposed method has demonstrated superior efficiency when compared to a few classical methods, achieved with minimal or no computational overhead.
arXiv Detail & Related papers (2020-07-08T06:57:10Z)
- Learning to Encode Position for Transformer with Continuous Dynamical Model [88.69870971415591]
We introduce a new way of learning to encode position information for non-recurrent models, such as Transformer models.
We model the evolution of the encoded results along the position index with a continuous dynamical system.
arXiv Detail & Related papers (2020-03-13T00:41:41Z)
- PointHop++: A Lightweight Learning Model on Point Sets for 3D Classification [55.887502438160304]
The PointHop method was recently proposed by Zhang et al. for 3D point cloud classification with unsupervised feature extraction.
We further improve the PointHop method in two aspects: 1) reducing its model complexity in terms of the number of model parameters, and 2) ordering discriminant features automatically based on the cross-entropy criterion.
With experiments conducted on the ModelNet40 benchmark dataset, we show that the PointHop++ method performs on par with deep neural network (DNN) solutions and surpasses other unsupervised feature extraction methods.
arXiv Detail & Related papers (2020-02-09T04:49:32Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.