Consistent Feature Selection for Analytic Deep Neural Networks
- URL: http://arxiv.org/abs/2010.08097v1
- Date: Fri, 16 Oct 2020 01:59:53 GMT
- Title: Consistent Feature Selection for Analytic Deep Neural Networks
- Authors: Vu Dinh, Lam Si Tung Ho
- Abstract summary: We investigate the problem of feature selection for analytic deep networks.
We prove that for a wide class of networks, the Adaptive Group Lasso selection procedure with Group Lasso as the base estimator is selection-consistent.
The work provides further evidence that Group Lasso might be inefficient for feature selection with neural networks.
- Score: 3.42658286826597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most important steps toward interpretability and explainability of
neural network models is feature selection, which aims to identify the subset
of relevant features. Theoretical results in the field have mostly focused on
the prediction aspect of the problem with virtually no work on feature
selection consistency for deep neural networks due to the model's severe
nonlinearity and unidentifiability. This lack of theoretical foundation casts
doubt on the applicability of deep learning to contexts where correct
interpretations of the features play a central role.
In this work, we investigate the problem of feature selection for analytic
deep networks. We prove that for a wide class of networks, including deep
feed-forward neural networks, convolutional neural networks, and a major
sub-class of residual neural networks, the Adaptive Group Lasso selection
procedure with Group Lasso as the base estimator is selection-consistent. The
work provides further evidence that Group Lasso might be inefficient for
feature selection with neural networks and advocates the use of Adaptive Group
Lasso over the popular Group Lasso.
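To make the two-stage procedure described above concrete, here is a minimal PyTorch sketch, not the authors' code: the first-layer weight columns are grouped by input feature, a first fit uses a plain Group Lasso penalty, and a second fit reweights each group by the inverse of its first-stage norm before thresholding the final group norms. The architecture, loss, optimizer, and all hyperparameters (`lam1`, `lam2`, `gamma`, `tol`) are illustrative assumptions, and the nonsmooth penalty is handled by plain subgradient descent rather than a proximal method, so selection uses a small threshold instead of exact zeros.

```python
# Illustrative sketch of Group Lasso + Adaptive Group Lasso feature selection
# for a small analytic (tanh) feed-forward network. Not the authors' code;
# architecture, optimizer, and hyperparameters are placeholder choices.
import torch
import torch.nn as nn

def make_net(p, hidden=32):
    # p input features -> one output; tanh keeps the network analytic
    return nn.Sequential(nn.Linear(p, hidden), nn.Tanh(), nn.Linear(hidden, 1))

def group_norms(net):
    # one group per input feature: the corresponding column of the first-layer weights
    W = net[0].weight                            # shape (hidden, p)
    return (W.pow(2).sum(dim=0) + 1e-12).sqrt()  # smoothed L2 norm per column

def fit(X, y, lam, adaptive_weights=None, epochs=2000, lr=1e-2):
    p = X.shape[1]
    net = make_net(p)
    w = torch.ones(p) if adaptive_weights is None else adaptive_weights
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(X).squeeze(-1), y)
        # (adaptive) Group Lasso penalty on the input-layer columns
        loss = loss + lam * (w * group_norms(net)).sum()
        loss.backward()
        opt.step()
    return net

def adaptive_group_lasso_select(X, y, lam1=0.05, lam2=0.05, gamma=1.0, tol=1e-3):
    # Stage 1: Group Lasso as the base estimator
    base_norms = group_norms(fit(X, y, lam1)).detach()
    # Stage 2: reweight each group by the inverse of its stage-1 norm,
    # so groups shrunk heavily in stage 1 are penalized more strongly
    w = 1.0 / base_norms.clamp_min(1e-8).pow(gamma)
    final = fit(X, y, lam2, adaptive_weights=w)
    # plain gradient descent does not give exact zeros, so threshold the norms
    return (group_norms(final).detach() > tol).nonzero(as_tuple=True)[0]
```

On synthetic data where `y` depends on only a few coordinates of `X`, the returned index tensor should, under the paper's assumptions, concentrate on those coordinates as the sample size grows.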
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - Effective Subset Selection Through The Lens of Neural Network Pruning [31.43307762723943]
It is important to select the data to be annotated wisely, which is known as the subset selection problem.
We investigate the relationship between subset selection and neural network pruning, which is more widely studied.
We propose utilizing the norm criterion of neural network features to improve subset selection methods.
arXiv Detail & Related papers (2024-06-03T08:12:32Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings (a minimal sketch of one such group concave penalty is given after this list).
arXiv Detail & Related papers (2023-07-01T13:47:09Z) - Adaptive Group Lasso Neural Network Models for Functions of Few
Variables and Time-Dependent Data [4.18804572788063]
We approximate the target function by a deep neural network and enforce an adaptive group Lasso constraint to the weights of a suitable hidden layer.
Our empirical studies show that the proposed method outperforms recent state-of-the-art methods including the sparse dictionary matrix method.
arXiv Detail & Related papers (2021-08-24T16:16:46Z) - Provably Training Neural Network Classifiers under Fairness Constraints [70.64045590577318]
We show that overparametrized neural networks could meet the constraints.
A key ingredient in building a fair neural network classifier is establishing a no-regret analysis for neural networks.
arXiv Detail & Related papers (2020-12-30T18:46:50Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Consistent feature selection for neural networks via Adaptive Group
Lasso [3.42658286826597]
We propose and establish a theoretical guarantee for the use of the Adaptive Group Lasso for selecting important features of neural networks.
Specifically, we show that our feature selection method is consistent for single-output feed-forward neural networks with one hidden layer and hyperbolic tangent activation function.
arXiv Detail & Related papers (2020-05-30T18:50:56Z) - Beyond Dropout: Feature Map Distortion to Regularize Deep Neural
Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural networks with higher testing performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.