Nonparametric Feature Selection by Random Forests and Deep Neural
Networks
- URL: http://arxiv.org/abs/2201.06821v1
- Date: Tue, 18 Jan 2022 08:45:33 GMT
- Title: Nonparametric Feature Selection by Random Forests and Deep Neural
Networks
- Authors: Xiaojun Mao, Liuhua Peng and Zhonglei Wang
- Abstract summary: We propose a nonparametric feature selection algorithm that incorporates random forests and deep neural networks.
Although the algorithm is proposed using standard random forests, it can be widely adapted to other machine learning algorithms.
- Score: 4.232614032390374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random forests are a widely used machine learning algorithm, but their
computational efficiency is undermined when applied to large-scale datasets
with numerous instances and useless features. Herein, we propose a
nonparametric feature selection algorithm that incorporates random forests and
deep neural networks, and its theoretical properties are also investigated
under regularity conditions. Using different synthetic models and a real-world
example, we demonstrate the advantage of the proposed algorithm over other
alternatives in terms of identifying useful features, avoiding useless ones,
and improving computational efficiency. Although the algorithm is proposed using
standard random forests, it can be widely adapted to other machine learning
algorithms, as long as features can be sorted accordingly.
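The algorithm itself is not reproduced here, but the general recipe it builds on (sort features by random-forest importance, then let a neural network validate nested candidate subsets) can be sketched as follows. The synthetic data, the subset grid, and the use of scikit-learn's MLPRegressor as the network are illustrative assumptions, not the authors' implementation.

```python
# Sketch: rank features with a random forest, then keep the prefix of the
# ranking that a small neural network validates best on held-out data.
# All modelling choices below are illustrative, not the paper's method.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic data: 10 informative features hidden among 100.
X, y = make_regression(n_samples=2000, n_features=100, n_informative=10,
                       noise=1.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Step 1: sort all features by random-forest importance.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
order = np.argsort(rf.feature_importances_)[::-1]

# Step 2: score nested prefixes of the ranking with a small neural network
# and keep the best-scoring subset.
best_k, best_score = None, -np.inf
for k in (5, 10, 20, 40):
    cols = order[:k]
    nn = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500,
                      random_state=0).fit(X_tr[:, cols], y_tr)
    score = nn.score(X_val[:, cols], y_val)  # validation R^2
    if score > best_score:
        best_k, best_score = k, score

selected = order[:best_k]
print(f"kept {best_k} features, validation R^2 = {best_score:.3f}")
```

Ranking once and then searching only nested prefixes of that ranking is what keeps such a procedure cheap relative to exhaustive subset search.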
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
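The DST training loop is more involved than can be shown briefly; as a rough, hypothetical proxy for sparse-network feature selection, one can rank inputs by the aggregate first-layer weight magnitude of a trained network and keep the top-k.

```python
# Rough proxy for sparse-NN feature selection (not the paper's DST training):
# train a dense MLP, rank inputs by total first-layer weight magnitude,
# and keep the top-k.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=50, n_informative=8,
                           random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                    random_state=0).fit(X, y)

# coefs_[0] has shape (n_features, n_hidden): one row of weights per input.
saliency = np.abs(mlp.coefs_[0]).sum(axis=1)
top_k = np.argsort(saliency)[::-1][:8]
print("selected feature indices:", sorted(top_k))
```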
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Sample Complexity of Algorithm Selection Using Neural Networks and Its Applications to Branch-and-Cut [1.4624458429745086]
We build upon recent work in this line of research by considering the setup where, instead of selecting a single algorithm that has the best performance, we allow the possibility of selecting an algorithm based on the instance to be solved.
In particular, given a representative sample of instances, we learn a neural network that maps an instance of the problem to the most appropriate algorithm for that instance.
In other words, the neural network will take as input a mixed-integer optimization instance and output a decision that will result in a small branch-and-cut tree for that instance.
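Stripped of the branch-and-cut specifics, the construction reduces to a multiclass classifier over instance descriptors; the features, runtimes, and candidate algorithms below are placeholder data, not the paper's setup.

```python
# Sketch of per-instance algorithm selection: a classifier maps instance
# features to whichever candidate algorithm performed best on that instance.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_instances, n_features = 500, 12
X = rng.normal(size=(n_instances, n_features))     # instance descriptors
runtimes = rng.exponential(size=(n_instances, 3))  # measured cost per algorithm
y = runtimes.argmin(axis=1)                        # label = fastest algorithm

selector = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                         random_state=0).fit(X, y)

new_instance = rng.normal(size=(1, n_features))
print("run algorithm:", selector.predict(new_instance)[0])
```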
arXiv Detail & Related papers (2024-02-04T03:03:27Z) - Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
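A minimal sketch of the idea, assuming one weight group per input feature (the column of first-layer weights) and an MCP-style concave penalty on each group norm; the architecture, penalty parameters, and pruning threshold are all illustrative.

```python
# Sketch of a sparse-input network with a group-concave (MCP-style) penalty
# on the norm of each input feature's first-layer weight column.
import torch
import torch.nn as nn

def group_mcp(W, lam=0.05, gamma=3.0):
    # One group per input feature: the column of first-layer weights.
    norms = W.norm(dim=0)                      # (n_features,)
    inside = lam * norms - norms**2 / (2 * gamma)
    outside = torch.full_like(norms, gamma * lam**2 / 2)
    return torch.where(norms <= gamma * lam, inside, outside).sum()

n_features = 50
net = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

X = torch.randn(512, n_features)
y = X[:, :5].sum(dim=1, keepdim=True)          # only 5 features matter

for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y) + group_mcp(net[0].weight)
    loss.backward()
    opt.step()

kept = (net[0].weight.norm(dim=0) > 1e-2).nonzero().flatten()
print("surviving features:", kept.tolist())
```

Because plain gradient descent does not produce exact zeros under this penalty, the sketch thresholds the surviving group norms at the end.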
arXiv Detail & Related papers (2023-07-01T13:47:09Z) - Efficient Learning of Minimax Risk Classifiers in High Dimensions [3.093890460224435]
High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands.
In this paper, we leverage methods suited to such high-dimensional settings to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers.
Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in high-dimensional scenarios.
arXiv Detail & Related papers (2023-06-11T11:08:20Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are better suited to active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Waypoint Planning Networks [66.72790309889432]
We propose waypoint planning networks (WPN), a hybrid algorithm based on LSTMs that combines a local kernel (a classic algorithm such as A*) with a global kernel using a learned algorithm.
We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value iteration networks (VIN).
It is shown that WPN's search space is considerably smaller than A*'s, while it is still able to generate near-optimal results.
arXiv Detail & Related papers (2021-05-01T18:02:01Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
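A toy sketch of the underlying pattern, best-first search whose node priority adds a learned heuristic estimate to the accumulated cost, is given below; the stub heuristic and the tiny integer problem are hypothetical stand-ins for a trained network and the integer least-squares model.

```python
# Toy sketch of heuristic-guided best-first search in the spirit of HATS:
# a learned model scores partial solutions and the search expands the most
# promising node first. The problem, model, and scoring are hypothetical.
import heapq

def learned_heuristic(partial):
    # Stand-in for a trained neural network's estimate of remaining cost.
    return float(len(partial) % 3)

def best_first_search(symbols, depth):
    # Nodes are tuples of chosen integer symbols; cost is their squared sum.
    start = ()
    frontier = [(learned_heuristic(start), start)]
    while frontier:
        _, node = heapq.heappop(frontier)
        if len(node) == depth:
            return node                       # first completed node wins
        for sym in symbols:
            child = node + (sym,)
            cost = sum(s * s for s in child)  # accumulated cost so far
            heapq.heappush(frontier, (cost + learned_heuristic(child), child))

print(best_first_search(symbols=(-1, 0, 1), depth=4))
```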
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - E2E-FS: An End-to-End Feature Selection Method for Neural Networks [0.3222802562733786]
We present a novel selection algorithm called End-to-End Feature Selection (E2E-FS).
Our algorithm, similar to the lasso approach, is solved with gradient descent techniques.
Although it imposes hard restrictions, experimental results show that this algorithm can be used with any learning model.
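One generic way to realize a lasso-style end-to-end selector, not the E2E-FS algorithm itself, is a learnable gate vector on the inputs with an L1 penalty driving unused gates toward zero.

```python
# Sketch of lasso-style end-to-end feature selection: a learnable gate vector
# multiplies the inputs and an L1 penalty pushes unneeded gates to zero.
import torch
import torch.nn as nn

n_features = 30
gate = nn.Parameter(torch.ones(n_features))
model = nn.Linear(n_features, 1)
opt = torch.optim.Adam([gate, *model.parameters()], lr=1e-2)

X = torch.randn(256, n_features)
y = (2 * X[:, 0] - X[:, 3]).unsqueeze(1)       # only features 0 and 3 matter

for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X * gate), y) + 0.05 * gate.abs().sum()
    loss.backward()
    opt.step()

print("selected:", (gate.abs() > 0.1).nonzero().flatten().tolist())
```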
arXiv Detail & Related papers (2020-12-14T16:19:25Z) - Modeling Text with Decision Forests using Categorical-Set Splits [2.434796198711328]
In axis-aligned decision forests, the "decision" to route an input example is the result of the evaluation of a condition on a single dimension in the feature space.
We define a condition that is specific to categorical-set features and present an algorithm to learn it.
Our algorithm is efficient during training and the resulting conditions are fast to evaluate with our extension of the QuickScorer inference algorithm.
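A categorical-set condition can be as simple as testing whether an example's token set intersects a learned positive set; the greedy, count-based way of learning that set below is an illustrative simplification, not the paper's algorithm.

```python
# Sketch of a categorical-set split condition: route an example left when
# its set of tokens intersects a learned positive set.
from collections import Counter

def learn_positive_set(examples, labels, size=2):
    # Keep the tokens most associated with the positive class.
    pos, neg = Counter(), Counter()
    for toks, lab in zip(examples, labels):
        (pos if lab else neg).update(toks)
    scored = sorted(pos, key=lambda t: pos[t] - neg[t], reverse=True)
    return set(scored[:size])

def route_left(tokens, positive_set):
    return not positive_set.isdisjoint(tokens)

docs = [{"cheap", "pills"}, {"meeting", "agenda"}, {"cheap", "agenda"}]
labels = [1, 0, 1]
P = learn_positive_set(docs, labels)
print([route_left(d, P) for d in docs])
```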
arXiv Detail & Related papers (2020-09-21T16:16:35Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Stochastic batch size for adaptive regularization in deep network
optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
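A minimal sketch of the mechanism, assuming the batch size is simply redrawn from a uniform range at every step of SGD on a least-squares toy problem; the distribution, bounds, and objective are assumptions, not the paper's algorithm.

```python
# Sketch of per-step stochastic batch sizes in an SGD loop: each iteration
# draws its mini-batch size at random, which modulates the gradient noise
# and thereby acts as a stochastic regularizer.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 10))
w_true = rng.normal(size=10)
y = X @ w_true

w = np.zeros(10)
for step in range(200):
    b = int(rng.integers(8, 129))              # batch size resampled each step
    idx = rng.choice(len(X), size=b, replace=False)
    grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / b
    w -= 0.01 * grad

print("parameter error:", np.linalg.norm(w - w_true))
```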
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.