Model-based feature selection for neural networks: A mixed-integer
programming approach
- URL: http://arxiv.org/abs/2302.10344v1
- Date: Mon, 20 Feb 2023 22:19:50 GMT
- Title: Model-based feature selection for neural networks: A mixed-integer
programming approach
- Authors: Shudian Zhao, Calvin Tsay, Jan Kronqvist
- Abstract summary: We develop a novel input feature selection framework for ReLU-based deep neural networks (DNNs).
We focus on finding input features for image classification for clarity of presentation.
We show that the proposed input feature selection allows us to drastically reduce the size of the input to $\sim$15\% while maintaining good classification accuracy.
- Score: 0.9281671380673306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we develop a novel input feature selection framework for
ReLU-based deep neural networks (DNNs), which builds upon a mixed-integer
optimization approach. While the method is generally applicable to various
classification tasks, we focus on finding input features for image
classification for clarity of presentation. The idea is to use a trained DNN,
or an ensemble of trained DNNs, to identify the salient input features. The
input feature selection is formulated as a sequence of mixed-integer linear
programming (MILP) problems that find sets of sparse inputs that maximize the
classification confidence of each category. These "inverse" problems are
regularized by the number of inputs selected for each category and by
distribution constraints. Numerical results on the well-known MNIST and
FashionMNIST datasets show that the proposed input feature selection allows us
to drastically reduce the size of the input to $\sim$15\% while maintaining
good classification accuracy. This allows us to design DNNs with significantly
fewer connections, reducing computational effort and producing DNNs that are
more robust against adversarial attacks.
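To make one of the "inverse" MILP problems concrete, below is a minimal sketch: choose at most k inputs that maximize one class logit of a small trained ReLU network, with each ReLU unit encoded by standard big-M constraints. This is an illustration under stated assumptions, not the authors' exact formulation: the toy layer sizes, random weights, big-M value, and the PuLP/CBC solver are placeholders, and the paper's distribution constraints are omitted.
```python
# Minimal sketch (not the paper's exact model): select at most k inputs that
# maximize the target-class logit of a tiny trained ReLU network, with each
# ReLU encoded via standard big-M MILP constraints. Random weights stand in
# for a trained network.
import numpy as np
import pulp

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 16, 8, 3            # toy sizes; MNIST would use 784 inputs
W1, b1 = rng.normal(size=(n_hid, n_in)), rng.normal(size=n_hid)
W2, b2 = rng.normal(size=(n_out, n_hid)), rng.normal(size=n_out)
target, k, M = 0, 4, 100.0               # class to "invert", input budget, big-M

prob = pulp.LpProblem("sparse_input_selection", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=1) for i in range(n_in)]
s = [pulp.LpVariable(f"s{i}", cat="Binary") for i in range(n_in)]   # input on/off
h = [pulp.LpVariable(f"h{j}", lowBound=0) for j in range(n_hid)]    # ReLU outputs
a = [pulp.LpVariable(f"a{j}", cat="Binary") for j in range(n_hid)]  # ReLU phase

# objective: the classification confidence (logit) of the target class
prob += pulp.lpSum(W2[target, j] * h[j] for j in range(n_hid)) + b2[target]

for i in range(n_in):                    # an input is nonzero only if selected
    prob += x[i] <= s[i]
prob += pulp.lpSum(s) <= k               # regularizer: at most k inputs per class

for j in range(n_hid):                   # big-M encoding of h_j = max(0, (W1 x + b1)_j)
    z = pulp.lpSum(W1[j, i] * x[i] for i in range(n_in)) + b1[j]
    prob += h[j] >= z                    # h is at least the pre-activation
    prob += h[j] <= z + M * (1 - a[j])   # active phase (a=1): h equals z
    prob += h[j] <= M * a[j]             # inactive phase (a=0): h is zero

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("selected inputs:", [i for i in range(n_in) if s[i].value() > 0.5])
```
Solving one such MILP per class and taking the union of the selected inputs yields the reduced input set described in the abstract.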
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
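As a rough illustration of what one DST step does, the sketch below implements a SET-style prune-and-regrow update on a single weight matrix; the fraction `zeta` and the random regrowth rule are illustrative assumptions, not necessarily the algorithm used in the paper.
```python
# Illustrative SET-style dynamic sparse training step: prune the weakest
# active connections, then regrow the same number at random inactive positions.
import numpy as np

def prune_and_regrow(weights, mask, zeta=0.3, rng=None):
    rng = rng or np.random.default_rng(0)
    active = np.flatnonzero(mask)
    n_drop = int(zeta * active.size)
    # prune: drop the fraction zeta of active weights with smallest magnitude
    drop = active[np.argsort(np.abs(weights.flat[active]))[:n_drop]]
    mask.flat[drop] = 0
    # regrow: activate the same number of randomly chosen inactive positions
    grow = rng.choice(np.flatnonzero(mask == 0), size=n_drop, replace=False)
    mask.flat[grow] = 1
    weights.flat[grow] = 0.0             # fresh connections start from zero
    return weights * mask, mask
```
Input features whose first-layer connections survive many such steps are the ones the network effectively selects.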
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Feed-Forward Neural Networks as a Mixed-Integer Program [0.0]
The research focuses on training and evaluating the proposed approaches through experiments on handwritten digit classification.
The study assesses the performance of trained ReLU NNs, shedding light on the effectiveness of MIP formulations in enhancing training processes for NNs.
arXiv Detail & Related papers (2024-02-09T02:23:37Z) - Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
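A minimal sketch of the idea follows, assuming the minimax concave penalty (MCP) as the group concave regularizer, applied to each input feature's column of first-layer weights; `lam` and `gamma` are illustrative hyperparameters, not values from the paper.
```python
# Sketch: group MCP penalty on a feed-forward net's first-layer weights.
# Driving a feature's whole weight column (its group) to zero removes that
# input, which is what enables feature selection.
import torch

def group_mcp_penalty(W, lam=0.01, gamma=3.0):
    """W: first-layer weight matrix, shape (hidden, n_features)."""
    g = W.norm(dim=0)                            # l2 norm of each feature's group
    quad = lam * g - g.pow(2) / (2 * gamma)      # concave region, g < lam * gamma
    flat = torch.full_like(g, 0.5 * gamma * lam ** 2)  # constant beyond it
    return torch.where(g < lam * gamma, quad, flat).sum()

# usage (hypothetical net whose first layer is net[0]):
# loss = torch.nn.functional.cross_entropy(net(x), y) + group_mcp_penalty(net[0].weight)
```
Unlike the group lasso, the penalty flattens out for large group norms, so strong features are not over-shrunk while weak ones are still driven exactly to zero.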
arXiv Detail & Related papers (2023-07-01T13:47:09Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - Verification-Aided Deep Ensemble Selection [4.290931412096984]
Deep neural networks (DNNs) have become the technology of choice for realizing a variety of complex tasks.
Even an imperceptible perturbation to a correctly classified input can lead to misclassification by a DNN.
This paper devises a methodology for identifying ensemble compositions that are less prone to simultaneous errors.
arXiv Detail & Related papers (2022-02-08T14:36:29Z) - Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates distances between a test image and its top-k neighbors in the training set.
We apply k-NN to pre-trained visual representations, produced by either supervised or self-supervised methods, in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
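The two-step recipe can be sketched as follows; `backbone` stands for any frozen pre-trained feature extractor (supervised or self-supervised) with its classifier head removed, an assumption made for illustration.
```python
# Sketch of the two steps: (1) embed images with a frozen pre-trained
# backbone, (2) classify by majority vote over the k most similar training
# embeddings under cosine similarity.
import torch
import torch.nn.functional as F

@torch.no_grad()
def knn_classify(backbone, test_x, train_x, train_y, k=5):
    f_test = F.normalize(backbone(test_x), dim=1)   # step 1: features
    f_train = F.normalize(backbone(train_x), dim=1)
    sims = f_test @ f_train.T                       # cosine similarities
    idx = sims.topk(k, dim=1).indices               # step 2: top-k neighbors
    return train_y[idx].mode(dim=1).values          # majority label per image
```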
arXiv Detail & Related papers (2021-12-15T20:15:01Z) - RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching for an input that maximizes a black-box objective function, given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z) - Efficient and Robust Mixed-Integer Optimization Methods for Training
Binarized Deep Neural Networks [0.07614628596146598]
We study deep neural networks with binary activation functions and continuous or integer weights (BDNN).
We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space, which can be solved to global optimality by classical mixed-integer programming solvers.
For the first time a robust model is presented which enforces robustness of the BDNN during training.
arXiv Detail & Related papers (2021-10-21T18:02:58Z) - Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural
Networks [52.32646357164739]
We propose a sensitivity-informed deep neural network (SIDNN) to solve the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes and can be seamlessly integrated into other learning-to-OPF schemes.
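One way to read "sensitivity-informed" is sketched below: fit not only the OPF solutions but also their sensitivities with respect to the inputs. The gradient-matching loss, the `dydx_true` targets, and the weight `alpha` are illustrative assumptions rather than the paper's exact formulation.
```python
# Hedged sketch of a sensitivity-informed training loss: match predicted
# outputs to OPF solutions and the model's input gradients to reference
# sensitivities (e.g., obtained from the OPF solver). Illustrative only.
import torch
import torch.nn.functional as F

def sensitivity_informed_loss(model, x, y_true, dydx_true, alpha=0.1):
    x = x.clone().requires_grad_(True)
    y_pred = model(x)
    fit = F.mse_loss(y_pred, y_true)                 # match the OPF solutions
    # gradient of the summed outputs w.r.t. inputs, kept in the graph so the
    # sensitivity term itself can be backpropagated during training
    grads = torch.autograd.grad(y_pred.sum(), x, create_graph=True)[0]
    return fit + alpha * F.mse_loss(grads, dydx_true)  # match sensitivities
```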
arXiv Detail & Related papers (2021-03-27T00:45:23Z) - File Classification Based on Spiking Neural Networks [0.5065947993017157]
We propose a system for file classification in large data sets based on spiking neural networks (SNNs).
The proposed system may represent a valid alternative to classical machine learning algorithms for inference tasks.
arXiv Detail & Related papers (2020-04-08T11:50:29Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional
Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.