Feed-Forward Neural Networks as a Mixed-Integer Program
- URL: http://arxiv.org/abs/2402.06697v1
- Date: Fri, 9 Feb 2024 02:23:37 GMT
- Title: Feed-Forward Neural Networks as a Mixed-Integer Program
- Authors: Navid Aftabi and Nima Moradi and Fatemeh Mahroo
- Abstract summary: The research focuses on training and evaluating the proposed approaches through experiments on handwritten digit classification models.
The study assesses the performance of the trained ReLU NNs, shedding light on the effectiveness of MIP formulations in enhancing training processes for NNs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are widely studied in various applications. A DNN
consists of layers of neurons that compute affine combinations, apply nonlinear
operations, and produce corresponding activations. The rectified linear unit
(ReLU) is a typical nonlinear operator, outputting the max of its input and
zero; in operations such as max pooling, the maximum of several input values is
taken instead. A DNN with fixed (trained) parameters can be modeled as a mixed-integer program (MIP). This
formulation, with continuous variables representing unit outputs and binary
variables for ReLU activation, finds applications across diverse domains. This
study explores the formulation of trained ReLU neurons as MIP and applies MIP
models for training neural networks (NNs). Specifically, it investigates
interactions between MIP techniques and various NN architectures, including
binary DNNs (employing step activation functions) and binarized DNNs (with
weights and activations limited to $\{-1,0,+1\}$). The research focuses on training
and evaluating proposed approaches through experiments on handwritten digit
classification models. The comparative study assesses the performance of
trained ReLU NNs, shedding light on the effectiveness of MIP formulations in
enhancing training processes for NNs.
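To make the formulation concrete, the sketch below encodes a single trained ReLU unit, $y = \max(w^\top x + b, 0)$, as mixed-integer constraints with a continuous output variable and a binary activation indicator, in the spirit of the formulation described above. It is a minimal illustration rather than the paper's implementation: the PuLP modeler, the toy weights and bias, the input bounds, and the big-M constant are all assumptions chosen for the example.

```python
# A minimal sketch (not the paper's code) of the standard big-M encoding of one
# trained ReLU unit y = max(w·x + b, 0), written with the open-source PuLP
# modeler and its bundled CBC solver. The weights, bias, input bounds, and
# big-M value are illustrative assumptions, not values from the paper.
import pulp

w, b = [1.0, -2.0, 0.5], 0.25   # fixed (trained) parameters of the unit
M = 100.0                       # any valid upper bound on |w·x + b| over the input box

prob = pulp.LpProblem("relu_unit_as_mip", pulp.LpMinimize)

x = [pulp.LpVariable(f"x{i}", lowBound=-10, upBound=10) for i in range(len(w))]
y = pulp.LpVariable("y", lowBound=0)          # continuous ReLU output
s = pulp.LpVariable("s", lowBound=0)          # slack carrying the negative part
z = pulp.LpVariable("z", cat=pulp.LpBinary)   # 1 if the unit is active, 0 otherwise

pre_activation = pulp.lpSum(wi * xi for wi, xi in zip(w, x)) + b
prob += y - s == pre_activation   # y - s = w·x + b
prob += y <= M * z                # z = 0 forces y = 0 (inactive unit)
prob += s <= M * (1 - z)          # z = 1 forces s = 0, hence y = w·x + b
prob += 1.0 * y                   # placeholder objective; any objective over x and y works

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("y =", pulp.value(y), " z =", pulp.value(z))
```

Stacking one such block per neuron and layer yields the fixed-parameter DNN-as-MIP model referred to above; in the training formulations discussed in the paper, the weights themselves would additionally be decision variables (restricted to $\{-1,0,+1\}$ in the binarized case).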
Related papers
- Splitting physics-informed neural networks for inferring the dynamics of
integer- and fractional-order neuron models [0.0]
We introduce a new approach for solving forward systems of differential equations using a combination of splitting methods and physics-informed neural networks (PINNs).
The proposed method, splitting PINN, effectively addresses the challenge of applying PINNs to forward dynamical systems.
arXiv Detail & Related papers (2023-04-26T00:11:00Z)
- Analyzing Populations of Neural Networks via Dynamical Model Embedding [10.455447557943463]
A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task.
Motivated by this problem, we introduce DYNAMO, an algorithm that constructs a low-dimensional manifold in which each point corresponds to a neural network model and two points are nearby if the corresponding neural networks enact similar high-level computational processes.
DYNAMO takes as input a collection of pre-trained neural networks and outputs a meta-model that emulates the dynamics of the hidden states as well as the outputs of any model in the collection.
arXiv Detail & Related papers (2023-02-27T19:00:05Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Neural Born Iteration Method For Solving Inverse Scattering Problems: 2D
Cases [3.795881624409311]
We propose the neural Born iterative method (Neural BIM) for solving 2D inverse scattering problems (ISPs).
Neural BIM employs independent convolutional neural networks (CNNs) to learn the alternate update rules of two different candidate solutions regarding the residuals.
Two different schemes are presented in this paper: a supervised and an unsupervised learning scheme.
arXiv Detail & Related papers (2021-12-18T03:22:41Z)
- Low-bit Quantization of Recurrent Neural Network Language Models Using
Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM).
Experiments on two tasks suggest that the proposed ADMM quantization achieves a model-size compression factor of up to 31 times over the full-precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
- Efficient and Robust Mixed-Integer Optimization Methods for Training
Binarized Deep Neural Networks [0.07614628596146598]
We study deep neural networks with binary activation functions and continuous or integer weights (BDNN).
We show that the BDNN can be reformulated as a mixed-integer linear program with a bounded weight space, which can be solved to global optimality by classical mixed-integer programming solvers.
For the first time, a robust model is presented that enforces robustness of the BDNN during training.
arXiv Detail & Related papers (2021-10-21T18:02:58Z)
- Exploiting Heterogeneity in Operational Neural Networks by Synaptic
Plasticity [87.32169414230822]
The recently proposed network model, Operational Neural Networks (ONNs), can generalize the conventional Convolutional Neural Networks (CNNs).
This study focuses on searching for the best-possible operator set(s) for the hidden neurons of the network based on the Synaptic Plasticity paradigm, which constitutes the essential learning theory in biological neurons.
Experimental results over highly challenging problems demonstrate that the elite ONNs, even with few neurons and layers, can achieve superior learning performance compared to GIS-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Recurrent Neural Network Learning of Performance and Intrinsic
Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
- Self-Organized Operational Neural Networks with Generative Neurons [87.32169414230822]
ONNs are heterogeneous networks with a generalized neuron model that can encapsulate any set of non-linear operators.
We propose Self-organized ONNs (Self-ONNs) with generative neurons that have the ability to adapt (optimize) the nodal operator of each connection.
arXiv Detail & Related papers (2020-04-24T14:37:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.