Efficient and Robust Mixed-Integer Optimization Methods for Training
Binarized Deep Neural Networks
- URL: http://arxiv.org/abs/2110.11382v2
- Date: Mon, 25 Oct 2021 10:38:41 GMT
- Title: Efficient and Robust Mixed-Integer Optimization Methods for Training
Binarized Deep Neural Networks
- Authors: Jannis Kurtz and Bubacarr Bah
- Abstract summary: We study deep neural networks with binary activation functions and continuous or integer weights (BDNN)
We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space which can be solved to global optimality by classical mixed-integer programming solvers.
For the first time a robust model is presented which enforces robustness of the BDNN during training.
- Score: 0.07614628596146598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compared to classical deep neural networks, binarized versions can be useful for applications on resource-limited devices due to their reduced memory consumption and computational demands. In this work we study deep neural networks with binary activation functions and continuous or integer weights (BDNN). We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space which can be solved to global optimality by classical mixed-integer programming solvers. Additionally, a local search heuristic is presented to calculate locally optimal networks. Furthermore, to improve efficiency, we present an iterative data-splitting heuristic which iteratively splits the training set into smaller subsets using the k-means method. Afterwards, all data points in a given subset are forced to follow the same activation pattern, which leads to a much smaller number of integer variables in the mixed-integer programming formulation and therefore to computational improvements. Finally, for the first time, a robust model is presented which enforces robustness of the BDNN during training. All methods are tested on random and real datasets, and our results indicate that all models can often compete with or even outperform classical DNNs on small network architectures, confirming their viability for applications with restricted memory or computing power.
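As a rough illustration of the mixed-integer reformulation idea (a minimal sketch, not the paper's exact model), the snippet below trains a single binary-activation unit on a toy dataset with PuLP: one binary variable per sample encodes the activation, big-M constraints tie it to the sign of the pre-activation, and the solver minimizes the number of misclassified points. The toy data, weight bounds, big-M value, and margin constant are illustrative assumptions.

```python
# Minimal sketch: training one binary-activation unit as a MILP (illustrative only,
# not the exact formulation of the paper). Requires: pip install pulp numpy
import numpy as np
import pulp

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))           # toy inputs in [-1, 1]^2 (assumption)
y = (X[:, 0] + X[:, 1] >= 0).astype(int)       # toy binary labels (assumption)

n, d = X.shape
M = d + 1                                      # valid big-M since |w_j| <= 1 and |x_ij| <= 1
eps = 1e-4                                     # small margin so u_i = 0 means strictly negative

prob = pulp.LpProblem("bdnn_unit", pulp.LpMinimize)
w = [pulp.LpVariable(f"w_{j}", lowBound=-1, upBound=1) for j in range(d)]
b = pulp.LpVariable("b", lowBound=-1, upBound=1)
u = [pulp.LpVariable(f"u_{i}", cat="Binary") for i in range(n)]   # binary activation per sample
e = [pulp.LpVariable(f"e_{i}", cat="Binary") for i in range(n)]   # misclassification indicator

for i in range(n):
    z = pulp.lpSum(w[j] * float(X[i, j]) for j in range(d)) + b
    # big-M linking constraints: u_i = 1 exactly when the pre-activation z is nonnegative
    prob += z <= M * u[i] - eps * (1 - u[i])
    prob += z >= -M * (1 - u[i])
    # e_i = 1 whenever the activation disagrees with the label
    prob += u[i] - int(y[i]) <= e[i]
    prob += int(y[i]) - u[i] <= e[i]

prob += pulp.lpSum(e)                           # objective: number of misclassified samples
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("errors:", int(pulp.value(prob.objective)))
print("w =", [v.value() for v in w], "b =", b.value())
```

Because every variable is either bounded and continuous or binary, any off-the-shelf MIP solver can in principle certify global optimality of the trained unit, which is the property the paper exploits for the full BDNN.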
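The data-splitting heuristic can be illustrated in the same toy setting: cluster the training points with k-means and let all points in a cluster share one activation variable, so the number of binary activation variables drops from the number of samples to the number of clusters. The cluster count and data below are again assumptions, and this is only a sketch of the idea, not the paper's iterative procedure.

```python
# Sketch of the cluster-level activation idea behind the data-splitting heuristic
# (illustrative; reuses the toy MILP setup from the previous snippet).
import numpy as np
import pulp
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] + X[:, 1] >= 0).astype(int)

k = 8                                            # number of clusters (assumption)
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

n, d = X.shape
M, eps = d + 1, 1e-4
prob = pulp.LpProblem("bdnn_clustered", pulp.LpMinimize)
w = [pulp.LpVariable(f"w_{j}", lowBound=-1, upBound=1) for j in range(d)]
b = pulp.LpVariable("b", lowBound=-1, upBound=1)
# One activation variable per *cluster* instead of per sample: k binaries instead of n.
u = [pulp.LpVariable(f"u_{c}", cat="Binary") for c in range(k)]
e = [pulp.LpVariable(f"e_{i}", cat="Binary") for i in range(n)]

for i in range(n):
    c = labels[i]
    z = pulp.lpSum(w[j] * float(X[i, j]) for j in range(d)) + b
    # every sample in cluster c is forced to follow the same activation pattern u_c
    prob += z <= M * u[c] - eps * (1 - u[c])
    prob += z >= -M * (1 - u[c])
    prob += u[c] - int(y[i]) <= e[i]
    prob += int(y[i]) - u[c] <= e[i]

prob += pulp.lpSum(e)
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("misclassified:", int(pulp.value(prob.objective)), "| binary activation vars:", k)
```

Sharing activations restricts the feasible set, so the result is a heuristic rather than a globally optimal network, but the much smaller number of integer variables is what makes larger instances tractable.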
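Finally, the idea of enforcing robustness during training can be sketched with a standard robust-counterpart construction: require each activation to keep its value for every input perturbation of bounded infinity-norm, which for a linear pre-activation amounts to budgeting the radius times the l1-norm of the weights. This assumes an l-infinity threat model and is a generic construction; the paper's robust model may differ.

```python
# Robust variant of the toy MILP: each activation must hold for every input
# perturbation with ||delta||_inf <= EPS_R (assumed threat model, illustrative only).
import numpy as np
import pulp

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
y = (X[:, 0] + X[:, 1] >= 0).astype(int)
n, d = X.shape
M, eps, EPS_R = d + 2, 1e-4, 0.05               # M enlarged to cover the robustness budget

prob = pulp.LpProblem("bdnn_robust", pulp.LpMinimize)
w = [pulp.LpVariable(f"w_{j}", lowBound=-1, upBound=1) for j in range(d)]
a = [pulp.LpVariable(f"a_{j}", lowBound=0, upBound=1) for j in range(d)]   # a_j >= |w_j|
b = pulp.LpVariable("b", lowBound=-1, upBound=1)
u = [pulp.LpVariable(f"u_{i}", cat="Binary") for i in range(n)]
e = [pulp.LpVariable(f"e_{i}", cat="Binary") for i in range(n)]

for j in range(d):
    prob += a[j] >= w[j]
    prob += a[j] >= -w[j]

for i in range(n):
    z = pulp.lpSum(w[j] * float(X[i, j]) for j in range(d)) + b
    budget = EPS_R * pulp.lpSum(a)               # worst-case shift of z over the l_inf ball
    # the activation must be stable against any perturbation within the ball
    prob += z + budget <= M * u[i] - eps * (1 - u[i])
    prob += z - budget >= -M * (1 - u[i])
    prob += u[i] - int(y[i]) <= e[i]
    prob += int(y[i]) - u[i] <= e[i]

prob += pulp.lpSum(e)
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("robust errors:", int(pulp.value(prob.objective)))
```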
Related papers
- Model-Based Control with Sparse Neural Dynamics [23.961218902837807]
We propose a new framework for integrated model learning and predictive control.
We show that our framework can deliver better closed-loop performance than existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-20T06:25:02Z)
- Does a sparse ReLU network training problem always admit an optimum? [0.0]
We show that the existence of an optimal solution is not always guaranteed, especially in the context of sparse ReLU neural networks.
In particular, we first show that optimization problems involving deep networks with certain sparsity patterns do not always have optimal parameters.
arXiv Detail & Related papers (2023-06-05T08:01:50Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Partitioning sparse deep neural networks for scalable training and inference [8.282177703075453]
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements.
Sparsification and pruning methods are shown to be effective in removing a large fraction of connections in DNNs.
The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning.
arXiv Detail & Related papers (2021-04-23T20:05:52Z)
- Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
- An Integer Programming Approach to Deep Neural Networks with Binary Activation Functions [0.0]
We study deep neural networks with binary activation functions (BDNN).
We show that the BDNN can be reformulated as a mixed-integer linear program which can be solved to global optimality by classical mixed-integer programming solvers.
arXiv Detail & Related papers (2020-07-07T10:28:20Z)
- Self-Organized Operational Neural Networks with Generative Neurons [87.32169414230822]
ONNs are heterogeneous networks with a generalized neuron model that can encapsulate any set of non-linear operators.
We propose Self-organized ONNs (Self-ONNs) with generative neurons that have the ability to adapt (optimize) the nodal operator of each connection.
arXiv Detail & Related papers (2020-04-24T14:37:56Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)