A foundation for exact binarized morphological neural networks
- URL: http://arxiv.org/abs/2401.03830v1
- Date: Mon, 8 Jan 2024 11:37:44 GMT
- Title: A foundation for exact binarized morphological neural networks
- Authors: Theodore Aouad, Hugues Talbot
- Abstract summary: Training and running deep neural networks (NNs) often demands a lot of computation and energy-intensive specialized hardware.
One way to reduce the computation and power cost is to use binary weight NNs, but these are hard to train because the sign function has a non-smooth gradient.
We present a model based on Mathematical Morphology (MM), which can binarize ConvNets without losing performance under certain conditions.
- Score: 2.8925699537310137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training and running deep neural networks (NNs) often demands a lot of
computation and energy-intensive specialized hardware (e.g. GPU, TPU...). One
way to reduce the computation and power cost is to use binary weight NNs, but
these are hard to train because the sign function has a non-smooth gradient. We
present a model based on Mathematical Morphology (MM), which can binarize
ConvNets without losing performance under certain conditions, but these
conditions may not be easy to satisfy in real-world scenarios. To solve this,
we propose two new approximation methods and develop a robust theoretical
framework for binarizing ConvNets using MM. We also propose regularization
losses to improve the optimization. We empirically show that our model can
learn a complex morphological network, and explore its performance on a
classification task.
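As a concrete illustration of the kind of building block the abstract refers to, below is a minimal PyTorch sketch of a morphological dilation layer with a learnable, binarizable structuring element. The class name, the additive encoding of the structuring element, and the thresholding rule are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftDilation2d(nn.Module):
    """Flat dilation with a learnable structuring element (illustrative sketch).

    Classical flat dilation: (f dilated by B)(x) = max_{s in B} f(x - s).
    Here B is encoded as an additive structuring function b(s): 0 inside the
    element, -inf outside. During training b is a free real-valued surrogate;
    at inference it is thresholded into a binary structuring element.
    """
    def __init__(self, kernel_size: int = 3, threshold: float = 0.0):
        super().__init__()
        self.k = kernel_size
        self.threshold = threshold
        self.b = nn.Parameter(0.1 * torch.randn(kernel_size * kernel_size))

    def forward(self, x: torch.Tensor, binarize: bool = False) -> torch.Tensor:
        # x: (N, 1, H, W) -> sliding windows: (N, k*k, H*W)
        patches = F.unfold(x, kernel_size=self.k, padding=self.k // 2)
        if binarize:
            # Hard structuring element: keep positions where b > threshold.
            b = torch.where(self.b > self.threshold,
                            torch.zeros_like(self.b),
                            torch.full_like(self.b, float("-inf")))
        else:
            b = self.b  # smooth surrogate, trainable by backprop through the max
        out = (patches + b.view(1, -1, 1)).amax(dim=1)       # (N, H*W)
        return out.view(x.shape[0], 1, x.shape[2], x.shape[3])


if __name__ == "__main__":
    img = torch.rand(2, 1, 16, 16)
    layer = SoftDilation2d(kernel_size=3)
    y_soft = layer(img)                 # training-time smooth dilation
    y_bin = layer(img, binarize=True)   # inference-time binary structuring element
    print(y_soft.shape, y_bin.shape)
```

During training the smooth surrogate keeps gradients usable; at inference the thresholded element makes the layer fully binary, which is the regime where the exactness conditions discussed in the abstract matter.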
Related papers
- Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors [4.95475852994362]
We propose a new form of quantization to tile neural network layers with sequences of bits to achieve sub-bit compression of binary-weighted neural networks.
We apply the approach to both fully-connected and convolutional layers, which cover most of the layers found in common neural architectures.
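To make the tiling idea concrete, here is a hedged sketch of a linear layer whose full weight matrix is generated by repeating one short learnable binary vector. The tile length, the sign binarization, and the straight-through gradient are assumptions for illustration, not the cited paper's exact method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiledBinaryLinear(nn.Module):
    """Linear layer whose weights are one short binary vector, tiled to fill
    the full weight matrix (illustrative sub-bit-sharing sketch)."""
    def __init__(self, in_features: int, out_features: int, tile_len: int = 64):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.tile_len = tile_len
        # One real-valued tile; its sign gives the shared binary values.
        self.tile = nn.Parameter(torch.randn(tile_len))
        self.scale = nn.Parameter(torch.ones(out_features, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = self.out_features * self.in_features
        # Straight-through estimator: forward uses sign(tile), backward sees identity.
        tile_bin = (torch.sign(self.tile) - self.tile).detach() + self.tile
        reps = -(-n // self.tile_len)                  # ceil(n / tile_len)
        w = tile_bin.repeat(reps)[:n].reshape(self.out_features, self.in_features)
        return F.linear(x, self.scale * w)


if __name__ == "__main__":
    layer = TiledBinaryLinear(784, 128, tile_len=64)
    print(layer(torch.randn(8, 784)).shape)   # torch.Size([8, 128])
```

Storing only the 64-entry tile (plus per-output scales) instead of out_features x in_features weights is what yields compression below one bit per effective weight.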
arXiv Detail & Related papers (2024-07-16T15:55:38Z) - Optimization Over Trained Neural Networks: Taking a Relaxing Walk [4.517039147450688]
We propose a more scalable solver based on exploring global and local linear relaxations of the neural network model.
Our solver is competitive with a state-of-the-art MILP solver and prior approaches, and produces better solutions as input size, depth, and number of neurons increase.
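A rough sketch of the "local linear relaxation" ingredient: with the ReLU activation pattern frozen at a given input, the trained network is affine in its input, and a solver can walk along such affine pieces. The helper below only extracts that local affine map for a small ReLU MLP; the actual walking and relaxation strategy of the cited solver is not reproduced here.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def local_affine(model: nn.Sequential, x: torch.Tensor):
    """Return (W, b) with model(x') = W @ x' + b on the linear region containing x."""
    W = torch.eye(x.numel())
    b = torch.zeros(x.numel())
    z = x.clone()
    for layer in model:
        if isinstance(layer, nn.Linear):
            W = layer.weight @ W
            b = layer.weight @ b + layer.bias
            z = layer(z)
        elif isinstance(layer, nn.ReLU):
            # Freeze the activation pattern observed at x.
            mask = (z > 0).float()
            W = mask.unsqueeze(1) * W
            b = mask * b
            z = torch.relu(z)
    return W, b


if __name__ == "__main__":
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    x = torch.randn(4)
    W, b = local_affine(net, x)
    with torch.no_grad():
        print(torch.allclose(net(x), W @ x + b, atol=1e-5))  # True on x's region
```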
arXiv Detail & Related papers (2024-01-07T11:15:00Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
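For readers unfamiliar with what an SNN layer computes, below is a generic leaky integrate-and-fire update in plain PyTorch. It is not snnTorch's API (nor IPU-specific code), just the membrane-decay, threshold, and reset dynamics that such packages implement efficiently.

```python
import torch

def lif_step(input_current, mem, beta=0.9, threshold=1.0):
    """One discrete-time step of a leaky integrate-and-fire neuron."""
    mem = beta * mem + input_current          # leaky integration of input current
    spk = (mem >= threshold).float()          # fire when the threshold is crossed
    mem = mem - spk * threshold               # soft reset after a spike
    return spk, mem


if __name__ == "__main__":
    T, batch, n = 25, 4, 10                   # time steps, batch size, neurons
    mem = torch.zeros(batch, n)
    spikes = []
    for _ in range(T):
        cur = torch.rand(batch, n) * 0.3      # toy input current
        spk, mem = lif_step(cur, mem)
        spikes.append(spk)
    print(torch.stack(spikes).mean().item())  # average firing rate
```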
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Combinatorial optimization for low bit-width neural networks [23.466606660363016]
Low bit-width neural networks have been extensively explored for deployment on edge devices to reduce computational resources.
Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting.
We show that combining greedy coordinate descent with the proposed combinatorial approach attains competitive accuracy on binary classification tasks.
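A minimal sketch of gradient-free greedy coordinate descent over binary weights: sweep the coordinates and keep any sign flip that lowers the loss. The stopping rule and the toy hinge-loss task are assumptions for illustration; the cited paper's combinatorial formulation is more elaborate.

```python
import torch

def greedy_flip(w, loss_fn, passes=3):
    """Greedy coordinate descent over binary weights w in {-1, +1}.

    Sweep the coordinates, keeping each sign flip that lowers the loss,
    and stop when a full pass yields no improvement.
    """
    w = w.clone()
    for _ in range(passes):
        improved = False
        base = loss_fn(w)
        for i in range(w.numel()):
            w[i] = -w[i]                      # tentatively flip coordinate i
            trial = loss_fn(w)
            if trial < base:
                base, improved = trial, True  # keep the flip
            else:
                w[i] = -w[i]                  # revert
        if not improved:
            break
    return w


if __name__ == "__main__":
    torch.manual_seed(0)
    X = torch.randn(200, 16)
    y = torch.sign(X @ torch.sign(torch.randn(16)))               # toy binary labels
    loss = lambda w: torch.clamp(1 - y * (X @ w), min=0).mean()   # hinge loss
    w0 = torch.sign(torch.randn(16))
    w_star = greedy_flip(w0, loss)
    print(loss(w0).item(), loss(w_star).item())
```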
arXiv Detail & Related papers (2022-06-04T15:02:36Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
A deep equilibrium model is a class of models that forgoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings, namely equilibrium inference and optimization over the network's inputs.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
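A minimal sketch of the equilibrium idea: the block's output is the fixed point z* = f(z*, x) of one nonlinear layer, found here by naive forward iteration. Real DEQ implementations use root-finding and implicit differentiation; the class below is only illustrative.

```python
import torch
import torch.nn as nn

class TinyDEQ(nn.Module):
    """Minimal deep-equilibrium-style block (illustrative sketch)."""
    def __init__(self, dim):
        super().__init__()
        self.lin_z = nn.Linear(dim, dim)
        self.lin_x = nn.Linear(dim, dim)

    def f(self, z, x):
        # The single nonlinear layer whose fixed point defines the output.
        return torch.tanh(self.lin_z(z) + self.lin_x(x))

    def forward(self, x, max_iter=50, tol=1e-4):
        z = torch.zeros_like(x)
        for _ in range(max_iter):
            z_next = self.f(z, x)
            if (z_next - z).norm() < tol * (z.norm() + 1e-8):
                return z_next
            z = z_next
        return z


if __name__ == "__main__":
    x = torch.randn(8, 32)
    block = TinyDEQ(32)
    z_star = block(x)
    # Check the fixed-point residual ||f(z*, x) - z*||.
    print((block.f(z_star, x) - z_star).norm().item())
```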
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - Can we learn gradients by Hamiltonian Neural Networks? [68.8204255655161]
We propose a meta-learner based on ODE neural networks that learns gradients.
We demonstrate that our method outperforms an LSTM-based meta-learner on an artificial task and on the MNIST dataset when the optimizee uses ReLU activations.
arXiv Detail & Related papers (2021-10-31T18:35:10Z) - Leveraging power grid topology in machine learning assisted optimal power flow [0.5076419064097734]
Machine learning assisted optimal power flow (OPF) aims to reduce the computational complexity of the underlying non-linear, non-convex constrained optimization problems.
We assess the performance of a variety of FCNN, CNN and GNN models for two fundamental approaches to machine learning assisted OPF.
For several synthetic grids with interconnected utilities, we show that locality properties between feature and target variables are scarce.
arXiv Detail & Related papers (2021-10-01T10:39:53Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
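As a concrete (and simplified) example of what binarizing a GNN can look like, here is a GCN-style layer with sign-binarized weights and a straight-through estimator for gradients. The specific binarization strategies evaluated in the cited paper are not reproduced; the names and the dense-adjacency setup are assumptions.

```python
import torch
import torch.nn as nn

class BinaryGraphConv(nn.Module):
    """GCN-style layer with sign-binarized weights (illustrative sketch)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_dim, out_dim) * 0.1)

    def forward(self, x, adj):
        # adj: dense (N, N) adjacency with self-loops, row-normalized.
        # Forward uses sign(W); the straight-through trick routes gradients to W.
        w_bin = (torch.sign(self.weight) - self.weight).detach() + self.weight
        return torch.relu(adj @ x @ w_bin)


if __name__ == "__main__":
    n, d = 5, 8
    adj = torch.eye(n) + (torch.rand(n, n) > 0.6).float()
    adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalize
    x = torch.randn(n, d)
    layer = BinaryGraphConv(d, 4)
    print(layer(x, adj).shape)                    # torch.Size([5, 4])
```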
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose using evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we leverage the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best group-convolution candidates.
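A toy version of the select-and-mutate loop behind evolutionary architecture search, where each candidate is a list of per-layer group counts. In practice the fitness would be the validation accuracy of the corresponding 1-bit network; here it is a stand-in callable, and all names are illustrative, not the cited one-shot search procedure.

```python
import random

def evolve(fitness, init_pop, generations=20, mutate_rate=0.2):
    """Tiny evolutionary search loop: keep the fittest half, mutate it, repeat."""
    pop = list(init_pop)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: len(pop) // 2]
        children = [[g if random.random() > mutate_rate else random.choice([1, 2, 4, 8])
                     for g in parent]
                    for parent in survivors]
        pop = survivors + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    random.seed(0)
    target = [4, 2, 8, 1]                                         # toy "best" configuration
    toy_fitness = lambda c: -sum(abs(a - b) for a, b in zip(c, target))
    pop0 = [[random.choice([1, 2, 4, 8]) for _ in range(4)] for _ in range(8)]
    print(evolve(toy_fitness, pop0))
```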
arXiv Detail & Related papers (2020-05-13T13:25:51Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
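To show the kind of inference a BP-Layer wraps, here is min-sum belief propagation (max-product in the log domain) on a simple chain with a Potts pairwise cost. The cited layer learns its costs and truncates the message schedule inside a CNN; this sketch only performs one exact forward-backward pass.

```python
import torch

def chain_min_sum(unary: torch.Tensor, pairwise: torch.Tensor):
    """Min-sum belief propagation on a chain.

    unary:    (T, K) per-node label costs
    pairwise: (K, K) costs for neighboring label pairs
    Returns (beliefs, labels): per-node min-marginals and their argmin.
    """
    T, K = unary.shape
    fwd = torch.zeros(T, K)   # messages flowing left -> right
    bwd = torch.zeros(T, K)   # messages flowing right -> left
    for t in range(1, T):
        v = fwd[t - 1] + unary[t - 1]                     # cost of the left neighbor
        fwd[t] = (v.unsqueeze(1) + pairwise).amin(dim=0)  # minimize over its label
    for t in range(T - 2, -1, -1):
        v = bwd[t + 1] + unary[t + 1]
        bwd[t] = (pairwise + v.unsqueeze(0)).amin(dim=1)
    beliefs = unary + fwd + bwd
    return beliefs, beliefs.argmin(dim=1)


if __name__ == "__main__":
    T, K = 6, 3
    unary = torch.rand(T, K)
    pairwise = 0.5 * (1 - torch.eye(K))   # Potts model: penalize label changes
    beliefs, labels = chain_min_sum(unary, pairwise)
    print(labels.tolist())
```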
arXiv Detail & Related papers (2020-03-13T13:11:35Z)