Training morphological neural networks with gradient descent: some theoretical insights
- URL: http://arxiv.org/abs/2403.12975v2
- Date: Mon, 1 Jul 2024 07:40:03 GMT
- Title: Training morphological neural networks with gradient descent: some theoretical insights
- Authors: Samy Blusseau
- Abstract summary: We investigate the potential and limitations of differentiation based approaches and back-propagation applied to morphological networks.
We provide insights and first theoretical guidelines, in particular regarding learning rates.
- Score: 0.40792653193642503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Morphological neural networks, or layers, can be a powerful tool to boost the progress in mathematical morphology, either on theoretical aspects such as the representation of complete lattice operators, or in the development of image processing pipelines. However, these architectures turn out to be difficult to train when they count more than a few morphological layers, at least within popular machine learning frameworks which use gradient descent based optimization algorithms. In this paper we investigate the potential and limitations of differentiation based approaches and back-propagation applied to morphological networks, in light of the non-smooth optimization concept of Bouligand derivative. We provide insights and first theoretical guidelines, in particular regarding initialization and learning rates.
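As a concrete illustration of the kind of layer discussed in the abstract, here is a minimal sketch, not taken from the paper, of a fully connected max-plus dilation layer written with standard PyTorch primitives; the class name `DilationLayer` and the zero initialization are illustrative assumptions.
```python
# Minimal sketch, assuming a dense max-plus formulation; not the paper's code.
import torch
import torch.nn as nn


class DilationLayer(nn.Module):
    """Fully connected max-plus (dilation) layer: y_i = max_j (x_j + w_ij)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Very negative weights never attain the max, so they get zero gradient
        # and stay frozen -- one reason initialization matters for these layers.
        self.weight = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> (batch, out_features).
        # The max is only piecewise smooth; autograd backpropagates a
        # one-sided (Bouligand-type) directional derivative through it.
        return (x.unsqueeze(1) + self.weight.unsqueeze(0)).amax(dim=-1)


if __name__ == "__main__":
    layer = DilationLayer(4, 2)
    x = torch.randn(8, 4)
    layer(x).sum().backward()
    # For each sample, only the argmax weight of each output unit receives a
    # nonzero gradient, so updates are sparse compared to a linear layer.
    print(layer.weight.grad)
```
With such sparse, piecewise-defined gradients, the step size controls how far a "winning" weight moves before another weight takes over the max, which is the kind of non-smooth behaviour that motivates the guidelines on initialization and learning rates mentioned in the abstract.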
Related papers
- CF-OPT: Counterfactual Explanations for Structured Prediction [47.36059095502583]
Optimization layers in deep neural networks have enjoyed growing popularity in structured learning, improving the state of the art on a variety of applications.
Yet, these pipelines lack interpretability since they are made of two opaque layers: a highly non-linear prediction model, such as a deep neural network, and an optimization layer, which is typically a complex black-box solver.
Our goal is to improve the transparency of such methods by providing counterfactual explanations.
arXiv Detail & Related papers (2024-05-28T15:48:27Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler, piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Deep Learning Meets Sparse Regularization: A Signal Processing Perspective [17.12783792226575]
We present a mathematical framework that characterizes the functional properties of neural networks that are trained to fit to data.
Key mathematical tools which support this framework include transform-domain sparse regularization, the Radon transform of computed tomography, and approximation theory.
This framework explains the effect of weight decay regularization in neural network training, the use of skip connections and low-rank weight matrices in network architectures, the role of sparsity in neural networks, and why neural networks can perform well in high-dimensional problems.
arXiv Detail & Related papers (2023-01-23T17:16:21Z)
- Learning without gradient descent encoded by the dynamics of a neurobiological model [7.952666139462592]
We introduce a conceptual approach to machine learning that takes advantage of a neurobiologically derived model of dynamic signaling.
We show that MNIST images can be uniquely encoded and classified by the dynamics of geometric networks with nearly state-of-the-art accuracy in an unsupervised way.
arXiv Detail & Related papers (2021-03-16T07:03:04Z)
- Going beyond p-convolutions to learn grayscale morphological operators [64.38361575778237]
In this work, we present two new morphological layers based on the same principle as the p-convolutional layer.
arXiv Detail & Related papers (2021-02-19T17:22:16Z)
- Advances in the training, pruning and enforcement of shape constraints of Morphological Neural Networks using Tropical Algebra [40.327435646554115]
We study neural networks based on the morphological operators of dilation and erosion.
Our contributions include the training of morphological networks via Difference-of-Convex programming methods and the extension of a binary morphological classifier to multiclass tasks.
arXiv Detail & Related papers (2020-11-15T22:44:25Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
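Since the entry above only names the CLARS algorithm, here is a minimal, hedged sketch of the general idea of layer-wise adaptive rate scaling in the spirit of LARS/CLARS, not the authors' exact method; the function name `layerwise_sgd_step` and the default coefficients are illustrative assumptions.
```python
# Minimal sketch of layer-wise adaptive rate scaling (LARS-style), assuming a
# plain SGD update; not the CLARS algorithm from the paper.
import torch


def layerwise_sgd_step(params, base_lr=0.1, trust_coef=0.001, eps=1e-8):
    """One SGD step with a per-tensor (per-layer) learning-rate multiplier."""
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            w_norm = float(p.norm())
            g_norm = float(p.grad.norm())
            # Local rate: trust_coef * ||w|| / ||g||; fall back to 1.0 for
            # zero-norm tensors such as freshly initialized biases.
            local_lr = trust_coef * w_norm / (g_norm + eps) if w_norm > 0 else 1.0
            p.add_(p.grad, alpha=-base_lr * local_lr)
```
For example, calling `layerwise_sgd_step(model.parameters())` after `loss.backward()` would replace a plain SGD update; the trust coefficient acts as a per-layer learning-rate correction that keeps update magnitudes comparable across layers in large-batch training.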