WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
- URL: http://arxiv.org/abs/2301.01352v1
- Date: Tue, 3 Jan 2023 20:57:22 GMT
- Title: WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
- Authors: Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis and Moncef
Gabbouj
- Abstract summary: Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
- Score: 98.78384185493624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are composed of multiple layers arranged in a hierarchical
structure jointly trained with a gradient-based optimization, where the errors
are back-propagated from the last layer back to the first one. At each
optimization step, neurons at a given layer receive feedback from neurons
belonging to higher layers of the hierarchy. In this paper, we propose to
complement this traditional 'between-layer' feedback with additional
'within-layer' feedback to encourage the diversity of the activations within
the same layer. To this end, we measure the pairwise similarity between the
outputs of the neurons and use it to model the layer's overall diversity. We
present an extensive empirical study confirming that the proposed approach
enhances the performance of several state-of-the-art neural network models in
multiple tasks. The code is publicly available at
https://github.com/firasl/AAAI-23-WLD-Reg
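For illustration, the following is a minimal PyTorch sketch of the within-layer idea: treat each neuron's outputs over a batch as a vector, measure pairwise (here cosine) similarity between neurons of the same layer, and add the resulting penalty to the task loss. This is a generic sketch rather than the paper's exact WLD-Reg formulation; the similarity measure, the choice of layer, and the weight `lam` are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def within_layer_diversity_penalty(activations: torch.Tensor) -> torch.Tensor:
    """Penalize pairwise similarity between neuron outputs within one layer.

    activations: (batch_size, num_neurons) outputs of a single layer.
    Returns a scalar that grows when neurons respond similarly over the batch.
    """
    per_neuron = F.normalize(activations.t(), dim=1)   # (num_neurons, batch) unit-norm responses
    sim = per_neuron @ per_neuron.t()                  # pairwise cosine similarities
    off_diag = sim - torch.diag(torch.diag(sim))       # drop self-similarity
    return (off_diag ** 2).mean()

# Usage: add the penalty for a chosen hidden layer to the task loss.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
hidden = model[1](model[0](x))                         # hidden-layer activations
logits = model[2](hidden)
lam = 0.01                                             # regularization strength (assumed)
loss = F.cross_entropy(logits, y) + lam * within_layer_diversity_penalty(hidden)
loss.backward()
```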
Related papers
- Towards Optimal Customized Architecture for Heterogeneous Federated
Learning with Contrastive Cloud-Edge Model Decoupling [20.593232086762665]
Federated learning, as a promising distributed learning paradigm, enables collaborative training of a global model across multiple network edge clients without the need for central data collection.
We propose a novel federated learning framework called FedCMD, a model-decoupling approach tailored to cloud-edge supported federated learning.
Our motivation is that, through an in-depth investigation of the performance of selecting different neural network layers as the personalized head, we found that rigidly assigning the last layer as the personalized head, as in current studies, is not always optimal.
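As a tiny illustration of the decoupling point only (layer sizes, the split index, and the helper name are hypothetical, and this is not FedCMD's head-selection rule), a layered model can be split into a shared body and a personalized head at any chosen layer, not necessarily the last one:

```python
import torch.nn as nn

def split_model(layers, head_index):
    """Split a layered model into a shared body (aggregated across clients)
    and a personalized head (kept local). head_index need not point at the
    last layer."""
    shared = nn.Sequential(*layers[:head_index])
    personalized = nn.Sequential(*layers[head_index:])
    return shared, personalized

layers = [nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10)]
shared, personalized = split_model(layers, head_index=2)  # personalize from the 3rd layer on
```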
arXiv Detail & Related papers (2024-03-04T05:10:28Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs [2.752817022620644]
Clustering (or community detection) on multilayer graphs poses several additional complications.
One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment.
We propose a parameter-free Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels.
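A rough NumPy sketch of the underlying setup: per-layer graph Laplacians are aggregated with layer weights, and the available labels are propagated through the combined graph. Here the weights are fixed by hand, whereas the paper learns an optimal nonlinear combination from the input labels; the function names and the toy data are assumptions.

```python
import numpy as np

def propagate_labels(laplacians, weights, y, labeled_mask, alpha=1.0):
    """Label propagation on a multilayer graph via a weighted combination
    of per-layer Laplacians (weights fixed here, learned in the paper)."""
    L = sum(w * Lk for w, Lk in zip(weights, laplacians))  # layer aggregation
    M = np.diag(labeled_mask.astype(float))                # selects labeled nodes
    # Minimize ||M (f - y)||^2 + alpha * f^T L f  =>  (M + alpha L) f = M y
    return np.linalg.solve(M + alpha * L, M @ y)

def random_laplacian(n, rng):
    """Toy Laplacian of a random weighted graph on n nodes."""
    A = rng.random((n, n))
    A = (A + A.T) / 2
    np.fill_diagonal(A, 0)
    return np.diag(A.sum(1)) - A

rng = np.random.default_rng(0)
laps = [random_laplacian(5, rng), random_laplacian(5, rng)]
y = np.array([1.0, -1.0, 0.0, 0.0, 0.0])            # labels on the first two nodes
mask = np.array([True, True, False, False, False])
print(propagate_labels(laps, weights=[0.7, 0.3], y=y, labeled_mask=mask))
```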
arXiv Detail & Related papers (2023-05-31T19:50:11Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Layerwise Sparsifying Training and Sequential Learning Strategy for Neural Architecture Adaptation [0.0]
This work presents a two-stage framework for developing neural architectures that adapt and generalize well on a given training data set.
In the first stage, a manifold-regularized layerwise sparsifying training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers.
In the second stage, a sequential learning process is adopted where a sequence of small networks is employed to extract information from the residual produced in stage I.
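A minimal PyTorch sketch of the first-stage idea only: hidden layers are added one at a time and trained while all previously added layers stay frozen. The manifold regularization and sparsification terms, as well as the second-stage residual networks, are omitted; layer widths, the temporary output head, and the training loop are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grow_and_train(x, y, widths, epochs=50, lr=1e-2):
    """Grow a network layer by layer; each new layer is trained with the
    earlier layers frozen (first-stage idea only, no sparsification)."""
    frozen, in_dim = [], x.shape[1]
    for width in widths:
        new_layer = nn.Linear(in_dim, width)
        head = nn.Linear(width, y.shape[1])                 # temporary output head
        opt = torch.optim.Adam(list(new_layer.parameters()) + list(head.parameters()), lr=lr)
        for _ in range(epochs):
            h = x
            with torch.no_grad():                           # frozen earlier layers
                for layer in frozen:
                    h = torch.relu(layer(h))
            pred = head(torch.relu(new_layer(h)))
            loss = F.mse_loss(pred, y)
            opt.zero_grad(); loss.backward(); opt.step()
        for p in new_layer.parameters():                    # freeze the newly trained layer
            p.requires_grad_(False)
        frozen.append(new_layer)
        in_dim = width
    return frozen

x, y = torch.randn(256, 8), torch.randn(256, 1)             # toy regression data
layers = grow_and_train(x, y, widths=[32, 32, 16])
```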
arXiv Detail & Related papers (2022-11-13T09:51:16Z)
- Multilevel-in-Layer Training for Deep Neural Network Regression [1.6185544531149159]
We present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks.
We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer.
arXiv Detail & Related papers (2022-11-11T23:53:46Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Optimization Theory for ReLU Neural Networks Trained with Normalization Layers [82.61117235807606]
The success of deep neural networks is in part due to the use of normalization layers.
Our analysis shows how the introduction of normalization changes the optimization landscape and can enable faster convergence.
arXiv Detail & Related papers (2020-06-11T23:55:54Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
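A minimal PyTorch sketch of the side-branch idea: auxiliary heads are forked from intermediate layers and their losses are combined with the main objective, so earlier layers receive more direct supervision. This is a generic deeply-supervised setup, not the paper's specific mimicking losses; the architecture and the loss weights are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SideBranchNet(nn.Module):
    """Backbone with auxiliary classifiers forked from intermediate layers."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.aux1 = nn.Linear(64, num_classes)   # side branch after block1
        self.aux2 = nn.Linear(64, num_classes)   # side branch after block2
        self.head = nn.Linear(64, num_classes)   # main classifier

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        return self.aux1(h1), self.aux2(h2), self.head(h2)

model = SideBranchNet()
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
a1, a2, out = model(x)
# Side-branch losses are added to the main loss (0.3 is an assumed weight).
loss = F.cross_entropy(out, y) + 0.3 * (F.cross_entropy(a1, y) + F.cross_entropy(a2, y))
loss.backward()
```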
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Breaking Batch Normalization for better explainability of Deep Neural Networks through Layer-wise Relevance Propagation [2.654526698055524]
We build an equivalent network fusing normalization layers and convolutional or fully connected layers.
Heatmaps obtained with our method on the MNIST and CIFAR-10 datasets are more accurate for convolutional layers.
arXiv Detail & Related papers (2020-02-24T13:06:55Z)
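The last entry relies on folding a batch-normalization layer into the preceding convolutional (or fully connected) layer so that relevance propagation sees a single equivalent linear layer. Below is a minimal PyTorch sketch of standard conv-BN folding for inference; the paper's exact construction may differ.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm2d into the preceding Conv2d, producing one
    equivalent convolution (standard BN folding, inference only)."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)    # per-channel gamma / sigma
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused

# Usage: replace conv+BN pairs before running layer-wise relevance propagation.
conv, bn = nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16)
conv.eval(); bn.eval()
fused = fuse_conv_bn(conv, bn)
x = torch.randn(1, 3, 8, 8)
assert torch.allclose(fused(x), bn(conv(x)), atol=1e-5)
```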
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.