WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
- URL: http://arxiv.org/abs/2301.01352v1
- Date: Tue, 3 Jan 2023 20:57:22 GMT
- Title: WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
- Authors: Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis and Moncef
Gabbouj
- Abstract summary: Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
- Score: 98.78384185493624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are composed of multiple layers arranged in a hierarchical
structure jointly trained with a gradient-based optimization, where the errors
are back-propagated from the last layer back to the first one. At each
optimization step, neurons at a given layer receive feedback from neurons
belonging to higher layers of the hierarchy. In this paper, we propose to
complement this traditional 'between-layer' feedback with additional
'within-layer' feedback to encourage the diversity of the activations within
the same layer. To this end, we measure the pairwise similarity between the
outputs of the neurons and use it to model the layer's overall diversity. We
present an extensive empirical study confirming that the proposed approach
enhances the performance of several state-of-the-art neural network models in
multiple tasks. The code is publicly available at
\url{https://github.com/firasl/AAAI-23-WLD-Reg}
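The abstract above describes the core mechanism: measure the pairwise similarity between the outputs of neurons in the same layer and use it as a diversity term that complements the usual between-layer gradient feedback. The sketch below shows one way such a penalty could be computed in PyTorch; the function name `within_layer_diversity_penalty`, the squared-cosine form of the penalty, and the weight `lambda_div` are illustrative assumptions rather than the authors' exact formulation, which is available in the linked repository.

```python
# Minimal sketch (assumed form, not the official WLD-Reg implementation)
# of a within-layer diversity penalty based on pairwise similarity of
# neuron outputs over a mini-batch.
import torch


def within_layer_diversity_penalty(activations: torch.Tensor) -> torch.Tensor:
    """Penalize redundancy among neurons of a single layer.

    activations: (batch_size, num_neurons) outputs of one layer for a
    mini-batch; each column is one neuron's response over the batch.
    """
    # Normalize each neuron's response vector so the Gram matrix holds
    # pairwise cosine similarities between neurons.
    responses = activations.t()                           # (neurons, batch)
    responses = torch.nn.functional.normalize(responses, dim=1)
    similarity = responses @ responses.t()                # (neurons, neurons)

    # Drop self-similarity on the diagonal and average the squared
    # off-diagonal entries: near 0 when neurons respond very differently
    # (diverse), large when many neurons produce nearly identical outputs.
    n = similarity.size(0)
    off_diag = similarity - torch.eye(n, device=similarity.device)
    return (off_diag ** 2).sum() / (n * (n - 1))


if __name__ == "__main__":
    # Toy usage: add the penalty for a chosen hidden layer to the task loss.
    # `lambda_div` is an assumed name for the trade-off weight.
    torch.manual_seed(0)
    x = torch.randn(32, 10)
    layer = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU())
    hidden = layer(x)
    task_loss = hidden.mean()          # placeholder for the real task loss
    lambda_div = 1e-2
    loss = task_loss + lambda_div * within_layer_diversity_penalty(hidden)
    loss.backward()
```

In practice such a penalty would be evaluated on the activations of one or more chosen hidden layers and added to the task loss, so that gradient descent receives both the usual between-layer error signal and an explicit within-layer diversity signal.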
Related papers
- Discovering Message Passing Hierarchies for Mesh-Based Physics Simulation [61.89682310797067]
We introduce DHMP, which learns Dynamic Hierarchies for Message Passing networks through a differentiable node selection method.
Our experiments demonstrate the effectiveness of DHMP, achieving 22.7% improvement on average compared to recent fixed-hierarchy message passing networks.
arXiv Detail & Related papers (2024-10-03T15:18:00Z) - Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of such an architecture, called lexicographic hybrid deep neural network (LH-DNN), has been possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z) - Towards Optimal Customized Architecture for Heterogeneous Federated
Learning with Contrastive Cloud-Edge Model Decoupling [20.593232086762665]
Federated learning, as a promising distributed learning paradigm, enables collaborative training of a global model across multiple network edge clients without the need for central data collection.
We propose a novel federated learning framework called FedCMD, a model decoupling tailored to the Cloud-edge supported federated learning.
Our motivation is that, through a deep investigation of the performance of selecting different neural network layers as the personalized head, we found that rigidly assigning the last layer as the personalized head, as in current studies, is not always optimal.
arXiv Detail & Related papers (2024-03-04T05:10:28Z) - Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z) - Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for
Semi-Supervised Learning on Multilayer Graphs [2.752817022620644]
Clustering (or community detection) on multilayer graphs poses several additional complications.
One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment.
We propose a parameter-free Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels.
arXiv Detail & Related papers (2023-05-31T19:50:11Z) - Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z) - Multilevel-in-Layer Training for Deep Neural Network Regression [1.6185544531149159]
We present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks.
We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer.
arXiv Detail & Related papers (2022-11-11T23:53:46Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with
Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z) - Breaking Batch Normalization for better explainability of Deep Neural
Networks through Layer-wise Relevance Propagation [2.654526698055524]
We build an equivalent network fusing normalization layers and convolutional or fully connected layers.
Heatmaps obtained with our method on MNIST and CIFAR-10 datasets are more accurate for convolutional layers.
arXiv Detail & Related papers (2020-02-24T13:06:55Z)