SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and
Residual Connections for Structure Preserving Object Classification
- URL: http://arxiv.org/abs/2110.02776v1
- Date: Wed, 6 Oct 2021 13:54:49 GMT
- Title: SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and
Residual Connections for Structure Preserving Object Classification
- Authors: Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti
- Abstract summary: In this paper, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing gradient in relation to the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
- Score: 28.02302915971059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Improving existing neural network architectures can involve several design
choices such as manipulating the loss functions, employing a diverse learning
strategy, exploiting gradient evolution at training time, optimizing the
network hyper-parameters, or increasing the architecture depth. The latter
approach is a straightforward solution, since it directly enhances the
representation capabilities of a network; however, the increased depth
generally incurs the well-known vanishing gradient problem. In this paper,
borrowing from different methods addressing this issue, we introduce an
interlaced multi-task learning strategy, named SIRe, to reduce the vanishing
gradient in relation to the object classification task. The presented
methodology directly improves a convolutional neural network (CNN) by enforcing
the input image structure preservation through interlaced auto-encoders, and
further refines the base network architecture by means of skip and residual
connections. To validate the presented methodology, a simple CNN and various
implementations of famous networks are extended via the SIRe strategy and
extensively tested on the CIFAR100 dataset, where the SIRe-extended
architectures achieve significantly improved performance across all models,
thus confirming the effectiveness of the presented approach.
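To make the strategy concrete, here is a minimal PyTorch sketch of the interlaced multi-task idea described in the abstract: intermediate features must both classify the image and reconstruct it, so early layers receive gradient from two tasks. The class name, layer sizes, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SIReStyleCNN(nn.Module):
    """Classifier with an interlaced auto-encoder branch: features feed
    both the classification head and a decoder that reconstructs the
    input, so gradients reach early layers through two tasks."""
    def __init__(self, num_classes=100):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.dec = nn.Conv2d(32, 3, 3, padding=1)   # reconstruction branch
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        h1 = self.enc1(x)
        h2 = h1 + self.enc2(h1)                     # residual connection
        recon = self.dec(h2)                        # structure preservation
        logits = self.head(F.adaptive_avg_pool2d(h2, 1).flatten(1))
        return logits, recon

model = SIReStyleCNN()
x = torch.randn(8, 3, 32, 32)                       # CIFAR100-sized batch
y = torch.randint(0, 100, (8,))
logits, recon = model(x)
# Interlaced multi-task loss: classification plus input reconstruction.
loss = F.cross_entropy(logits, y) + F.mse_loss(recon, x)
loss.backward()
```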
Related papers
- Enhancing Convolutional Neural Networks with Higher-Order Numerical Difference Methods [6.26650196870495]
Convolutional Neural Networks (CNNs) have been able to assist humans in solving many real-world problems.
This paper proposes a stacking scheme based on the linear multi-step method to enhance the performance of CNNs.
arXiv Detail & Related papers (2024-09-08T05:13:58Z)
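The summary does not specify the stacking scheme, but a common way to realize a linear multi-step method over residual blocks is a two-step Adams-Bashforth-style update. The following PyTorch sketch illustrates that reading; all names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiStepStack(nn.Module):
    """Stacks residual blocks with a 2-step Adams-Bashforth-style update:
    x_{n+1} = x_n + 3/2 * f(x_n) - 1/2 * f(x_{n-1}),
    reusing the previous block's output as the history term."""
    def __init__(self, channels, depth):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(depth)
        ])

    def forward(self, x):
        prev_f = None
        for block in self.blocks:
            f = block(x)
            if prev_f is None:
                x = x + f                       # plain residual (Euler) step
            else:
                x = x + 1.5 * f - 0.5 * prev_f  # linear multi-step update
            prev_f = f
        return x

stack = MultiStepStack(channels=16, depth=4)
out = stack(torch.randn(2, 16, 32, 32))
```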
- GFN: A graph feedforward network for resolution-invariant reduced operator learning in multifidelity applications [0.0]
This work presents a novel resolution-invariant model order reduction strategy for multifidelity applications.
We base our architecture on a novel neural network layer developed in this work, the graph feedforward network.
We exploit the method's capability of training and testing on different mesh sizes in an autoencoder-based reduction strategy for parametrised partial differential equations.
arXiv Detail & Related papers (2024-06-05T18:31:37Z)
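The abstract gives only the layer's name, but a layer whose parameters depend on feature width rather than node count can run unchanged on meshes of different sizes, which is one plausible reading of resolution invariance. A hedged PyTorch sketch, with all names hypothetical:

```python
import torch
import torch.nn as nn

class GraphFeedforwardLayer(nn.Module):
    """Node-wise linear map plus neighbour aggregation; parameters depend
    only on feature width, so the same layer applies to meshes of any
    resolution (the resolution-invariance idea)."""
    def __init__(self, in_feats, out_feats):
        super().__init__()
        self.self_lin = nn.Linear(in_feats, out_feats)
        self.nbr_lin = nn.Linear(in_feats, out_feats)

    def forward(self, x, adj):
        # x: (num_nodes, in_feats); adj: (num_nodes, num_nodes), row-normalised
        return torch.relu(self.self_lin(x) + self.nbr_lin(adj @ x))

layer = GraphFeedforwardLayer(8, 8)
for n in (50, 200):                        # two different mesh sizes
    adj = torch.rand(n, n)
    adj = adj / adj.sum(dim=1, keepdim=True)
    print(layer(torch.randn(n, 8), adj).shape)
```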
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
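One way to prototype such spatial gradient scaling is a backward hook that rescales the convolution weight gradient per kernel position. The fixed mask below is only a placeholder; the paper derives its scaling from spatial statistics.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 16, 3, padding=1)

# Illustrative 3x3 mask: emphasise the kernel centre over its periphery
# (a placeholder for the paper's learned, statistics-driven scaling).
scale = torch.tensor([[0.5, 0.5, 0.5],
                      [0.5, 2.0, 0.5],
                      [0.5, 0.5, 0.5]])

# Rescale the weight gradient per spatial tap during backprop.
conv.weight.register_hook(lambda g: g * scale)

out = conv(torch.randn(1, 16, 8, 8)).sum()
out.backward()
```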
- RDRN: Recursively Defined Residual Network for Image Super-Resolution [58.64907136562178]
Deep convolutional neural networks (CNNs) have achieved remarkable performance in single-image super-resolution.
We propose a novel network architecture which utilizes attention blocks efficiently.
arXiv Detail & Related papers (2022-11-17T11:06:29Z)
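The recursion suggested by the title can be sketched as residual blocks whose residual branches are themselves smaller residual blocks. A hypothetical PyTorch illustration, not the paper's exact attention-based architecture:

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    """Wraps any module with an identity shortcut."""
    def __init__(self, inner):
        super().__init__()
        self.inner = inner
    def forward(self, x):
        return x + self.inner(x)

def make_block(channels, depth):
    """Recursively defined residual block: depth 0 is a plain conv;
    deeper levels wrap two smaller such blocks in a residual shortcut."""
    if depth == 0:
        return nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
    return Residual(nn.Sequential(make_block(channels, depth - 1),
                                  make_block(channels, depth - 1)))

net = make_block(channels=8, depth=3)   # residual-in-residual nesting
out = net(torch.randn(1, 8, 16, 16))
```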
- Graph-based Algorithm Unfolding for Energy-aware Power Allocation in Wireless Networks [27.600081147252155]
We develop a novel graph-based framework to maximize energy efficiency in wireless communication networks.
We show the permutation equivariance of the proposed architecture, a desirable property for models of wireless network data.
Results demonstrate its generalizability across different network topologies.
arXiv Detail & Related papers (2022-01-27T20:23:24Z)
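Algorithm unfolding generally turns the iterations of an optimizer into network layers with learnable parameters. The sketch below unrolls projected-gradient-style power updates with learnable step sizes as a generic stand-in for the paper's architecture; the rate surrogate and all names are assumptions.

```python
import torch
import torch.nn as nn

class UnfoldedPowerAllocator(nn.Module):
    """Unrolls K gradient-style updates on transmit powers into K layers,
    each with its own learnable step size (the algorithm-unfolding idea)."""
    def __init__(self, num_layers=5):
        super().__init__()
        self.steps = nn.Parameter(torch.full((num_layers,), 0.1))

    def rate(self, p, gain):
        # Sum-rate surrogate: diagonal gains carry signal, off-diagonal
        # entries contribute interference; unit noise power assumed.
        signal = gain.diagonal(dim1=-2, dim2=-1) * p
        interference = (gain * p.unsqueeze(-2)).sum(-1) - signal
        return torch.log1p(signal / (interference + 1.0)).sum(-1)

    def forward(self, gain, p_max=1.0):
        p = torch.full(gain.shape[:-1], 0.5 * p_max, requires_grad=True)
        for step in self.steps:
            grad, = torch.autograd.grad(self.rate(p, gain).sum(), p,
                                        create_graph=True)
            p = (p + step * grad).clamp(0.0, p_max)  # project onto [0, p_max]
        return p

alloc = UnfoldedPowerAllocator()
gains = torch.rand(4, 6, 6)                 # 4 networks, 6 links each
powers = alloc(gains)
```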
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
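A common mechanism for adapting architectures jointly with weights is a softly weighted mixture of candidate operations, in the style of DARTS. The sketch below uses that as an assumed stand-in for the paper's adaptation scheme, not its actual method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveLayer(nn.Module):
    """Softly mixes candidate operations with learnable architecture
    weights, so the structure can adapt per target task while the
    operation weights are finetuned."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)     # architecture weights
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

layer = AdaptiveLayer(16)
out = layer(torch.randn(2, 16, 8, 8))
```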
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
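Growing a network without disturbing its current function can be done by adding units whose outgoing weights start at zero. This toy sketch shows only that safe-growth step; firefly descent's gradient-based rule for choosing where and how to grow is omitted.

```python
import torch
import torch.nn as nn

def grow_wider(layer_in: nn.Linear, layer_out: nn.Linear, extra: int):
    """Grow the hidden width between two linear layers by `extra` units.
    New outgoing weights start at zero, so the network's function is
    unchanged at the moment of growth."""
    new_in = nn.Linear(layer_in.in_features, layer_in.out_features + extra)
    new_out = nn.Linear(layer_out.in_features + extra, layer_out.out_features)
    with torch.no_grad():
        new_in.weight[:layer_in.out_features] = layer_in.weight
        new_in.bias[:layer_in.out_features] = layer_in.bias
        new_in.weight[layer_in.out_features:].normal_(std=1e-3)  # new units
        new_out.weight[:, :layer_out.in_features] = layer_out.weight
        new_out.weight[:, layer_out.in_features:].zero_()        # zero outgoing
        new_out.bias.copy_(layer_out.bias)
    return new_in, new_out

fc1, fc2 = nn.Linear(10, 32), nn.Linear(32, 5)
fc1, fc2 = grow_wider(fc1, fc2, extra=8)    # now 40 hidden units
```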
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using a single fixed path through the network, DG-Net aggregates features dynamically at each node, which gives the network greater representational ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
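A single DG-Net-style node can be sketched as a gate that predicts per-sample weights over incoming edges before aggregation. The following is an assumed illustration of that instance-aware connectivity, not the paper's exact block.

```python
import torch
import torch.nn as nn

class DynamicNode(nn.Module):
    """One node of an instance-aware DAG: a tiny gate predicts a weight
    per incoming edge from the inputs themselves, so each sample can use
    a different effective connectivity."""
    def __init__(self, channels, num_inputs):
        super().__init__()
        self.gate = nn.Linear(channels * num_inputs, num_inputs)
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, inputs):                  # list of (B, C, H, W) tensors
        pooled = torch.cat([x.mean(dim=(2, 3)) for x in inputs], dim=1)
        w = torch.softmax(self.gate(pooled), dim=1)       # (B, num_inputs)
        mixed = sum(w[:, i, None, None, None] * x for i, x in enumerate(inputs))
        return self.block(mixed)

node = DynamicNode(channels=16, num_inputs=3)
feats = [torch.randn(2, 16, 8, 8) for _ in range(3)]
out = node(feats)
```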
- Multiresolution Convolutional Autoencoders [5.0169726108025445]
We propose a multi-resolution convolutional autoencoder architecture that integrates and leverages three successful mathematical architectures.
Basic learning techniques are applied to ensure information learned from previous training steps can be rapidly transferred to the larger network.
The performance gains are illustrated through a sequence of numerical experiments on synthetic examples and real-world spatial data.
arXiv Detail & Related papers (2020-04-10T08:31:59Z)
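Because convolutional weights are independent of spatial size, an autoencoder trained at a coarse resolution can initialize the same architecture at a finer one. A minimal sketch of that transfer step (training loops omitted; the architecture is illustrative, not the paper's):

```python
import torch
import torch.nn as nn

def make_autoencoder(channels=16):
    # Fully convolutional, so the same weights run at any resolution.
    return nn.Sequential(
        nn.Conv2d(1, channels, 3, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(channels, 1, 4, stride=2, padding=1),
    )

coarse = make_autoencoder()
# ... train `coarse` on 16x16 inputs here ...

fine = make_autoencoder()
fine.load_state_dict(coarse.state_dict())   # transfer coarse-level weights
out = fine(torch.randn(2, 1, 64, 64))       # continue training at 64x64
print(out.shape)                            # torch.Size([2, 1, 64, 64])
```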
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
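The DSN-inspired side branches can be sketched as auxiliary heads forked from intermediate layers, with a mimicking term aligning each branch to the main head. One plausible PyTorch reading, with hypothetical names and a KL-divergence mimicking loss assumed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SideBranchNet(nn.Module):
    """Backbone with an auxiliary head forked from an intermediate layer.
    Training supervises both heads and aligns the side branch with the
    final head (mimicking)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.side_head = nn.Linear(16, num_classes)   # forked side branch
        self.main_head = nn.Linear(32, num_classes)

    def forward(self, x):
        h1 = self.stage1(x)
        h2 = self.stage2(h1)
        side = self.side_head(h1.mean(dim=(2, 3)))
        main = self.main_head(h2.mean(dim=(2, 3)))
        return main, side

net = SideBranchNet()
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
main, side = net(x)
mimic = F.kl_div(F.log_softmax(side, 1), F.softmax(main.detach(), 1),
                 reduction="batchmean")
loss = F.cross_entropy(main, y) + F.cross_entropy(side, y) + mimic
loss.backward()
```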
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.