End-To-End Data-Dependent Routing in Multi-Path Neural Networks
- URL: http://arxiv.org/abs/2107.02450v1
- Date: Tue, 6 Jul 2021 07:58:07 GMT
- Title: End-To-End Data-Dependent Routing in Multi-Path Neural Networks
- Authors: Dumindu Tissera, Kasun Vithanage, Rukshan Wijesinghe, Subha Fernando,
Ranga Rodrigo
- Abstract summary: We propose the use of multi-path neural networks with data-dependent resource allocation among parallel computations within layers.
Our networks show superior performance to existing widening and adaptive feature extraction methods, and even to ensembles and deeper networks of similar complexity, on image recognition tasks.
- Score: 0.9507070656654633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are known to give better performance with increased depth due
to their ability to learn more abstract features. Although the deepening of
networks has been well established, there is still room for efficient feature
extraction within a layer which would reduce the need for mere parameter
increment. The conventional widening of networks by having more filters in each
layer introduces a quadratic increment of parameters. Having multiple parallel
convolutional/dense operations in each layer solves this problem, but without
any context-dependent allocation of resources among these operations: the
parallel computations tend to learn similar features making the widening
process less effective. Therefore, we propose the use of multi-path neural
networks with data-dependent resource allocation among parallel computations
within layers, which also lets an input be routed end-to-end through these
parallel paths. To do this, we first introduce a cross-prediction based
algorithm between parallel tensors of subsequent layers. Second, we further
reduce the routing overhead by introducing feature-dependent cross-connections
between parallel tensors of successive layers. Our multi-path networks show
superior performance to existing widening and adaptive feature extraction
methods, and even to ensembles and deeper networks of similar complexity, on
image recognition tasks.
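As a rough illustration of the routing idea, here is a minimal PyTorch sketch of a multi-path layer with feature-dependent cross-connections: each path's output is distributed among the paths of the next layer by a gate computed from its own features. The gating design (global average pooling followed by a linear layer and a softmax), the class name, and all layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch (PyTorch) of data-dependent routing between parallel paths.
# The gating design and all sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossConnectedLayer(nn.Module):
    """One multi-path layer whose outputs are re-allocated among paths
    by input-dependent gates (soft cross-connections)."""
    def __init__(self, num_paths: int, in_ch: int, out_ch: int):
        super().__init__()
        self.num_paths = num_paths
        self.paths = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
            for _ in range(num_paths)
        )
        # One gate per source path, predicting how its output is
        # distributed among the destination paths of the next layer.
        self.gates = nn.ModuleList(
            nn.Linear(out_ch, num_paths) for _ in range(num_paths)
        )

    def forward(self, xs):
        # xs: list of num_paths tensors, each of shape (N, in_ch, H, W)
        ys = [conv(x) for conv, x in zip(self.paths, xs)]
        # Routing weights from each source path to every destination path,
        # computed from globally pooled features of that source path.
        ws = [
            F.softmax(gate(y.mean(dim=(2, 3))), dim=1)  # (N, num_paths)
            for gate, y in zip(self.gates, ys)
        ]
        # Destination path j receives a gated sum over all source paths i.
        outs = []
        for j in range(self.num_paths):
            out = sum(
                ws[i][:, j].view(-1, 1, 1, 1) * ys[i]
                for i in range(self.num_paths)
            )
            outs.append(F.relu(out))
        return outs
```

Stacking such layers yields end-to-end routing: the soft gates at every depth decide, per input, how the parallel computations are allocated.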
Related papers
- Layer Collaboration in the Forward-Forward Algorithm [28.856139738073626]
We study layer collaboration in the forward-forward algorithm.
We show that the current version of the forward-forward algorithm is suboptimal when considering information flow in the network.
We propose an improved version that supports layer collaboration to better utilize the network structure.
arXiv Detail & Related papers (2023-05-21T08:12:54Z)
- Sparse Interaction Additive Networks via Feature Interaction Detection and Sparse Selection [10.191597755296163]
We develop a tractable selection algorithm to efficiently identify the necessary feature combinations.
Our proposed Sparse Interaction Additive Networks (SIAN) construct a bridge from simple and interpretable models to fully connected neural networks.
arXiv Detail & Related papers (2022-09-19T19:57:17Z)
- LayerPipe: Accelerating Deep Neural Network Training by Intra-Layer and Inter-Layer Gradient Pipelining and Multiprocessor Scheduling [6.549125450209931]
Training model parameters by backpropagation inherently creates feedback loops.
The proposed system, referred to as LayerPipe, reduces the number of clock cycles required for training.
arXiv Detail & Related papers (2021-08-14T23:51:00Z)
- Layer Folding: Neural Network Depth Reduction using Activation Linearization [0.0]
Modern devices exhibit a high level of parallelism, but real-time latency is still highly dependent on networks' depth.
We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one.
We apply our method to networks pre-trained on CIFAR-10 and CIFAR-100 and find that they can all be transformed into shallower forms that share a similar depth.
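A minimal sketch of the folding step itself, assuming the activation between two fully connected layers has been linearized to the identity; the paper applies this to pre-trained CNNs, where consecutive convolutions compose analogously.

```python
# Once the activation between two linear layers is the identity, they
# compose exactly into one linear layer: W = W2 W1, b = W2 b1 + b2.
import torch
import torch.nn as nn

def fold_linear(l1: nn.Linear, l2: nn.Linear) -> nn.Linear:
    """Return a single layer computing l2(l1(x)) exactly."""
    folded = nn.Linear(l1.in_features, l2.out_features)
    with torch.no_grad():
        folded.weight.copy_(l2.weight @ l1.weight)        # W = W2 W1
        folded.bias.copy_(l2.weight @ l1.bias + l2.bias)  # b = W2 b1 + b2
    return folded

# Sanity check: the folded layer matches the two-layer composition.
l1, l2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(2, 8)
assert torch.allclose(l2(l1(x)), fold_linear(l1, l2)(x), atol=1e-5)
```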
arXiv Detail & Related papers (2021-06-17T08:22:46Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
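A hedged sketch of the inference-time idea, assuming an Adaptive-Computation-Time-style halting rule (the paper's exact stopping criterion may differ): each layer also emits a halting score, and execution stops once the accumulated score passes a threshold, so "easy" inputs exit early.

```python
# Sketch of per-input dynamic depth at inference. The halting rule here
# (accumulated sigmoid scores vs. a threshold) is an assumption for
# illustration, not the paper's exact criterion.
import torch
import torch.nn as nn

class AdaptiveDepthNet(nn.Module):
    def __init__(self, dim: int, max_layers: int, threshold: float = 0.99):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(max_layers))
        self.halt = nn.ModuleList(nn.Linear(dim, 1) for _ in range(max_layers))
        self.threshold = threshold

    @torch.no_grad()
    def forward(self, x):
        # Batch size 1 for clarity; batched halting needs per-sample masks.
        p_total = 0.0
        for layer, halt in zip(self.layers, self.halt):
            x = torch.relu(layer(x))
            p_total += torch.sigmoid(halt(x)).item()
            if p_total >= self.threshold:  # stop early for "easy" inputs
                break
        return x
```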
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture that consists of parallel sub-networks with entirely different architectures.
Experiments on several well-known benchmarks clearly demonstrate the superior performance, good generalization, and strong learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z)
- Feature-Dependent Cross-Connections in Multi-Path Neural Networks [7.230526683545722]
Multi-path networks tend to learn redundant features.
We introduce a mechanism to intelligently allocate incoming feature maps to such paths.
We show improved image recognition accuracy at a similar complexity compared to conventional and state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T17:38:03Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm that solves the verification problem in an iterative manner and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
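A simplified sketch of the latent-vector "handle": here the latent vector merely gates the output channels and an L1 penalty drives its entries to zero, whereas DHP's hypernetwork actually generates the layer weights from the latent vectors of adjacent layers; the class name and sizes are illustrative.

```python
# Simplified "pruning handle" sketch: a per-channel latent vector gates the
# convolution output; an L1 term on it pushes channels toward zero so they
# can be removed. This gating version only illustrates the handle idea.
import torch
import torch.nn as nn

class LatentGatedConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.latent = nn.Parameter(torch.ones(out_ch))  # one entry per channel

    def forward(self, x):
        return self.conv(x) * self.latent.view(1, -1, 1, 1)

    def sparsity_loss(self):
        return self.latent.abs().sum()  # L1 term added to the training loss

# Channels whose latent entry reaches (near) zero can be pruned after training.
```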
arXiv Detail & Related papers (2020-03-30T17:59:18Z)
- Convolutional Networks with Dense Connectivity [59.30634544498946]
We introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion.
For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers.
We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks.
arXiv Detail & Related papers (2020-01-08T06:54:53Z)
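A minimal sketch of dense connectivity, with an illustrative growth rate and layer count; torchvision ships full reference DenseNet implementations.

```python
# Dense block sketch: each layer takes the concatenation of all preceding
# feature maps as input, and its output is fed to all subsequent layers.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch: int, growth: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_ch + i * growth, growth, kernel_size=3, padding=1),
            )
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Every layer sees all earlier feature maps, concatenated.
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)
```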