No Routing Needed Between Capsules
- URL: http://arxiv.org/abs/2001.09136v6
- Date: Thu, 17 Jun 2021 20:14:13 GMT
- Title: No Routing Needed Between Capsules
- Authors: Adam Byerly, Tatiana Kalganova, Ian Dear
- Abstract summary: Homogeneous Vector Capsules (HVCs) use element-wise multiplication rather than matrix multiplication.
We show that a simple convolutional neural network using HVCs performs as well as the prior best performing capsule network on MNIST.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most capsule network designs rely on traditional matrix multiplication
between capsule layers and computationally expensive routing mechanisms to deal
with the capsule dimensional entanglement that the matrix multiplication
introduces. By using Homogeneous Vector Capsules (HVCs), which use element-wise
multiplication rather than matrix multiplication, the dimensions of the
capsules remain unentangled. In this work, we study HVCs as applied to the
highly structured MNIST dataset in order to produce a direct comparison to the
capsule research direction of Geoffrey Hinton, et al. In our study, we show
that a simple convolutional neural network using HVCs performs as well as the
prior best performing capsule network on MNIST using 5.5x fewer parameters, 4x
fewer training epochs, no reconstruction sub-network, and requiring no routing
mechanism. The addition of multiple classification branches to the network
establishes a new state of the art for the MNIST dataset with an accuracy of
99.87% for an ensemble of these models, as well as establishing a new state of
the art for a single model (99.83% accurate).
Related papers
- Deep multi-prototype capsule networks [0.3823356975862005]
Capsule networks are a type of neural network that identifies parts of an image and hierarchically forms the instantiation parameters of a whole.
This paper presents a multi-prototype architecture for guiding capsule networks to represent the variations in the image parts.
The experimental results on MNIST, SVHN, C-Cube, CEDAR, MCYT, and UTSig datasets reveal that the proposed model outperforms others regarding image classification accuracy.
arXiv Detail & Related papers (2024-04-23T18:37:37Z)
- OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning [21.5857226735951]
Redundancy is a persistent challenge in Capsule Networks (CapsNet).
We propose an Orthogonal Capsule Network (OrthCaps) to reduce redundancy, improve routing performance and decrease parameter counts.
arXiv Detail & Related papers (2024-03-20T07:25:24Z)
- Parameter-Efficient Masking Networks [61.43995077575439]
Advanced network designs often contain a large number of repetitive structures (e.g., Transformer).
In this study, we are the first to investigate the representative potential of fixed random weights with limited unique values by learning masks.
It leads to a new paradigm for model compression to diminish the model size; a rough sketch of the mask-learning idea follows this entry.
arXiv Detail & Related papers (2022-10-13T03:39:03Z)
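The Parameter-Efficient Masking Networks entry above describes learning masks over fixed random weights. Purely as a generic illustration of that idea (a minimal sketch under my own assumptions, not that paper's method, and the "limited unique values" aspect is omitted), a layer can freeze a random weight tensor and train only a per-weight score that is thresholded into a binary mask on each forward pass:

```python
import torch
import torch.nn as nn


class MaskedLinear(nn.Module):
    """Linear layer with frozen random weights; only a per-weight score is trained.

    The scores are thresholded into a binary mask each forward pass, so the
    effective learned parameters are the mask bits rather than the weight values.
    """

    def __init__(self, in_features, out_features, keep_ratio=0.5):
        super().__init__()
        weight = torch.randn(out_features, in_features) / in_features ** 0.5
        self.register_buffer("weight", weight)          # fixed, never updated
        self.scores = nn.Parameter(torch.randn_like(weight) * 0.01)
        self.keep_ratio = keep_ratio

    def forward(self, x):
        k = max(1, int(self.keep_ratio * self.scores.numel()))
        # keep the top-k scores: the k-th largest score becomes the threshold
        threshold = torch.topk(self.scores.flatten(), k).values[-1]
        hard_mask = (self.scores >= threshold).float()
        # straight-through estimator: forward uses the hard mask,
        # gradients flow to the real-valued scores
        mask = hard_mask + self.scores - self.scores.detach()
        return nn.functional.linear(x, self.weight * mask)


if __name__ == "__main__":
    layer = MaskedLinear(64, 10)
    out = layer(torch.randn(4, 64))
    out.sum().backward()
    print(out.shape, layer.scores.grad is not None)   # torch.Size([4, 10]) True
```

The trainable state is then the score tensor, which after thresholding amounts to roughly one bit per weight, rather than the weight values themselves.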
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline; a generic pruning-and-quantization sketch follows this entry.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
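The compact-representation entry above combines weight pruning with quantization. As a generic illustration only (a minimal NumPy sketch of those two ingredients, not the source-coding storage format proposed in that paper, and the sparsity, bit width, and size accounting are assumptions), magnitude pruning followed by uniform quantization of the surviving weights looks like this:

```python
import numpy as np


def prune_and_quantize(weights, sparsity=0.9, bits=8):
    """Magnitude-prune a weight tensor, then uniformly quantize the survivors.

    Returns the dequantized weights (for inspecting the approximation error)
    plus the pieces one would actually store: the mask, the integer codes,
    and the value range.
    """
    # 1. Pruning: zero out the smallest-magnitude fraction of the weights.
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    survivors = weights[mask]

    # 2. Quantization: map surviving values onto 2**bits uniform levels.
    lo, hi = survivors.min(), survivors.max()
    levels = 2 ** bits - 1
    codes = np.round((survivors - lo) / (hi - lo) * levels).astype(np.uint8)

    # Dequantize to see what the compressed network would actually use.
    restored = np.zeros_like(weights)
    restored[mask] = codes / levels * (hi - lo) + lo
    return restored, mask, codes, (lo, hi)


if __name__ == "__main__":
    w = np.random.randn(256, 256).astype(np.float32)
    restored, mask, codes, _ = prune_and_quantize(w)
    dense_bytes = w.nbytes                                # 32-bit floats
    stored_bytes = mask.size // 8 + codes.nbytes + 8      # bitmap + 8-bit codes + range
    print(f"kept {mask.mean():.0%} of weights, "
          f"storage ~{stored_bytes / dense_bytes:.1%} of dense size")
```

A real compact format would additionally entropy-code the mask and the integer codes, which is presumably where the source coding mentioned in the summary comes in.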
- A Deeper Look into Convolutions via Pruning [9.89901717499058]
Modern architectures contain a very small number of fully-connected layers, often at the end, after multiple layers of convolutions.
Although this strategy already reduces the number of parameters, most of the convolutions can be eliminated as well, without suffering any loss in recognition performance.
In this work, we use eigenvalue-based matrix characteristics, in addition to the classical weight-based importance assignment approach for pruning, to shed light on the internal mechanisms of a widely used family of CNNs.
arXiv Detail & Related papers (2021-02-04T18:55:03Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
- Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
arXiv Detail & Related papers (2020-02-07T17:51:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.