Capsule networks with non-iterative cluster routing
- URL: http://arxiv.org/abs/2109.09213v1
- Date: Sun, 19 Sep 2021 20:14:22 GMT
- Title: Capsule networks with non-iterative cluster routing
- Authors: Zhihao Zhao, Samuel Cheng
- Abstract summary: In existing routing procedures, capsules produce predictions (termed votes) for capsules of the next layer.
In the proposed cluster routing, capsules produce vote clusters instead of individual votes for next-layer capsules.
The proposed capsule networks achieve the best accuracy on the Fashion-MNIST and SVHN datasets with fewer parameters.
- Score: 2.8935588665357077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capsule networks use routing algorithms to flow information between
consecutive layers. In the existing routing procedures, capsules produce
predictions (termed votes) for capsules of the next layer. In a nutshell, the
next-layer capsule's input is a weighted sum over all the votes it receives. In
this paper, we propose non-iterative cluster routing for capsule networks. In
the proposed cluster routing, capsules produce vote clusters instead of
individual votes for next-layer capsules, and each vote cluster sends its
centroid to a next-layer capsule. Generally speaking, the next-layer capsule's
input is a weighted sum over the centroid of each vote cluster it receives. The
centroid that comes from a cluster with a smaller variance is assigned a larger
weight in the weighted sum process. Compared with the state-of-the-art capsule
networks, the proposed capsule networks achieve the best accuracy on the
Fashion-MNIST and SVHN datasets with fewer parameters, and achieve the best
accuracy on the smallNORB and CIFAR-10 datasets with a moderate number of
parameters. The proposed capsule networks also produce capsules with
disentangled representation and generalize well to images captured at novel
viewpoints. The proposed capsule networks also preserve 2D spatial information
of an input image in the capsule channels: if the capsule channels are rotated,
the object reconstructed from these channels will be rotated by the same
transformation. Codes are available at
https://github.com/ZHAOZHIHAO/ClusterRouting.
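The routing step described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation (see the linked repository for that); in particular, using a softmax over negative cluster variances to realize "smaller variance gets larger weight" is an assumption of this sketch.

```python
import numpy as np

def cluster_routing(votes):
    """Non-iterative cluster routing for ONE next-layer capsule (sketch).

    votes: array of shape (num_clusters, votes_per_cluster, capsule_dim),
           the vote clusters produced for this next-layer capsule.
    Returns the capsule's input: a weighted sum of cluster centroids,
    where clusters with smaller variance receive larger weights.
    """
    # Each vote cluster sends its centroid to the next-layer capsule.
    centroids = votes.mean(axis=1)                # (C, D)
    # Scalar spread per cluster: variance over votes, averaged over dims.
    variances = votes.var(axis=1).mean(axis=-1)   # (C,)
    # Smaller variance -> larger weight; a softmax over negative variance
    # is one simple way to realize this (assumption of this sketch).
    logits = -variances
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # The next-layer capsule's input is the weighted sum of centroids.
    return (weights[:, None] * centroids).sum(axis=0)
```

With a tight cluster (low variance) and a loose one (high variance), the output is dominated by the tight cluster's centroid, which is exactly the behavior the abstract describes.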
Related papers
- Mamba Capsule Routing Towards Part-Whole Relational Camouflaged Object Detection [98.6460229237143]
We propose a novel mamba capsule routing at the type level.
These type-level mamba capsules are fed into the EM routing algorithm to get the high-layer mamba capsules.
On top of that, pixel-level capsule features for the final camouflaged-object prediction are retrieved on the basis of the low-layer pixel-level capsules.
arXiv Detail & Related papers (2024-10-05T00:20:22Z)
- Deep multi-prototype capsule networks [0.3823356975862005]
Capsule networks are a type of neural network that identify image parts and hierarchically compose the instantiation parameters of a whole.
This paper presents a multi-prototype architecture for guiding capsule networks to represent the variations in the image parts.
The experimental results on MNIST, SVHN, C-Cube, CEDAR, MCYT, and UTSig datasets reveal that the proposed model outperforms others regarding image classification accuracy.
arXiv Detail & Related papers (2024-04-23T18:37:37Z)
- HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network [76.92310948325847]
We propose a Hierarchical Parsing Capsule Network (HP-Capsule) for unsupervised face subpart-part discovery.
HP-Capsule extends the application of capsule networks from digits to human faces and takes a step forward to show how the neural networks understand objects without human intervention.
arXiv Detail & Related papers (2022-03-21T01:39:41Z)
- ASPCNet: A Deep Adaptive Spatial Pattern Capsule Network for Hyperspectral Image Classification [47.541691093680406]
This paper proposes an adaptive spatial pattern capsule network (ASPCNet) architecture.
It can rotate the sampling location of convolutional kernels on the basis of an enlarged receptive field.
Experiments on three public datasets demonstrate that ASPCNet can yield competitive performance with higher accuracies than state-of-the-art methods.
arXiv Detail & Related papers (2021-04-25T07:10:55Z)
- Training Deep Capsule Networks with Residual Connections [0.0]
Capsule networks are a type of neural network that have recently gained increased popularity.
They consist of groups of neurons, called capsules, which encode properties of objects or object parts.
Most capsule network implementations use two to three capsule layers, which limits their applicability as expressivity grows exponentially with depth.
We propose a methodology to train deeper capsule networks using residual connections, which is evaluated on four datasets and three different routing algorithms.
Our experimental results show that in fact, performance increases when training deeper capsule networks.
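The approach above can be illustrated with a minimal residual capsule block. This is a sketch only: the `squash` nonlinearity, the shared transform `weight`, and the exact placement of the skip connection are assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Standard capsule squashing nonlinearity: shrinks short vectors
    toward zero and long vectors toward unit length."""
    norm2 = (s ** 2).sum(axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def residual_capsule_block(x, weight):
    """One residual capsule block (illustrative sketch).

    x: (num_capsules, D) capsule poses; weight: (D, D) shared transform.
    The identity skip lets gradients bypass the transform/routing step,
    which is what makes deeper capsule stacks trainable.
    """
    transformed = squash(x @ weight)
    return x + transformed  # residual connection
```

Note that if the transform contributes nothing, the block reduces to the identity, so stacking many such blocks cannot make the network worse than a shallower one; this is the usual argument for residual connections.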
arXiv Detail & Related papers (2021-04-15T11:42:44Z)
- Capsule-Transformer for Neural Machine Translation [73.84254045203222]
The Transformer hugely benefits from its key design, the multi-head self-attention network (SAN).
We propose the capsule-Transformer, which extends the linear transformation into a more general capsule routing algorithm.
Experimental results on widely used machine translation datasets show the proposed capsule-Transformer significantly outperforms a strong Transformer baseline.
arXiv Detail & Related papers (2020-04-30T09:11:38Z)
- Capsules with Inverted Dot-Product Attention Routing [84.89818784286953]
We introduce a new routing algorithm for capsule networks, in which a child capsule is routed to a parent based only on agreement between the parent's state and the child's vote.
Our method improves performance on benchmark datasets such as CIFAR-10 and CIFAR-100.
We believe that our work raises the possibility of applying capsule networks to complex real-world tasks.
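The agreement rule described above can be sketched as a single routing step (the paper's full procedure is iterative; this illustration and the variable names are assumptions, not the paper's code). Normalizing the softmax over parents rather than over children is what "inverts" the usual attention direction.

```python
import numpy as np

def inverted_dot_product_routing(votes, parent_states):
    """One step of agreement-based routing (illustrative sketch).

    votes: (num_children, num_parents, D), each child's vote per parent.
    parent_states: (num_parents, D), current parent capsule states.
    Returns routing weights of shape (num_children, num_parents):
    a child routes to parents in proportion to the dot-product agreement
    between its votes and the parents' states.
    """
    # Agreement = dot product between each vote and the parent's state.
    agreement = np.einsum('cpd,pd->cp', votes, parent_states)
    # Softmax over PARENTS for each child (the "inverted" direction).
    agreement -= agreement.max(axis=1, keepdims=True)
    weights = np.exp(agreement)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights
```

A child whose vote aligns with one parent's state and not another's sends most of its weight to the aligned parent, which is the agreement behavior the summary describes.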
arXiv Detail & Related papers (2020-02-12T02:09:33Z)
- Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
arXiv Detail & Related papers (2020-02-07T17:51:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.