Capsules with Inverted Dot-Product Attention Routing
- URL: http://arxiv.org/abs/2002.04764v2
- Date: Wed, 26 Feb 2020 17:48:16 GMT
- Title: Capsules with Inverted Dot-Product Attention Routing
- Authors: Yao-Hung Hubert Tsai, Nitish Srivastava, Hanlin Goh, Ruslan
Salakhutdinov
- Abstract summary: We introduce a new routing algorithm for capsule networks, in which a child capsule is routed to a parent based only on agreement between the parent's state and the child's vote.
Our method improves performance on benchmark datasets such as CIFAR-10 and CIFAR-100.
We believe that our work raises the possibility of applying capsule networks to complex real-world tasks.
- Score: 84.89818784286953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new routing algorithm for capsule networks, in which a child
capsule is routed to a parent based only on agreement between the parent's
state and the child's vote. The new mechanism 1) designs routing via inverted
dot-product attention; 2) imposes Layer Normalization as normalization; and 3)
replaces sequential iterative routing with concurrent iterative routing. When
compared to previously proposed routing algorithms, our method improves
performance on benchmark datasets such as CIFAR-10 and CIFAR-100, and it
performs at-par with a powerful CNN (ResNet-18) with 4x fewer parameters. On a
different task of recognizing digits from overlayed digit images, the proposed
capsule model performs favorably against CNNs given the same number of layers
and neurons per layer. We believe that our work raises the possibility of
applying capsule networks to complex real-world tasks. Our code is publicly
available at: https://github.com/apple/ml-capsules-inverted-attention-routing
An alternative implementation is available at:
https://github.com/yaohungt/Capsules-Inverted-Attention-Routing/blob/master/README.md
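The routing mechanism described in the abstract can be sketched in code. The following is a minimal NumPy sketch based only on the abstract's description (agreement via a dot product between the parent's state and the child's vote, Layer Normalization of parent states, iterative updates); the function and variable names are ours, and details may differ from the official implementations linked above.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each capsule vector over its feature dimension.
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inverted_dot_product_routing(votes, n_iters=3):
    """Hypothetical sketch of the routing step from the abstract.

    votes: array of shape (n_child, n_parent, d), child i's vote for parent j
           (votes would normally come from learned transforms of child poses).
    Returns parent capsule states of shape (n_parent, d).
    """
    n_child, n_parent, d = votes.shape
    # Initialize parents from uniformly routed votes.
    parents = layer_norm(votes.mean(axis=0))
    for _ in range(n_iters):
        # Agreement: dot product between each parent's state and each child's vote.
        agreement = np.einsum('ijd,jd->ij', votes, parents)
        # "Inverted" attention: each child softmaxes its agreement over parents,
        # deciding where to send its vote (rather than parents attending to children).
        r = softmax(agreement, axis=1)
        # Update all parents concurrently, with Layer Normalization.
        parents = layer_norm(np.einsum('ij,ijd->jd', r, votes))
    return parents
```

In a full network this update would run concurrently across all capsule layers rather than layer by layer, which is the "concurrent iterative routing" the abstract contrasts with sequential routing.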
Related papers
- OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning [21.5857226735951]
Redundancy is a persistent challenge in Capsule Networks (CapsNet)
We propose an Orthogonal Capsule Network (OrthCaps) to reduce redundancy, improve routing performance and decrease parameter counts.
arXiv Detail & Related papers (2024-03-20T07:25:24Z)
- AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition [48.043749855085025]
Visual place recognition (VPR) is one of the research hotspots in robotics, which uses visual information to locate robots.
We present a unified network capable of extracting global features for retrieving candidates via an aggregation module.
We also propose a Semi-hard Positive Sample Mining (ShPSM) strategy to select appropriate hard positive images for training more robust VPR networks.
arXiv Detail & Related papers (2023-10-08T14:46:11Z)
- Learning Tracking Representations via Dual-Branch Fully Transformer Networks [82.21771581817937]
We present a Siamese-like Dual-branch network based on solely Transformers for tracking.
We extract a feature vector for each patch based on its matching results with others within an attention window.
The method achieves better or comparable results as the best-performing methods.
arXiv Detail & Related papers (2021-12-05T13:44:33Z)
- Training Deep Capsule Networks with Residual Connections [0.0]
Capsule networks are a type of neural network that has recently gained popularity.
They consist of groups of neurons, called capsules, which encode properties of objects or object parts.
Most capsule network implementations use only two to three capsule layers, which limits their applicability, since expressivity grows exponentially with depth.
We propose a methodology to train deeper capsule networks using residual connections, which is evaluated on four datasets and three different routing algorithms.
Our experimental results show that in fact, performance increases when training deeper capsule networks.
arXiv Detail & Related papers (2021-04-15T11:42:44Z)
- Routing Towards Discriminative Power of Class Capsules [7.347145775695176]
We propose a routing algorithm that incorporates a regularized quadratic programming problem which can be solved efficiently.
We conduct experiments on MNIST, MNIST-Fashion, and CIFAR-10 and show competitive classification results compared to existing capsule networks.
arXiv Detail & Related papers (2021-03-07T05:49:38Z)
- Sequential Routing Framework: Fully Capsule Network-based Speech Recognition [5.730259752695884]
This paper presents a sequential routing framework to adapt a CapsNet-only structure to sequence-to-sequence recognition.
It achieves a word error rate of 16.9% on the Wall Street Journal corpus, 1.1% lower than that of convolutional neural network-based CTC networks.
It also attains a phone error rate of 17.5%, 0.7% lower than the CTC baseline.
arXiv Detail & Related papers (2020-07-23T01:51:41Z)
- Wasserstein Routed Capsule Networks [90.16542156512405]
We propose a new parameter efficient capsule architecture, that is able to tackle complex tasks.
We show that our network is able to substantially outperform other capsule approaches by over 1.2% on CIFAR-10.
arXiv Detail & Related papers (2020-07-22T14:38:05Z)
- ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting [105.97936163854693]
We propose ResRep, which slims down a CNN by reducing the width (number of output channels) of convolutional layers.
Inspired by neurobiological research on the independence of remembering and forgetting, we propose to re-parameterize a CNN into remembering parts and forgetting parts.
We equivalently merge the remembering and forgetting parts into the original architecture with narrower layers.
arXiv Detail & Related papers (2020-07-07T07:56:45Z)
- Quaternion Equivariant Capsule Networks for 3D Point Clouds [58.566467950463306]
We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations.
We connect dynamic routing between capsules to the well-known Weiszfeld algorithm.
Based on our operator, we build a capsule network that disentangles geometry from pose.
arXiv Detail & Related papers (2019-12-27T13:51:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.