Hybrid Gromov-Wasserstein Embedding for Capsule Learning
- URL: http://arxiv.org/abs/2209.00232v2
- Date: Tue, 24 Oct 2023 11:13:37 GMT
- Title: Hybrid Gromov-Wasserstein Embedding for Capsule Learning
- Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Swagatam Das, Eric Granger,
Salvador Garcia
- Abstract summary: Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relations using a two-step process.
However, this hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNets despite their potential advantages.
We present an efficient approach for learning capsules that surpasses canonical baseline models and even demonstrates superior performance compared to high-performing convolution models.
- Score: 24.520120182880333
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Capsule networks (CapsNets) aim to parse images into a hierarchy of objects,
parts, and their relations using a two-step process involving part-whole
transformation and hierarchical component routing. However, this hierarchical
relationship modeling is computationally expensive, which has limited the wider
use of CapsNet despite its potential advantages. The current state of CapsNet
models primarily focuses on comparing their performance with capsule baselines,
falling short of achieving the same level of proficiency as deep CNN variants
in intricate tasks. To address this limitation, we present an efficient
approach for learning capsules that surpasses canonical baseline models and
even demonstrates superior performance compared to high-performing convolution
models. Our contribution can be outlined in two aspects: firstly, we introduce
a group of subcapsules onto which an input vector is projected. Subsequently,
we present the Hybrid Gromov-Wasserstein framework, which initially quantifies
the dissimilarity between the input and the components modeled by the
subcapsules, followed by determining their alignment degree through optimal
transport. This innovative mechanism capitalizes on new insights into defining
alignment between the input and subcapsules, based on the similarity of their
respective component distributions. This approach enhances CapsNets' capacity
to learn from intricate, high-dimensional data while retaining their
interpretability and hierarchical structure. Our proposed model offers two
distinct advantages: (i) its lightweight nature facilitates the application of
capsules to more intricate vision tasks, including object detection; (ii) it
outperforms baseline approaches in these demanding tasks.
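The abstract describes the mechanism only at a high level: project the input onto a group of subcapsules, then determine the alignment degree between the input's components and the subcapsules via optimal transport. As a rough, non-authoritative sketch of that kind of alignment step, the snippet below computes an entropic optimal-transport plan (plain Sinkhorn with a Euclidean cost, not the paper's full hybrid Gromov-Wasserstein objective) between an input's projected components and a set of subcapsule prototypes; all shapes, names, and hyperparameters are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.05, n_iters=200):
    """Entropic optimal transport (Sinkhorn iterations) between histograms a, b."""
    K = np.exp(-cost / eps)                    # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]         # transport plan (coupling matrix)

rng = np.random.default_rng(0)

# Illustrative shapes: an input split into 6 component vectors of dim 8,
# and 4 subcapsules, each summarised by a prototype vector of the same dim.
components  = rng.normal(size=(6, 8))
subcapsules = rng.normal(size=(4, 8))

# Squared-Euclidean cost between every component and every subcapsule,
# normalised so the entropic regulariser eps is on a comparable scale.
cost = ((components[:, None, :] - subcapsules[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()

a = np.full(6, 1 / 6)                          # uniform mass over input components
b = np.full(4, 1 / 4)                          # uniform mass over subcapsules
plan = sinkhorn(cost, a, b)

# Each row, renormalised, reads as soft alignment weights of one component
# over the subcapsules.
alignment = plan / plan.sum(axis=1, keepdims=True)
print(alignment.round(3))
```

In the paper's framework the resulting coupling would play the role of the alignment degree between input and subcapsules; here it is only a minimal stand-in under the stated assumptions.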
Related papers
- Hierarchical Object-Centric Learning with Capsule Networks [0.0]
Capsule networks (CapsNets) were introduced to address the limitations of convolutional neural networks.
This thesis investigates the intriguing aspects of CapsNets and focuses on three key questions to unlock their full potential.
arXiv Detail & Related papers (2024-05-30T09:10:33Z)
- ProtoCaps: A Fast and Non-Iterative Capsule Network Routing Method [6.028175460199198]
We introduce a novel, non-iterative routing mechanism for Capsule Networks.
We harness a shared Capsule subspace, negating the need to project each lower-level Capsule to each higher-level Capsule.
Our findings underscore the potential of our proposed methodology in enhancing the operational efficiency and performance of Capsule Networks.
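The summary only hints at the mechanism: one shared capsule subspace instead of per-pair projections, and a single non-iterative routing pass. A minimal sketch of that idea follows; the sizes and names (W_shared, prototypes) are assumptions, not the ProtoCaps implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 32 lower-level capsules of dim 8, 10 higher-level
# capsules, and a shared 16-dim subspace.
lower = rng.normal(size=(32, 8))
W_shared = rng.normal(size=(8, 16)) * 0.1      # one projection shared by all capsules
prototypes = rng.normal(size=(10, 16))         # one prototype per higher-level capsule

# Single shared projection instead of a per-(lower, higher) transformation matrix.
projected = lower @ W_shared                   # (32, 16)

# Non-iterative routing: one softmax over prototype similarities.
logits = projected @ prototypes.T              # (32, 10)
logits -= logits.max(axis=1, keepdims=True)
routing = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Higher-level capsules as routing-weighted averages of the projected capsules.
higher = (routing.T @ projected) / routing.sum(axis=0)[:, None]
print(higher.shape)                            # (10, 16)
```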
arXiv Detail & Related papers (2023-07-19T12:39:40Z)
- Routing with Self-Attention for Multimodal Capsule Networks [108.85007719132618]
We present a new multimodal capsule network that allows us to leverage the strength of capsules in the context of a multimodal learning framework.
To adapt the capsules to large-scale input data, we propose a novel routing by self-attention mechanism that selects relevant capsules.
This allows not only for robust training with noisy video data, but also to scale up the size of the capsule network compared to traditional routing methods.
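As a hedged illustration of routing by self-attention (not the paper's architecture), the snippet below lets higher-level capsule queries attend over lower-level capsule keys and values in a single pass; the dimensions, weight initialisations, and variable names are placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    return np.exp(x) / np.exp(x).sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 16
lower = rng.normal(size=(20, d))               # lower-level (e.g., per-modality) capsules
queries = rng.normal(size=(5, d))              # one learnable query per higher-level capsule

W_k = rng.normal(size=(d, d)) * 0.1
W_v = rng.normal(size=(d, d)) * 0.1

keys, values = lower @ W_k, lower @ W_v
attn = softmax(queries @ keys.T / np.sqrt(d))  # (5, 20) routing weights in one pass
higher = attn @ values                         # attention-weighted capsule aggregation
print(higher.shape)                            # (5, 16)
```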
arXiv Detail & Related papers (2021-12-01T19:01:26Z)
- Optimising for Interpretability: Convolutional Dynamic Alignment Networks [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets)
Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns.
CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
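A toy, single-unit sketch of the DAU idea as summarised here: compute a weight vector from the input, apply it linearly, and note that the output then decomposes exactly into per-dimension input contributions. The weight-generation rule and norm constraint are assumptions, not the CoDA-Net definition.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamic_alignment_unit(x, B, max_norm=1.0):
    """Toy DAU-style unit: the weight vector is computed FROM the input,
    then applied to it linearly, i.e. w(x)^T x. Illustrative only."""
    w = B @ x
    w = max_norm * w / (np.linalg.norm(w) + 1e-8)   # norm-constrained dynamic weights
    return w @ x, w

x = rng.normal(size=64)
B = rng.normal(size=(64, 64)) * 0.1
out, w = dynamic_alignment_unit(x, B)

# Because the mapping is linear in x once w is fixed, the output decomposes
# exactly into per-dimension input contributions.
contributions = w * x
print(np.allclose(out, contributions.sum()))   # True
```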
arXiv Detail & Related papers (2021-09-27T12:39:46Z)
- ASPCNet: A Deep Adaptive Spatial Pattern Capsule Network for Hyperspectral Image Classification [47.541691093680406]
This paper proposes an adaptive spatial pattern capsule network (ASPCNet) architecture.
It can rotate the sampling location of convolutional kernels on the basis of an enlarged receptive field.
Experiments on three public datasets demonstrate that ASPCNet can yield competitive performance with higher accuracies than state-of-the-art methods.
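The summary suggests the sampling locations of (dilated) convolution kernels are rotated adaptively. A simplified stand-in is shown below: rotate a fixed 3x3 sampling grid by a hand-picked angle and sample with nearest neighbours. ASPCNet itself would learn the adaptation and use proper interpolation; every name and constant here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotated_kernel_offsets(k=3, angle_deg=30.0, dilation=2):
    """Rotate the regular k x k sampling grid of a (dilated) convolution.
    A toy stand-in for adaptively rotated sampling locations."""
    r = np.arange(k) - k // 2
    ys, xs = np.meshgrid(r, r, indexing="ij")
    grid = np.stack([ys, xs], -1).reshape(-1, 2) * dilation   # enlarged receptive field
    t = np.deg2rad(angle_deg)
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return grid @ R.T                                         # rotated offsets

image = rng.normal(size=(32, 32))
center = np.array([16, 16])
offsets = rotated_kernel_offsets()

# Nearest-neighbour sampling at the rotated locations (bilinear in practice).
coords = np.rint(center + offsets).astype(int)
samples = image[coords[:, 0], coords[:, 1]]

kernel = rng.normal(size=9)
response = kernel @ samples        # one output value of the adaptive convolution
print(response)
```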
arXiv Detail & Related papers (2021-04-25T07:10:55Z)
- Deformable Capsules for Object Detection [3.702343116848637]
We introduce a new family of capsule networks, deformable capsules (DeformCaps), to address a very important problem in computer vision: object detection.
We demonstrate that the proposed methods efficiently scale up to create the first-ever capsule network for object detection in the literature.
arXiv Detail & Related papers (2021-04-11T15:36:30Z)
- CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection [91.91911418421086]
Co-Salient Object Detection (CoSOD) aims at discovering salient objects that repeatedly appear in a given query group containing two or more relevant images.
One challenging issue is how to effectively capture co-saliency cues by modeling and exploiting inter-image relationships.
We present an end-to-end collaborative aggregation-and-distribution network (CoADNet) to capture both salient and repetitive visual patterns from multiple images.
arXiv Detail & Related papers (2020-11-10T04:28:11Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- An Efficient Agreement Mechanism in CapsNets By Pairwise Product [13.247509552137487]
We propose a pairwise agreement mechanism to build capsules, inspired by the feature interactions of factorization machines (FMs).
We propose a new CapsNet architecture that combines the strengths of residual networks in representing low-level visual features and CapsNets in modeling the relationships of parts to wholes.
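A small illustration of the factorization-machine trick this summary alludes to: the sum of pairwise products of capsule votes can be computed in linear time rather than with an explicit double loop. The vote shapes and the way agreement would gate a capsule are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical votes from 8 lower-level capsules for one higher-level capsule.
votes = rng.normal(size=(8, 16))

# FM-style pairwise interaction, computed without an explicit double loop:
#   sum_{i<j} v_i * v_j = 0.5 * ((sum_i v_i)^2 - sum_i v_i^2)   (elementwise)
pairwise = 0.5 * (votes.sum(axis=0) ** 2 - (votes ** 2).sum(axis=0))

# Sanity check against the explicit O(n^2) computation.
explicit = sum(votes[i] * votes[j] for i in range(8) for j in range(i + 1, 8))
print(np.allclose(pairwise, explicit))   # True

# A large pairwise-agreement magnitude suggests the votes point the same way,
# which could be used to weight or activate the higher-level capsule.
print(np.linalg.norm(pairwise))
```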
arXiv Detail & Related papers (2020-04-01T08:09:23Z)
- Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
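One plausible reading of a subspace capsule, projecting a backbone feature onto a learned subspace and using the projection norm as an entity-presence score, is sketched below; this is an illustration under that assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def subspace_capsule(x, basis):
    """Project feature x onto the subspace spanned by `basis` (d x k).
    The projection is the capsule pose; its norm acts as a presence score."""
    # P = B (B^T B)^{-1} B^T, the orthogonal projector onto span(basis).
    P = basis @ np.linalg.solve(basis.T @ basis, basis.T)
    pose = P @ x
    return pose, np.linalg.norm(pose)

x = rng.normal(size=64)                       # backbone feature for one location
basis = rng.normal(size=(64, 8))              # learned 8-dim capsule subspace
pose, presence = subspace_capsule(x, basis)
print(pose.shape, round(presence, 3))         # (64,) and a scalar score
```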
arXiv Detail & Related papers (2020-02-07T17:51:56Z)