Vanishing Activations: A Symptom of Deep Capsule Networks
- URL: http://arxiv.org/abs/2305.11178v1
- Date: Sat, 13 May 2023 15:42:26 GMT
- Title: Vanishing Activations: A Symptom of Deep Capsule Networks
- Authors: Miles Everett, Mingjun Zhong and Georgios Leontidis
- Abstract summary: Capsule Networks are an extension to Neural Networks utilizing vector or matrix representations instead of scalars.
Early implementations of Capsule Networks achieved, and continue to maintain, state-of-the-art results on various datasets.
Recent studies have revealed shortcomings in the original Capsule Network architecture.
- Score: 10.046549855562123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capsule Networks, an extension to Neural Networks utilizing vector or matrix
representations instead of scalars, were initially developed to create a
dynamic parse tree where visual concepts evolve from parts to complete objects.
Early implementations of Capsule Networks achieved, and continue to maintain,
state-of-the-art results on various datasets. However, recent studies have
revealed shortcomings in the original Capsule Network architecture, notably its
failure to construct a parse tree and its susceptibility to vanishing gradients
when deployed in deeper networks. This paper extends the investigation to a
range of leading Capsule Network architectures, demonstrating that these issues
are not confined to the original design. We argue that the majority of Capsule
Network research has produced architectures that, while modestly divergent from
the original Capsule Network, still retain a fundamentally similar structure.
We posit that this inherent design similarity might be impeding the scalability
of Capsule Networks. Our study contributes to the broader discussion on
improving the robustness and scalability of Capsule Networks.
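To make the symptom concrete, the sketch below stacks toy capsule layers, each a linear map into the next layer's capsules followed by the standard squash nonlinearity, and prints the mean capsule norm at every depth. It is a minimal PyTorch illustration under assumed layer shapes, not the authors' experimental code. Because squash is contractive for small inputs (it maps a vector of norm r to norm r^2/(1+r^2)), the capsule norms shrink toward zero as depth grows, a toy version of the vanishing behaviour the paper investigates.

```python
import torch
import torch.nn as nn

def squash(x, dim=-1, eps=1e-8):
    # Squash nonlinearity from Sabour et al. (2017): preserves direction
    # and maps a vector of norm r to norm r**2 / (1 + r**2) < 1.
    sq = (x ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * x / torch.sqrt(sq + eps)

class ToyCapsuleLayer(nn.Module):
    # Hypothetical minimal capsule layer: one linear map into the next
    # layer's capsules, then squash. Real CapsNets add routing here.
    def __init__(self, in_caps, in_dim, out_caps, out_dim):
        super().__init__()
        self.W = nn.Parameter(0.05 * torch.randn(in_caps * in_dim, out_caps * out_dim))
        self.out_caps, self.out_dim = out_caps, out_dim

    def forward(self, x):                        # x: (batch, in_caps, in_dim)
        u = x.flatten(1) @ self.W                # mix all input capsules linearly
        return squash(u.view(-1, self.out_caps, self.out_dim))

layers = nn.Sequential(*[ToyCapsuleLayer(8, 16, 8, 16) for _ in range(12)])
x = squash(torch.randn(4, 8, 16))
for i, layer in enumerate(layers):
    x = layer(x)
    print(f"layer {i:2d}: mean capsule norm = {x.norm(dim=-1).mean().item():.4f}")
```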
Related papers
- Mitigating Vanishing Activations in Deep CapsNets Using Channel Pruning [0.0]
Capsule Networks outperform Convolutional Neural Networks at learning part-whole relationships with viewpoint invariance.
It was long assumed that increasing the number of capsule layers would enhance model performance.
Recent studies found instead that Capsule Networks lack scalability due to vanishing activations in the capsules of deeper layers; a hedged pruning sketch follows this list.
arXiv Detail & Related papers (2024-10-22T11:28:39Z)
- Hierarchical Object-Centric Learning with Capsule Networks [0.0]
Capsule networks (CapsNets) were introduced to address the limitations of convolutional neural networks.
This thesis investigates the intriguing aspects of CapsNets and focuses on three key questions to unlock their full potential.
arXiv Detail & Related papers (2024-05-30T09:10:33Z)
- Why Capsule Neural Networks Do Not Scale: Challenging the Dynamic Parse-Tree Assumption [16.223322939363033]
Capsule neural networks replace simple, scalar-valued neurons with vector-valued capsules.
CapsNet is the first actual implementation of the conceptual idea of capsule neural networks.
No work has been able to scale the CapsNet architecture to reasonably sized datasets.
arXiv Detail & Related papers (2023-01-04T12:59:51Z)
- Towards Efficient Capsule Networks [7.1577508803778045]
Capsule Networks were introduced to enhance the explainability of a model, where each capsule is an explicit representation of an object or its parts.
We show how pruning Capsule Networks achieves high generalization with lower memory requirements, less computational effort, and shorter inference and training times.
arXiv Detail & Related papers (2022-08-19T08:03:25Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures the information flowing across its layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains poorly understood; a sketch of measuring numerical rank follows this list.
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Learning with Capsules: A Survey [73.31150426300198]
Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations.
Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships; a minimal routing-by-agreement sketch follows this list.
arXiv Detail & Related papers (2022-06-06T15:05:36Z)
- HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network [76.92310948325847]
We propose a Hierarchical Parsing Capsule Network (HP-Capsule) for unsupervised face subpart-part discovery.
HP-Capsule extends the application of capsule networks from digits to human faces and takes a step toward showing how neural networks understand objects without human intervention.
arXiv Detail & Related papers (2022-03-21T01:39:41Z)
- Wasserstein Routed Capsule Networks [90.16542156512405]
We propose a new parameter-efficient capsule architecture that is able to tackle complex tasks.
We show that our network substantially outperforms other capsule approaches, by over 1.2% on CIFAR-10.
arXiv Detail & Related papers (2020-07-22T14:38:05Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e., adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Subspace Capsule Network [85.69796543499021]
The SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNNs at test time; a hedged subspace-projection sketch follows this list.
arXiv Detail & Related papers (2020-02-07T17:51:56Z)
- Examining the Benefits of Capsule Neural Networks [9.658250977094562]
Capsule networks are a newly developed class of neural networks that potentially address some of the deficiencies of traditional convolutional neural networks.
By replacing standard scalar activations with vectors, capsule networks aim to be the next major development for computer vision applications.
arXiv Detail & Related papers (2020-01-29T17:18:43Z)
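The sketches below illustrate ideas from the entries above. They are minimal, hedged reconstructions from the one-line summaries, not code from the cited papers. First, for the channel-pruning entry: one plausible way to prune capsule channels whose activations are vanishing is to score each channel by its mean activation norm over a batch and drop the weakest fraction. The scoring rule and prune ratio here are assumptions, not necessarily the paper's method.

```python
import torch

def prune_weak_capsule_channels(acts, prune_ratio=0.25):
    # acts: (batch, num_caps, caps_dim) capsule outputs from one layer.
    # Returns a (num_caps,) boolean keep-mask. Scoring channels by mean
    # activation norm is an assumption; the cited paper may differ.
    scores = acts.norm(dim=-1).mean(dim=0)         # (num_caps,)
    k = int(scores.numel() * prune_ratio)
    weakest = scores.argsort()[:k]                 # indices of weakest channels
    keep = torch.ones_like(scores, dtype=torch.bool)
    keep[weakest] = False
    return keep

acts = torch.randn(32, 16, 8)                      # hypothetical layer output
mask = prune_weak_capsule_channels(acts)
pruned = acts[:, mask, :]                          # drop the vanishing channels
print(f"kept {mask.sum().item()} of {mask.numel()} capsule channels")
```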
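For the rank-diminishing entry, the quantity of interest can be probed by estimating the numerical rank of a layer's feature matrix from its singular values. The relative tolerance below is a common convention, not necessarily the paper's; the ten-layer tanh stack is only a stand-in for a real network.

```python
import torch

def numerical_rank(features, rel_tol=1e-3):
    # features: (batch, dim) activations collected at one layer.
    # Counts singular values above rel_tol times the largest one.
    s = torch.linalg.svdvals(features)             # sorted descending
    return int((s > rel_tol * s[0]).sum())

x = torch.randn(256, 64)
deep_net = torch.nn.Sequential(*[torch.nn.Linear(64, 64) for _ in range(10)])
for i, layer in enumerate(deep_net):
    x = torch.tanh(layer(x))                       # contractive nonlinearity
    print(f"layer {i:2d}: numerical rank = {numerical_rank(x)}")
```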
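The survey and parse-tree entries both refer to routing between vector-valued capsules. This is a minimal sketch of dynamic routing-by-agreement in the style of Sabour et al. (2017): coupling coefficients are raised toward output capsules that agree with the incoming predictions, approximating a part-whole parse. Shapes and the iteration count are illustrative.

```python
import torch

def squash(x, dim=-1, eps=1e-8):
    sq = (x ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * x / torch.sqrt(sq + eps)

def dynamic_routing(u_hat, iters=3):
    # u_hat: (batch, in_caps, out_caps, out_dim) prediction vectors
    # from each input capsule for each output capsule.
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(iters):
        c = b.softmax(dim=2).unsqueeze(-1)        # couplings over output capsules
        v = squash((c * u_hat).sum(dim=1))        # (batch, out_caps, out_dim)
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)  # reward agreeing predictions
    return v

u_hat = torch.randn(2, 32, 10, 16)  # hypothetical: 32 part capsules, 10 wholes
print(dynamic_routing(u_hat).shape)  # torch.Size([2, 10, 16])
```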
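Finally, for the Subspace Capsule Network entry, the summary describes modeling an entity's variations with learned subspaces. A plausible minimal reading: project the input feature vector onto each capsule's learned subspace and use the projection's norm as that capsule's activation. Orthonormalizing the basis with QR is an implementation assumption, not a detail taken from the paper.

```python
import torch
import torch.nn as nn

class SubspaceCapsule(nn.Module):
    # Each capsule owns a learned k-dimensional subspace of the input space;
    # its activation is the norm of the input's projection onto that subspace.
    # Minimal sketch of the idea in the SCN summary, not the authors' code.
    def __init__(self, in_dim, num_caps, sub_dim):
        super().__init__()
        self.basis = nn.Parameter(torch.randn(num_caps, in_dim, sub_dim))

    def forward(self, x):                          # x: (batch, in_dim)
        q, _ = torch.linalg.qr(self.basis)         # orthonormal bases per capsule
        coords = torch.einsum('bd,cdk->bck', x, q) # subspace coordinates
        return coords.norm(dim=-1)                 # (batch, num_caps) activations

caps = SubspaceCapsule(in_dim=64, num_caps=10, sub_dim=4)
print(caps(torch.randn(8, 64)).shape)              # torch.Size([8, 10])
```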
This list is automatically generated from the titles and abstracts of the papers on this site.