Deformable Capsules for Object Detection
- URL: http://arxiv.org/abs/2104.05031v3
- Date: Sun, 28 Jul 2024 00:22:16 GMT
- Title: Deformable Capsules for Object Detection
- Authors: Rodney Lalonde, Naji Khosravan, Ulas Bagci
- Abstract summary: We introduce a new family of capsule networks, deformable capsules (DeformCaps), to address a very important problem in computer vision: object detection.
We demonstrate that the proposed methods efficiently scale up to create the first-ever capsule network for object detection in the literature.
- Score: 3.702343116848637
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Capsule networks promise significant benefits over convolutional networks by storing stronger internal representations and routing information based on the agreement between intermediate representations' projections. Despite this, their success has been limited to small-scale classification datasets due to their computationally expensive nature. Though memory efficient, convolutional capsules impose geometric constraints that fundamentally limit the ability of capsules to model the pose/deformation of objects. Further, they do not address the bigger memory concern of class-capsules scaling up to bigger tasks such as detection or large-scale classification. In this study, we introduce a new family of capsule networks, deformable capsules (DeformCaps), to address a very important problem in computer vision: object detection. We propose two new algorithms associated with our DeformCaps: a novel capsule structure (SplitCaps) and a novel dynamic routing algorithm (SE-Routing), which balance computational efficiency with the need to model a large number of objects and classes, something never before achieved with capsule networks. We demonstrate that the proposed methods efficiently scale up to create the first-ever capsule network for object detection in the literature. Our proposed architecture is a one-stage detection framework that obtains results on MS COCO that are on par with state-of-the-art one-stage CNN-based methods, while producing fewer false positive detections and generalizing to unusual poses/viewpoints of objects.
Related papers
- A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets.
We train a small parametric corrector network that adjusts stale cached target embeddings.
Our approach matches state-of-the-art results even when no target embedding updates are made during training.
arXiv Detail & Related papers (2024-09-03T13:29:13Z)
- Hierarchical Object-Centric Learning with Capsule Networks [0.0]
Capsule networks (CapsNets) were introduced to address the limitations of convolutional neural networks.
This thesis investigates the intriguing aspects of CapsNets and focuses on three key questions to unlock their full potential.
arXiv Detail & Related papers (2024-05-30T09:10:33Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Affordance detection with Dynamic-Tree Capsule Networks [5.847547503155588]
Affordance detection from visual input is a fundamental step in autonomous robotic manipulation.
We introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds.
Our algorithm is superior to current affordance detection methods when faced with grasping previously unseen objects.
arXiv Detail & Related papers (2022-11-09T21:14:08Z)
- Hybrid Gromov-Wasserstein Embedding for Capsule Learning [24.520120182880333]
Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relations using a two-step process.
Hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNets despite their potential advantages.
We present an efficient approach for learning capsules that surpasses canonical baseline models and even demonstrates superior performance compared to high-performing convolution models.
arXiv Detail & Related papers (2022-09-01T05:26:32Z)
- Towards Efficient Capsule Networks [7.1577508803778045]
Capsule Networks were introduced to enhance explainability of a model, where each capsule is an explicit representation of an object or its parts.
We show how pruning a Capsule Network achieves high generalization with lower memory requirements, less computational effort, and shorter inference and training times.
arXiv Detail & Related papers (2022-08-19T08:03:25Z)
- Learning with Capsules: A Survey [73.31150426300198]
Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations.
Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships.
arXiv Detail & Related papers (2022-06-06T15:05:36Z)
- Efficient-CapsNet: Capsule Network with Self-Attention Routing [0.0]
Deep convolutional neural networks make extensive use of data augmentation techniques and layers with a high number of feature maps to embed object transformations.
Capsule networks are a promising solution to extend current convolutional networks and endow artificial visual perception with a process to encode all feature affine transformations more efficiently.
In this paper, we investigate the efficiency of capsule networks and, pushing their capacity to the limits with an extreme architecture with barely 160K parameters, we prove that the proposed architecture is still able to achieve state-of-the-art results.
arXiv Detail & Related papers (2021-01-29T09:56:44Z)
- Wasserstein Routed Capsule Networks [90.16542156512405]
We propose a new parameter-efficient capsule architecture that is able to tackle complex tasks.
We show that our network is able to substantially outperform other capsule approaches by over 1.2% on CIFAR-10.
arXiv Detail & Related papers (2020-07-22T14:38:05Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
arXiv Detail & Related papers (2020-02-07T17:51:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.