Deformable Capsules for Object Detection
- URL: http://arxiv.org/abs/2104.05031v1
- Date: Sun, 11 Apr 2021 15:36:30 GMT
- Title: Deformable Capsules for Object Detection
- Authors: Rodney LaLonde, Naji Khosravan, Ulas Bagci
- Abstract summary: We introduce deformable capsules (DeformCaps), a new capsule structure (SplitCaps), and a novel dynamic routing algorithm (SE-Routing) to balance computational efficiency with the need for modeling a large number of objects and classes.
Our proposed architecture is a one-stage detection framework and obtains results on MS COCO which are on-par with state-of-the-art one-stage CNN-based methods.
- Score: 5.819237403145079
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Capsule networks promise significant benefits over convolutional networks by
storing stronger internal representations, and routing information based on the
agreement between intermediate representations' projections. Despite this,
their success has been mostly limited to small-scale classification datasets
due to their computationally expensive nature. Recent studies have partially
overcome this burden by locally-constraining the dynamic routing of features
with convolutional capsules. Though memory efficient, convolutional capsules
impose geometric constraints which fundamentally limit the ability of capsules
to model the pose/deformation of objects. Further, they do not address the
bigger memory concern of class-capsules scaling up to bigger tasks such as
detection or large-scale classification. In this study, we introduce deformable
capsules (DeformCaps), a new capsule structure (SplitCaps), and a novel dynamic
routing algorithm (SE-Routing) to balance computational efficiency with the
need for modeling a large number of objects and classes. We demonstrate that
the proposed methods allow capsules to efficiently scale up to large-scale
computer vision tasks for the first time, and create the first-ever capsule
network for object detection in the literature. Our proposed architecture is a
one-stage detection framework and obtains results on MS COCO which are on-par
with state-of-the-art one-stage CNN-based methods, while producing fewer false
positive detections.
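The abstract describes routing information based on the agreement between intermediate representations' projections. As a point of reference only, the sketch below shows the canonical routing-by-agreement loop from the original dynamic-routing formulation of capsule networks, not the paper's SE-Routing or SplitCaps; the shapes, iteration count, and `squash` non-linearity are assumptions for illustration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squashing non-linearity: preserves direction, maps the norm into [0, 1).
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement over prediction vectors.

    u_hat: (num_in, num_out, dim) array of "votes" from lower-level capsules
           for each higher-level capsule.
    Returns a (num_out, dim) array of output capsule vectors.
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules per input capsule.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of votes, then squash to get candidate outputs.
        s = (c[..., None] * u_hat).sum(axis=0)
        v = squash(s)
        # Raise the logit where a vote agrees (dot product) with the output.
        b = b + np.einsum('iod,od->io', u_hat, v)
    return v

rng = np.random.default_rng(0)
votes = rng.normal(size=(8, 4, 16))   # 8 input capsules, 4 outputs, dim 16
out = dynamic_routing(votes)
```

The memory cost this loop incurs per (input capsule, output class) pair is exactly what the abstract identifies as the obstacle to scaling class-capsules to detection-sized tasks.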
Related papers
- A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets.
We train a small parametric corrector network that adjusts stale cached target embeddings.
Our approach matches state-of-the-art results even when no target embedding updates are made during training.
arXiv Detail & Related papers (2024-09-03T13:29:13Z) - Hierarchical Object-Centric Learning with Capsule Networks [0.0]
Capsule networks (CapsNets) were introduced to address the limitations of convolutional neural networks.
This thesis investigates the intriguing aspects of CapsNets and focuses on three key questions to unlock their full potential.
arXiv Detail & Related papers (2024-05-30T09:10:33Z) - Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Affordance detection with Dynamic-Tree Capsule Networks [5.847547503155588]
Affordance detection from visual input is a fundamental step in autonomous robotic manipulation.
We introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds.
Our algorithm is superior to current affordance detection methods when faced with grasping previously unseen objects.
arXiv Detail & Related papers (2022-11-09T21:14:08Z) - Hybrid Gromov-Wasserstein Embedding for Capsule Learning [24.520120182880333]
Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relations using a two-step process.
However, hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNets despite their potential advantages.
We present an efficient approach for learning capsules that surpasses canonical baseline models and even demonstrates superior performance compared to high-performing convolution models.
arXiv Detail & Related papers (2022-09-01T05:26:32Z) - Towards Efficient Capsule Networks [7.1577508803778045]
Capsule Networks were introduced to enhance the explainability of a model, where each capsule is an explicit representation of an object or its parts.
We show how pruning Capsule Networks achieves high generalization with lower memory requirements, reduced computational effort, and shorter inference and training times.
arXiv Detail & Related papers (2022-08-19T08:03:25Z) - Learning with Capsules: A Survey [73.31150426300198]
Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations.
Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships.
arXiv Detail & Related papers (2022-06-06T15:05:36Z) - Efficient-CapsNet: Capsule Network with Self-Attention Routing [0.0]
Deep convolutional neural networks make extensive use of data augmentation techniques and layers with a high number of feature maps to embed object transformations.
Capsule networks are a promising solution to extend current convolutional networks and endow artificial visual perception with a process to encode all feature affine transformations more efficiently.
In this paper, we investigate the efficiency of capsule networks and, pushing their capacity to the limits with an extreme architecture with barely 160K parameters, we prove that the proposed architecture is still able to achieve state-of-the-art results.
arXiv Detail & Related papers (2021-01-29T09:56:44Z) - Wasserstein Routed Capsule Networks [90.16542156512405]
We propose a new parameter efficient capsule architecture, that is able to tackle complex tasks.
We show that our network is able to substantially outperform other capsule approaches by over 1.2% on CIFAR-10.
arXiv Detail & Related papers (2020-07-22T14:38:05Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z) - Subspace Capsule Network [85.69796543499021]
SubSpace Capsule Network (SCN) exploits the idea of capsule networks to model possible variations in the appearance or implicitly defined properties of an entity.
SCN can be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time.
arXiv Detail & Related papers (2020-02-07T17:51:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.