Iterative collaborative routing among equivariant capsules for
transformation-robust capsule networks
- URL: http://arxiv.org/abs/2210.11095v1
- Date: Thu, 20 Oct 2022 08:47:18 GMT
- Title: Iterative collaborative routing among equivariant capsules for
transformation-robust capsule networks
- Authors: Sai Raam Venkataraman, S. Balasubramanian, R. Raghunatha Sarma
- Abstract summary: We propose a capsule network model that is equivariant and compositionality-aware.
The awareness of compositionality comes from the use of our proposed novel, iterative, graph-based routing algorithm.
Experiments on transformed image classification on FashionMNIST, CIFAR-10, and CIFAR-100 show that our ICR-based model outperforms convolutional and capsule baselines, achieving state-of-the-art performance.
- Score: 6.445605125467574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformation-robustness is an important feature for machine learning models
that perform image classification. Many methods aim to bestow this property on
models through data augmentation strategies, while more formal guarantees are
obtained via the use of equivariant models. We recognise that compositional, or
part-whole, structure is also an important aspect of images that has to be
considered for building transformation-robust models. Thus, we
propose a capsule network model that is, at once, equivariant and
compositionality-aware. Equivariance of our capsule network model comes from
the use of equivariant convolutions in a carefully-chosen novel architecture.
The awareness of compositionality comes from the use of our proposed novel,
iterative, graph-based routing algorithm, termed Iterative collaborative
routing (ICR). ICR, the core of our contribution, weights each prediction made
for a capsule based on an iteratively averaged score of the degree-centralities
of its nearest neighbours. Experiments on transformed image classification on
FashionMNIST, CIFAR-10, and CIFAR-100 show that our ICR-based model outperforms
convolutional and capsule baselines, achieving state-of-the-art performance.
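The abstract describes ICR only verbally. The sketch below is one plausible reading of that description, not the authors' implementation: for the votes cast for a single higher-level capsule, it builds a k-nearest-neighbour graph over the votes, scores each vote by its degree centrality, iteratively averages the scores over the graph, and normalises them into routing weights. The function name icr_weights and the parameters k and n_iters are illustrative assumptions.

```python
import numpy as np

def icr_weights(votes, k=3, n_iters=3):
    """Hypothetical degree-centrality routing weights (not the authors' code).

    votes : (N, D) array of predictions (votes) from N lower-level capsules
            for a single higher-level capsule.
    Returns an (N,) array of normalised routing weights.
    """
    n = votes.shape[0]
    # Pairwise distances between votes; a vote is not its own neighbour.
    dist = np.linalg.norm(votes[:, None, :] - votes[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    # Symmetric k-nearest-neighbour graph over the votes.
    nn = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)
    deg = adj.sum(axis=1)
    # Degree centrality is the initial agreement score of each vote.
    scores = deg / (n - 1)
    # Iteratively average each vote's score with its neighbours' scores.
    for _ in range(n_iters):
        neigh_mean = adj @ scores / np.maximum(deg, 1.0)
        scores = 0.5 * (scores + neigh_mean)
    return scores / scores.sum()

# Toy usage: combine the votes into a higher-level capsule pose.
rng = np.random.default_rng(0)
votes = rng.normal(size=(8, 16))          # 8 votes of dimension 16
w = icr_weights(votes)                    # (8,) routing weights
pose = (w[:, None] * votes).sum(axis=0)   # weighted combination of votes
```

Votes that sit in dense, mutually agreeing clusters receive high centrality and therefore dominate the weighted combination, which is the intuition behind agreement-based routing.

The equivariance half of the model rests on equivariant convolutions, which the abstract does not detail. The following generic sketch of a C4 (90-degree rotation) lifting convolution, in the spirit of group-equivariant CNNs, illustrates the property being relied on; it is a textbook-style example and makes no assumption about the paper's actual architecture.

```python
import numpy as np
from scipy.signal import correlate2d

def c4_lifting_conv(image, kernel):
    """Toy lifting convolution over the C4 group (rotations by 90 degrees).

    Correlates the image with all four rotations of one kernel and stacks
    the responses along a group axis, giving a (4, H', W') feature map.
    """
    return np.stack(
        [correlate2d(image, np.rot90(kernel, r), mode="valid") for r in range(4)]
    )

# Equivariance check: rotating the input rotates every response map and
# cyclically shifts the group axis, so no information is lost under rotation.
rng = np.random.default_rng(0)
img, ker = rng.normal(size=(8, 8)), rng.normal(size=(3, 3))
lhs = c4_lifting_conv(np.rot90(img), ker)
rhs = np.stack([np.rot90(m) for m in np.roll(c4_lifting_conv(img, ker), 1, axis=0)])
assert np.allclose(lhs, rhs)
```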
Related papers
- Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation [36.45488536471859]
Image-image similarity is refined by using unlabeled images.
A precision matrix is introduced into the weight function to adequately model the relations between training samples.
To reduce the high complexity of GPs, we propose a group-based learning strategy.
arXiv Detail & Related papers (2024-10-11T15:12:30Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z) - Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, the Binarized Spectral-Redistribution Network (BiSRNet).
BiSRNet is derived by using the proposed techniques to binarize the base model.
arXiv Detail & Related papers (2023-05-17T15:36:08Z) - Robustcaps: a transformation-robust capsule network for image classification [6.445605125467574]
We present a deep neural network model that exhibits the desirable property of transformation-robustness.
Our model, termed RobustCaps, uses group-equivariant convolutions in an improved capsule network model.
It achieves state-of-the-art accuracies on CIFAR-10, FashionMNIST, and CIFAR-100 datasets.
arXiv Detail & Related papers (2022-10-20T08:42:33Z) - Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression [5.152019611975467]
We propose a transformer-based context model, a.k.a. Contextformer.
We replace the context model of a modern compression framework with the Contextformer and test it on the widely used Kodak image dataset.
Our experimental results show that the proposed model provides up to 10% rate savings compared to the standard Versatile Video Coding (VVC) Test Model (VTM) 9.1.
arXiv Detail & Related papers (2022-03-04T17:29:32Z) - CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z) - Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps, as well as multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z) - FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to the current state of the art on smaller datasets while being able to scale up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z) - Group Equivariant Generative Adversarial Networks [7.734726150561089]
In this work, we explicitly incorporate inductive symmetry priors into the network architectures via group-equivariant convolutional networks.
Group-equivariant convolutions have higher expressive power with fewer samples and lead to better gradient feedback between the generator and the discriminator.
arXiv Detail & Related papers (2020-05-04T17:38:49Z) - Toward a Controllable Disentanglement Network [22.968760397814993]
This paper addresses two crucial problems of learning disentangled image representations, namely controlling the degree of disentanglement during image editing, and balancing the disentanglement strength and the reconstruction quality.
By exploring the real-valued space of the soft target representation, we are able to synthesize novel images with the designated properties.
arXiv Detail & Related papers (2020-01-22T16:54:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.