RobustCaps: a transformation-robust capsule network for image
classification
- URL: http://arxiv.org/abs/2210.11092v1
- Date: Thu, 20 Oct 2022 08:42:33 GMT
- Title: RobustCaps: a transformation-robust capsule network for image
classification
- Authors: Sai Raam Venkataraman, S. Balasubramanian, R. Raghunatha Sarma
- Abstract summary: We present a deep neural network model that exhibits the desirable property of transformation-robustness.
Our model, termed RobustCaps, uses group-equivariant convolutions in an improved capsule network model.
It achieves state-of-the-art accuracies on CIFAR-10, FashionMNIST, and CIFAR-100 datasets.
- Score: 6.445605125467574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Geometric transformations of both the training data and the test data
present challenges to the use of deep neural networks for vision-based learning
tasks. To address this issue, we present a deep neural network model
that exhibits the desirable property of transformation-robustness. Our model,
termed RobustCaps, uses group-equivariant convolutions in an improved capsule
network model. RobustCaps uses a global context-normalised procedure in its
routing algorithm to learn transformation-invariant part-whole relationships
within image data. Learning such relationships allows our model to
outperform both capsule and convolutional neural network baselines on
transformation-robust classification tasks. Specifically, RobustCaps achieves
state-of-the-art accuracies on CIFAR-10, FashionMNIST, and CIFAR-100 when the
images in these datasets are subjected to train and test-time rotations and
translations.
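As an illustration of the group-equivariant convolutions that RobustCaps builds on, the sketch below implements a p4 (four-fold rotation) lifting convolution in plain NumPy by correlating an image with all four 90-degree rotations of a filter. The function names and the minimal "valid" correlation are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' cross-correlation, used as the base convolution."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def p4_lifting_conv(image, kernel):
    """Lift a 2D image to a function on the rotation group p4 by
    correlating with all four 90-degree rotations of the kernel."""
    return np.stack([correlate2d_valid(image, np.rot90(kernel, k))
                     for k in range(4)])
```

Rotating the input by 90 degrees rotates each output map and cyclically shifts the rotation channel; this equivariance is what keeps downstream capsule poses consistent under input transformations.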
Related papers
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
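As a rough sketch of how a CWT turns a 1D signal into a 2D time-frequency tensor (one row per scale), the snippet below convolves the signal with Ricker wavelets of several widths. The wavelet choice and function names are assumptions for illustration, not TCCT-Net's actual configuration.

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet of width a, sampled at `points` positions."""
    t = np.arange(points) - (points - 1) / 2
    A = 2 / (np.sqrt(3 * a) * np.pi ** 0.25)
    return A * (1 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt_tensor(signal, widths, points=64):
    """Continuous wavelet transform: one row per scale gives a 2D tensor
    that a downstream 2D network can consume like an image."""
    return np.stack([np.convolve(signal, ricker(points, w), mode='same')
                     for w in widths])
```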
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Convolutional Neural Generative Coding: Scaling Predictive Coding to
Natural Images [79.07468367923619]
We develop convolutional neural generative coding (Conv-NGC).
We implement a flexible neurobiologically-motivated algorithm that progressively refines latent state maps.
We study the effectiveness of our brain-inspired neural system on the tasks of reconstruction and image denoising.
arXiv Detail & Related papers (2022-11-22T06:42:41Z) - Iterative collaborative routing among equivariant capsules for
transformation-robust capsule networks [6.445605125467574]
We propose a capsule network model that is equivariant and compositionality-aware.
Compositionality-awareness comes from our proposed iterative, graph-based routing algorithm, termed iterative collaborative routing (ICR).
Experiments on transformed-image classification on FashionMNIST, CIFAR-10, and CIFAR-100 show that our ICR-based model outperforms both convolutional and capsule baselines, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-10-20T08:47:18Z) - Convolutional Analysis Operator Learning by End-To-End Training of
Iterative Neural Networks [3.6280929178575994]
We show how convolutional sparsifying filters can be efficiently learned by end-to-end training of iterative neural networks.
We evaluate our approach on a non-Cartesian 2D cardiac cine MRI example and show that the learned filters are better suited to the corresponding reconstruction algorithm than those obtained by decoupled pre-training.
arXiv Detail & Related papers (2022-03-04T07:32:16Z) - Feature-level augmentation to improve robustness of deep neural networks
to affine transformations [22.323625542814284]
Recent studies revealed that convolutional neural networks do not generalize well to small image transformations.
We propose to introduce data augmentation at intermediate layers of the neural architecture.
This develops the capacity of the neural network to cope with such transformations.
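A minimal sketch of the idea, assuming translation as the affine transformation: apply a small random shift directly to an intermediate feature map rather than to the input image. The circular shift via np.roll is a simplification of a true affine warp, and the helper name is illustrative.

```python
import numpy as np

def translate_features(feats, max_shift=2, rng=None):
    """Feature-level augmentation: randomly translate each feature map
    (..., H, W) by a few pixels, standing in for a small affine transform."""
    rng = rng or np.random.default_rng(0)
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Circular shift along the spatial axes; a real implementation would
    # pad and crop (or bilinearly warp) instead of wrapping around.
    return np.roll(feats, (dy, dx), axis=(-2, -1))
```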
arXiv Detail & Related papers (2022-02-10T17:14:58Z) - TransformNet: Self-supervised representation learning through predicting
geometric transformations [0.8098097078441623]
We describe an unsupervised semantic feature learning approach for recognising the geometric transformation applied to the input data.
The basic idea is that someone unaware of the objects in an image cannot quantitatively predict the geometric transformation that was applied to it.
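The pretext task can be sketched as follows: each image is rotated by 0, 90, 180 and 270 degrees, and the rotation index serves as a free label for the network to predict. This is a common rotation-prediction setup; the helper name is illustrative, not from the paper.

```python
import numpy as np

def make_rotation_batch(images):
    """Given a batch of images (N, H, W), return the four 90-degree
    rotations of each image plus the rotation index as the pretext label."""
    xs, ys = [], []
    for img in images:
        for k in range(4):
            xs.append(np.rot90(img, k))
            ys.append(k)
    return np.stack(xs), np.array(ys)
```

A classifier trained on (xs, ys) must learn object-aware features to tell the rotations apart, which is the self-supervision signal.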
arXiv Detail & Related papers (2022-02-08T22:41:01Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set size and the model size significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z) - Probabilistic Spatial Transformer Networks [0.6999740786886537]
We propose a probabilistic extension that estimates a distribution over transformations rather than a single deterministic transformation.
We show that these two properties lead to improved classification performance, robustness and model calibration.
We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.
arXiv Detail & Related papers (2020-04-07T18:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.