Capsule Network based Contrastive Learning of Unsupervised Visual
Representations
- URL: http://arxiv.org/abs/2209.11276v1
- Date: Thu, 22 Sep 2022 19:05:27 GMT
- Title: Capsule Network based Contrastive Learning of Unsupervised Visual
Representations
- Authors: Harsh Panwar, Ioannis Patras
- Abstract summary: The Contrastive Capsule (CoCa) Model is a Siamese-style Capsule Network that uses a contrastive loss together with our novel architecture and training and testing algorithms.
We evaluate the model on unsupervised image classification on the CIFAR-10 dataset, achieving a top-1 test accuracy of 70.50% and a top-5 test accuracy of 98.10%.
Due to its efficient architecture, the model has 31 times fewer parameters and 71 times fewer FLOPs than the current SOTA in both supervised and unsupervised learning.
- Score: 13.592112044121683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Capsule Networks have advanced considerably over the past decade,
outperforming traditional CNNs on various tasks thanks to their equivariant
properties. With the use of vector I/O, which carries information about both
the magnitude and direction of an object or its parts, there is great
potential for using Capsule Networks in unsupervised learning environments
for visual representation tasks such as multi-class image classification. In
this paper, we propose the Contrastive Capsule (CoCa) Model, a Siamese-style
Capsule Network that uses a contrastive loss together with our novel
architecture and training and testing algorithms. We evaluate the model on
unsupervised image classification on the CIFAR-10 dataset and achieve a top-1
test accuracy of 70.50% and a top-5 test accuracy of 98.10%. Due to our
efficient architecture, our model has 31 times fewer parameters and 71 times
fewer FLOPs than the current SOTA in both supervised and unsupervised
learning.
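The abstract gives the idea but not the implementation details. The sketch below is a minimal, hedged illustration of the general recipe it describes: a Siamese setup in which two augmented views of an image pass through the same encoder, and a contrastive (NT-Xent style) loss pulls the two embeddings together. The small CNN backbone, the projection head, and all hyperparameters are assumptions for illustration only; the paper itself uses a capsule encoder and its own training and testing algorithms.

```python
# Minimal sketch of a Siamese-style encoder trained with a contrastive
# (NT-Xent) loss. The plain CNN backbone is a stand-in for the capsule
# encoder; sizes and hyperparameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, feature_dim: int = 128):
        super().__init__()
        # Stand-in CNN backbone; the paper places a capsule network here.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Projection head into the contrastive embedding space.
        self.projector = nn.Sequential(
            nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, feature_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.projector(self.backbone(x)), dim=1)

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor,
                 temperature: float = 0.5) -> torch.Tensor:
    """Normalised-temperature cross-entropy over two batches of embeddings."""
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                   # (2n, d)
    sim = (z @ z.t()) / temperature                  # cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, -1e9)                # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)             # positives act as labels

# Usage: both augmented views go through the *same* weights.
encoder = SiameseEncoder()
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
loss = nt_xent_loss(encoder(view1), encoder(view2))
loss.backward()
```

Weight sharing between the two views is what makes the setup Siamese: there is a single encoder, applied twice per image.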
Related papers
- Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification [0.0]
We first pre-train the model with self-supervision to enable it to learn common feature expressions on a large amount of unlabeled data.
We then fine-tune it on the few-shot dataset Mini-ImageNet to improve the model's accuracy and generalization ability under limited data.
arXiv Detail & Related papers (2024-11-19T01:01:56Z)
- Masked Capsule Autoencoders [5.363623643280699]
We propose Masked Capsule Autoencoders (MCAE), the first Capsule Network that utilises pretraining in a self-supervised manner.
Our proposed MCAE model alleviates this issue by reformulating the Capsule Network to use masked image modelling as a pretraining stage.
We demonstrate that, similarly to CNNs and ViTs, Capsule Networks can also benefit from self-supervised pretraining (a generic masked-image-modelling sketch appears after this list).
arXiv Detail & Related papers (2024-03-07T18:22:03Z)
- SeiT++: Masked Token Modeling Improves Storage-efficient Training [36.95646819348317]
Recent advancements in Deep Neural Network (DNN) models have significantly improved performance across computer vision tasks.
However, achieving highly generalizable and high-performing vision models requires expansive datasets, resulting in significant storage requirements.
A recent breakthrough, SeiT, proposed the use of Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.
In this paper, we extend SeiT by integrating Masked Token Modeling (MTM) for self-supervised pre-training.
arXiv Detail & Related papers (2023-12-15T04:11:34Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- Large Neural Networks Learning from Scratch with Very Few Data and without Regularization [0.0]
We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples.
VGG19, with 140 million weights, learns to distinguish airplanes and motorbikes with up to 95% accuracy using only 20 samples per class.
arXiv Detail & Related papers (2022-05-18T10:08:28Z)
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL) to tackle this issue.
WCL achieves 65% and 72% ImageNet Top-1 Accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- A Simple Framework for Contrastive Learning of Visual Representations [116.37752766922407]
This paper presents SimCLR: a simple framework for contrastive learning of visual representations.
We show that the composition of data augmentations plays a critical role in defining effective predictive tasks (an illustrative augmentation pipeline appears after this list).
We are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet.
arXiv Detail & Related papers (2020-02-13T18:50:45Z)
- Identifying and Compensating for Feature Deviation in Imbalanced Deep Learning [59.65752299209042]
We investigate learning a ConvNet under such an imbalanced, long-tailed scenario.
We find that the ConvNet significantly over-fits the minority classes.
We propose to incorporate class-dependent temperatures (CDT) when training the ConvNet (an illustrative CDT loss appears after this list).
arXiv Detail & Related papers (2020-01-06T03:52:11Z)
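To accompany the Masked Capsule Autoencoders entry above, here is a generic masked-image-modelling pretraining step of the kind that entry refers to: random patches of the input are hidden and the network is trained to reconstruct them. The patch size, masking ratio, and the tiny convolutional autoencoder are illustrative assumptions, not the MCAE architecture, which applies the same idea with a capsule encoder and decoder.

```python
# Generic masked-image-modelling pretraining step (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_patch_mask(images: torch.Tensor, patch: int = 8,
                      ratio: float = 0.75) -> torch.Tensor:
    """Return a (B, 1, H, W) mask where 1 marks a hidden patch."""
    b, _, h, w = images.shape
    hidden = torch.rand(b, 1, h // patch, w // patch, device=images.device) < ratio
    return hidden.float().repeat_interleave(patch, dim=2).repeat_interleave(patch, dim=3)

class TinyAutoencoder(nn.Module):
    """Plain convolutional autoencoder standing in for the capsule encoder/decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# One pretraining step: hide patches, reconstruct only the hidden regions.
model = TinyAutoencoder()
images = torch.randn(8, 3, 32, 32)
mask = random_patch_mask(images)                 # 1 = hidden, 0 = visible
recon = model(images * (1.0 - mask))             # model sees visible pixels only
loss = F.mse_loss(recon * mask, images * mask)   # loss on hidden patches only
loss.backward()
```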
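The SimCLR entry stresses that the composition of data augmentations defines the contrastive prediction task. A typical two-view pipeline (random resized crop, horizontal flip, colour jitter, greyscale) can be written with torchvision transforms as below; the exact augmentation strengths are assumptions for illustration.

```python
# Illustrative two-view augmentation pipeline of the kind used in
# contrastive frameworks such as SimCLR (parameters are assumptions).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.2, 1.0)),   # crop and resize
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

def two_views(pil_image):
    """Produce the two correlated views that form a positive pair."""
    return augment(pil_image), augment(pil_image)
```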
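Finally, the class-dependent temperatures (CDT) entry amounts to dividing each class logit by a per-class temperature before the cross-entropy loss, so that frequent and rare classes are treated differently. The summary does not give the temperature schedule; the frequency-based rule below is an assumption for illustration only.

```python
# Illustrative class-dependent temperature (CDT) cross-entropy: each class
# logit is divided by its own temperature before the softmax. The
# frequency-based schedule below is an assumption, not necessarily the
# paper's exact formulation.
import torch
import torch.nn.functional as F

def class_temperatures(class_counts: torch.Tensor, gamma: float = 0.3) -> torch.Tensor:
    """Larger temperatures for rarer classes: a_j = (n_max / n_j) ** gamma."""
    return (class_counts.max() / class_counts.float()) ** gamma

def cdt_cross_entropy(logits: torch.Tensor, targets: torch.Tensor,
                      temperatures: torch.Tensor) -> torch.Tensor:
    return F.cross_entropy(logits / temperatures, targets)

# Usage on a toy long-tailed setup with 10 classes.
counts = torch.tensor([5000, 3000, 2000, 1000, 500, 300, 200, 100, 50, 25])
temps = class_temperatures(counts)
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
loss = cdt_cross_entropy(logits, targets, temps)
loss.backward()
```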