Ablation study of self-supervised learning for image classification
- URL: http://arxiv.org/abs/2112.02297v1
- Date: Sat, 4 Dec 2021 09:59:01 GMT
- Title: Ablation study of self-supervised learning for image classification
- Authors: Ilias Papastratis
- Abstract summary: This project focuses on the self-supervised training of convolutional neural networks (CNNs) and transformer networks for the task of image recognition.
A simple Siamese network with different backbones is used to maximize the similarity between two augmented views of the same source image.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This project focuses on the self-supervised training of convolutional neural
networks (CNNs) and transformer networks for the task of image recognition. A
simple Siamese network with different backbones is used to maximize the
similarity between two augmented views of the same source image.
In this way, the backbone is able to learn visual information without
supervision. Finally, the method is evaluated on three image recognition
datasets.
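The core objective described above, pulling together the embeddings of two augmented views of one image, can be illustrated with a minimal sketch. This is a generic SimSiam-style negative cosine similarity loss, not necessarily the exact loss used in the paper; the function names and the example vectors are hypothetical, and real training would add a predictor head and a stop-gradient.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors (plain lists).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def siamese_similarity_loss(z1, z2):
    # Symmetric negative cosine similarity: minimized (at -1) when the
    # embeddings of the two augmented views point in the same direction.
    return -0.5 * (cosine_similarity(z1, z2) + cosine_similarity(z2, z1))

# Hypothetical backbone outputs for two augmented views of one image.
z1 = [0.8, 0.1, 0.3]
z2 = [0.7, 0.2, 0.4]
print(siamese_similarity_loss(z1, z2))
```

Minimizing this loss over many images is what lets the backbone learn visual features without labels.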
Related papers
- HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation [106.09886920774002]
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network.
Our method achieves consistent improvements over a baseline trained from scratch and significantly outperforms existing schemes.
arXiv Detail & Related papers (2024-03-18T14:18:08Z) - AbHE: All Attention-based Homography Estimation [0.0]
We propose a strong baseline model based on the Swin Transformer, which combines a convolutional neural network for local features with a transformer module for global features.
In the homography regression stage, we adopt an attention layer for the channels of correlation volume, which can drop out some weak correlation feature points.
Experiments show that in 8-degree-of-freedom (DOF) homography estimation our method outperforms the state-of-the-art method.
arXiv Detail & Related papers (2022-12-06T15:00:00Z) - Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z) - Fusion of evidential CNN classifiers for image classification [6.230751621285322]
We propose an information-fusion approach based on belief functions to combine convolutional neural networks.
In this approach, several pre-trained DS-based CNN architectures extract features from input images and convert them into mass functions on different frames of discernment.
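The combination of mass functions mentioned above can be illustrated with Dempster's rule of combination, the standard fusion operator for belief functions. This is a generic sketch of the rule, not the paper's architecture; the frame-of-discernment example below is hypothetical.

```python
def dempster_combine(m1, m2):
    # m1, m2: mass functions as dicts mapping frozenset focal elements
    # to masses. Dempster's rule multiplies masses of intersecting focal
    # elements and renormalizes by the non-conflicting mass.
    combined = {}
    conflict = 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc  # mass assigned to the empty set
    norm = 1.0 - conflict
    return {a: v / norm for a, v in combined.items()}

# Hypothetical two-class frame {cat, dog} with two classifier outputs.
m1 = {frozenset({"cat"}): 0.6, frozenset({"cat", "dog"}): 0.4}
m2 = {frozenset({"cat"}): 0.5, frozenset({"dog"}): 0.3,
      frozenset({"cat", "dog"}): 0.2}
print(dempster_combine(m1, m2))
```

The combined masses again sum to one, so the fused output remains a valid mass function over the frame of discernment.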
arXiv Detail & Related papers (2021-08-23T15:12:26Z) - Comparative evaluation of CNN architectures for Image Caption Generation [1.2183405753834562]
We have evaluated 17 different Convolutional Neural Networks on two popular Image Caption Generation frameworks.
We observe that the complexity of a Convolutional Neural Network, as measured by its number of parameters, and its accuracy on the object recognition task do not necessarily correlate with its efficacy as a feature extractor for the Image Caption Generation task.
arXiv Detail & Related papers (2021-02-23T05:43:54Z) - Mutual Information Maximization on Disentangled Representations for Differential Morph Detection [29.51265709271036]
We present a novel differential morph detection framework, utilizing landmark and appearance disentanglement.
The proposed framework can provide state-of-the-art differential morph detection performance.
arXiv Detail & Related papers (2020-12-02T21:31:02Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets). Inspired by the structure of the human visual system, we propose integrating a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z) - Image Retrieval using Multi-scale CNN Features Pooling [26.811290793232313]
We present an end-to-end trainable network architecture that exploits a novel multi-scale local pooling based on NetVLAD and a triplet mining procedure based on samples difficulty to obtain an effective image representation.
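The difficulty-based triplet mining mentioned above can be sketched with a standard hard-negative strategy: for each anchor, pick the negative closest to it and apply the triplet margin loss. This is an illustrative sketch of one common mining scheme, not the paper's exact procedure; all names and values are hypothetical.

```python
def euclidean(a, b):
    # Euclidean distance between two embedding vectors (plain lists).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hardest_negative_triplet_loss(anchor, positive, negatives, margin=0.2):
    # Difficulty-based mining: the hardest negative is the one nearest
    # the anchor, since it violates the margin the most.
    hardest = min(negatives, key=lambda n: euclidean(anchor, n))
    return max(0.0,
               euclidean(anchor, positive)
               - euclidean(anchor, hardest)
               + margin)

# Hypothetical embeddings: one anchor, one positive, two negatives.
loss = hardest_negative_triplet_loss(
    [0.0, 0.0], [0.1, 0.0], [[1.0, 0.0], [0.5, 0.0]])
print(loss)
```

Mining hard triplets this way concentrates the gradient on the examples the current embedding gets nearly wrong, which is what makes the representation discriminative for retrieval.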
arXiv Detail & Related papers (2020-04-21T00:57:52Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.