Ablation study of self-supervised learning for image classification
- URL: http://arxiv.org/abs/2112.02297v1
- Date: Sat, 4 Dec 2021 09:59:01 GMT
- Title: Ablation study of self-supervised learning for image classification
- Authors: Ilias Papastratis
- Abstract summary: This project focuses on the self-supervised training of convolutional neural networks (CNNs) and transformer networks for the task of image recognition.
A simple Siamese network with different backbones is used to maximize the similarity between two augmented views of the same source image.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This project focuses on the self-supervised training of convolutional neural
networks (CNNs) and transformer networks for the task of image recognition. A
simple Siamese network with different backbones is used to maximize the
similarity between two augmented views of the same source image.
In this way, the backbone is able to learn visual information without
supervision. Finally, the method is evaluated on three image recognition
datasets.
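The core objective described above, pulling together the embeddings of two augmented views of one image, can be illustrated with a minimal sketch. This is a generic SimSiam-style negative cosine similarity loss, not necessarily the exact loss used in the paper; the function names and the example vectors are hypothetical, and real training would add a predictor head and a stop-gradient.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors (plain lists).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def siamese_similarity_loss(z1, z2):
    # Symmetric negative cosine similarity: minimized (at -1) when the
    # embeddings of the two augmented views point in the same direction.
    return -0.5 * (cosine_similarity(z1, z2) + cosine_similarity(z2, z1))

# Hypothetical backbone outputs for two augmented views of one image.
z1 = [0.8, 0.1, 0.3]
z2 = [0.7, 0.2, 0.4]
print(siamese_similarity_loss(z1, z2))
```

Minimizing this loss over many images is what lets the backbone learn visual features without labels.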
Related papers
- HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation [106.09886920774002]
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network.
Our method achieves consistent improvements over a baseline trained from scratch and significantly outperforms existing schemes.
arXiv Detail & Related papers (2024-03-18T14:18:08Z) - AbHE: All Attention-based Homography Estimation [0.0]
We propose a strong baseline model based on the Swin Transformer, which combines a convolutional neural network for local features with a transformer module for global features.
In the homography regression stage, we adopt an attention layer for the channels of correlation volume, which can drop out some weak correlation feature points.
Experiments show that in 8-degree-of-freedom (DOF) homography estimation our method outperforms the state-of-the-art method.
arXiv Detail & Related papers (2022-12-06T15:00:00Z) - Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z) - Fusion of evidential CNN classifiers for image classification [6.230751621285322]
We propose an information-fusion approach based on belief functions to combine convolutional neural networks.
In this approach, several pre-trained DS-based CNN architectures extract features from input images and convert them into mass functions on different frames of discernment.
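The combination of mass functions mentioned above can be illustrated with Dempster's rule of combination, the standard fusion operator for belief functions. This is a generic sketch of the rule, not the paper's architecture; the frame-of-discernment example below is hypothetical.

```python
def dempster_combine(m1, m2):
    # m1, m2: mass functions as dicts mapping frozenset focal elements
    # to masses. Dempster's rule multiplies masses of intersecting focal
    # elements and renormalizes by the non-conflicting mass.
    combined = {}
    conflict = 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc  # mass assigned to the empty set
    norm = 1.0 - conflict
    return {a: v / norm for a, v in combined.items()}

# Hypothetical two-class frame {cat, dog} with two classifier outputs.
m1 = {frozenset({"cat"}): 0.6, frozenset({"cat", "dog"}): 0.4}
m2 = {frozenset({"cat"}): 0.5, frozenset({"dog"}): 0.3,
      frozenset({"cat", "dog"}): 0.2}
print(dempster_combine(m1, m2))
```

The combined masses again sum to one, so the fused output remains a valid mass function over the frame of discernment.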
arXiv Detail & Related papers (2021-08-23T15:12:26Z) - Comparative evaluation of CNN architectures for Image Caption Generation [1.2183405753834562]
We have evaluated 17 different Convolutional Neural Networks on two popular Image Caption Generation frameworks.
We observe that the complexity of a Convolutional Neural Network, as measured by its number of parameters, and its accuracy on the object recognition task do not necessarily correlate with its efficacy as a feature extractor for the Image Caption Generation task.
arXiv Detail & Related papers (2021-02-23T05:43:54Z) - Mutual Information Maximization on Disentangled Representations for Differential Morph Detection [29.51265709271036]
We present a novel differential morph detection framework, utilizing landmark and appearance disentanglement.
The proposed framework can provide state-of-the-art differential morph detection performance.
arXiv Detail & Related papers (2020-12-02T21:31:02Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets). Inspired by the structure of the human visual system, we propose integrating a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z) - Image Retrieval using Multi-scale CNN Features Pooling [26.811290793232313]
We present an end-to-end trainable network architecture that exploits a novel multi-scale local pooling based on NetVLAD and a triplet mining procedure based on samples difficulty to obtain an effective image representation.
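The difficulty-based triplet mining mentioned above can be sketched with a standard hard-negative strategy: for each anchor, pick the negative closest to it and apply the triplet margin loss. This is an illustrative sketch of one common mining scheme, not the paper's exact procedure; all names and values are hypothetical.

```python
def euclidean(a, b):
    # Euclidean distance between two embedding vectors (plain lists).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hardest_negative_triplet_loss(anchor, positive, negatives, margin=0.2):
    # Difficulty-based mining: the hardest negative is the one nearest
    # the anchor, since it violates the margin the most.
    hardest = min(negatives, key=lambda n: euclidean(anchor, n))
    return max(0.0,
               euclidean(anchor, positive)
               - euclidean(anchor, hardest)
               + margin)

# Hypothetical embeddings: one anchor, one positive, two negatives.
loss = hardest_negative_triplet_loss(
    [0.0, 0.0], [0.1, 0.0], [[1.0, 0.0], [0.5, 0.0]])
print(loss)
```

Mining hard triplets this way concentrates the gradient on the examples the current embedding gets nearly wrong, which is what makes the representation discriminative for retrieval.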
arXiv Detail & Related papers (2020-04-21T00:57:52Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.