Mixture of Self-Supervised Learning
- URL: http://arxiv.org/abs/2307.14897v1
- Date: Thu, 27 Jul 2023 14:38:32 GMT
- Title: Mixture of Self-Supervised Learning
- Authors: Aristo Renaldo Ruslim, Novanto Yudistira, Budi Darma Setiawan
- Abstract summary: Self-supervised learning works by training the model on a pretext task before applying it to a specific task.
Previous studies have used only one type of transformation as a pretext task.
This raises the question of how performance is affected when more than one pretext task is used, with a gating network combining all pretext tasks.
- Score: 2.191505742658975
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Self-supervised learning is a popular method because it can learn
features from images without labels and can overcome the limited labeled
datasets that constrain supervised learning. Self-supervised learning works by
training the model on a pretext task before applying it to a specific task.
Examples of pretext tasks used in self-supervised learning for image
recognition include rotation prediction, solving jigsaw puzzles, and predicting
relative positions in an image. Previous studies have used only one type of
transformation as a pretext task. This raises the question of how performance
is affected when more than one pretext task is used, with a gating network
combining all pretext tasks. Therefore, we propose the Gated Self-Supervised
Learning method to improve image classification. It uses more than one
transformation as a pretext task and uses the Mixture of Experts architecture
as a gating network to combine the pretext tasks, so that the model can
automatically learn to focus on the augmentations most useful for
classification. We test the performance of the proposed method in several
scenarios, namely imbalanced CIFAR dataset classification, adversarial
perturbations, Tiny-Imagenet dataset classification, and semi-supervised
learning. Moreover, Grad-CAM and t-SNE analyses are used to examine how well
the proposed method identifies important features that influence image
classification, represents the data of each class, and separates different
classes. Our code is available at
https://github.com/aristorenaldo/G-SSL
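
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes, for illustration only, that each pretext task yields a scalar loss and that a softmax over gate logits weights those losses before they are added to the supervised loss. The helper names (`rotate90`, `rotation_pretext_batch`, `gated_loss`) are hypothetical, and in a real Mixture of Experts setup the gate logits would be produced by a small network from the input features rather than passed in as plain numbers.

```python
import math


def rotate90(img, k):
    """Rotate a 2D list-of-lists image by k * 90 degrees counter-clockwise."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one 90-degree CCW turn.
        img = [list(row) for row in zip(*img)][::-1]
    return img


def rotation_pretext_batch(img):
    """Build rotation-prediction samples: (rotated image, label k) for k = 0..3."""
    return [(rotate90(img, k), k) for k in range(4)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_loss(supervised_loss, pretext_losses, gate_logits):
    """Combine per-task pretext losses with softmax gate weights.

    Assumption (not taken from the paper): the total loss is the supervised
    loss plus the gate-weighted sum of the pretext-task losses.
    """
    weights = softmax(gate_logits)
    auxiliary = sum(w * l for w, l in zip(weights, pretext_losses))
    return supervised_loss + auxiliary, weights


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]
    for rotated, label in rotation_pretext_batch(image):
        print(label, rotated)

    # Hypothetical losses for three pretext tasks (e.g. rotation, jigsaw,
    # relative position) and gate logits favouring the first task.
    total, weights = gated_loss(1.0, [0.9, 1.4, 1.1], [2.0, 0.0, 0.5])
    print(total, weights)
```

Because the weights sum to one, a gate that concentrates on one pretext task reduces the contribution of the others, which is one way a model could "focus more on the most useful augmentations".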
Related papers
- Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning [12.5354658533836]
Humans possess a remarkable ability to accurately classify new, unseen images after being exposed to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images with limited samples presents a challenge.
We propose an intra-task mutual attention method for few-shot learning that involves splitting the support and query samples into patches.
(2024-05-06T02:02:57Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
(2023-07-18T15:46:20Z)
- Learning Transferable Pedestrian Representation from Multimodal Information Supervision [174.5150760804929]
VAL-PAT is a novel framework that learns transferable representations to enhance various pedestrian analysis tasks with multimodal information.
We first perform pre-training on the LUPerson-TA dataset, where each image contains text and attribute annotations.
We then transfer the learned representations to various downstream tasks, including person reID, person attribute recognition and text-based person search.
(2023-04-12T01:20:58Z)
- Self-Supervised Learning for Fine-Grained Image Classification [0.0]
Fine-grained datasets usually provide bounding box annotations along with class labels to aid the process of classification.
On the other hand, self-supervised learning exploits the freely available data to generate supervisory signals which act as labels.
Our idea is to leverage self-supervision such that the model learns useful representations of fine-grained image classes.
(2021-07-29T14:01:31Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
(2021-07-16T07:46:41Z)
- Exploiting the relationship between visual and textual features in social networks for image classification with zero-shot deep learning [0.0]
In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture.
Our experiments, based on image classification tasks according to the labels of the Places dataset, are performed by first considering only the visual part.
Considering the texts associated with the images can help to improve the accuracy, depending on the goal.
(2021-07-08T10:54:59Z)
- Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition [38.49419948988415]
Deep networks can learn to accurately recognize objects of a category by training on a large number of images.
A meta-learning challenge known as low-shot image recognition arises when only a few annotated images are available for learning a recognition model for a category.
Our method, called Cascaded Feature Matching Network (CFMN), is proposed to solve this problem.
Experiments for few-shot learning on two standard datasets, miniImageNet and Omniglot, have confirmed the effectiveness of our method.
(2021-01-13T11:37:28Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
(2020-09-08T07:29:05Z)
- SCAN: Learning to Classify Images without Labels [73.69513783788622]
We advocate a two-step approach where feature learning and clustering are decoupled.
A self-supervised task from representation learning is employed to obtain semantically meaningful features.
We obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime.
(2020-05-25T18:12:33Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
(2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.