Self-supervised Feature Enhancement: Applying Internal Pretext Task to
Supervised Learning
- URL: http://arxiv.org/abs/2106.04921v1
- Date: Wed, 9 Jun 2021 08:59:35 GMT
- Title: Self-supervised Feature Enhancement: Applying Internal Pretext Task to
Supervised Learning
- Authors: Yuhang Yang, Zilin Ding, Xuan Cheng, Xiaomin Wang, Ming Liu
- Abstract summary: We show that feature transformations within CNNs can also be regarded as supervisory signals to construct the self-supervised task.
Specifically, we first transform the internal feature maps by discarding different channels, and then define an additional internal pretext task to identify the discarded channels.
CNNs are trained to predict the joint labels generated by the combination of self-supervised labels and original labels.
- Score: 6.508466234920147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional self-supervised learning requires CNNs using external pretext
tasks (i.e., image- or video-based tasks) to encode high-level semantic visual
representations. In this paper, we show that feature transformations within
CNNs can also be regarded as supervisory signals to construct the
self-supervised task, which we call the \emph{internal pretext task}. Such a task can
be applied to enhance supervised learning. Specifically, we first
transform the internal feature maps by discarding different channels, and then
define an additional internal pretext task to identify the discarded channels.
CNNs are trained to predict the joint labels generated by the combination of
self-supervised labels and the original labels. By doing so, we let the CNN
know which channels are missing while it classifies, in the hope of mining
richer feature information. Extensive experiments show that our approach is
effective across various models and datasets while incurring only negligible
computational overhead. Furthermore, our approach is compatible with other
methods and can be combined with them for better results.
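The channel-discarding pretext task and the joint-label construction described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the tiny backbone, the contiguous-group discarding scheme, and all names (`discard_channels`, `joint_label`, `NUM_TRANSFORMS`) are assumptions made for clarity.

```python
# Hypothetical sketch of the internal pretext task: zero out one group of
# internal feature channels, then classify into joint (class, transform) labels.
import torch
import torch.nn as nn

NUM_CLASSES = 10      # original classification labels
NUM_TRANSFORMS = 4    # number of channel-discarding patterns (pretext labels)

def discard_channels(features: torch.Tensor, transform_id: int) -> torch.Tensor:
    """Zero out one contiguous group of channels, selected by transform_id."""
    n, c, h, w = features.shape
    group = c // NUM_TRANSFORMS
    out = features.clone()
    out[:, transform_id * group:(transform_id + 1) * group] = 0.0
    return out

class InternalPretextNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # One joint class per (original label, pretext label) pair.
        self.head = nn.Linear(16, NUM_CLASSES * NUM_TRANSFORMS)

    def forward(self, x: torch.Tensor, transform_id: int) -> torch.Tensor:
        f = self.backbone[0:2](x)            # internal feature maps
        f = discard_channels(f, transform_id)  # apply the internal transformation
        f = self.backbone[2](f).flatten(1)
        return self.head(f)

def joint_label(original: int, transform_id: int) -> int:
    """Combine the original label with the self-supervised label."""
    return original * NUM_TRANSFORMS + transform_id
```

Training would then draw a random `transform_id` per sample and use standard cross-entropy on the joint labels, so the network must identify the discarded channels in addition to the object class.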
Related papers
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z)
- SERE: Exploring Feature Self-relation for Self-supervised Transformer [79.5769147071757]
Vision transformers (ViT) have strong representation ability with spatial self-attention and channel-level feedforward networks.
Recent works reveal that self-supervised learning helps unleash the great potential of ViT.
We observe that relational modeling on spatial and channel dimensions distinguishes ViT from other networks.
arXiv Detail & Related papers (2022-06-10T15:25:00Z)
- Self-supervision of Feature Transformation for Further Improving Supervised Learning [6.508466234920147]
We find that features in CNNs can be also used for self-supervision.
In our task, we discard particular regions of the features and then train the model to distinguish the resulting features.
Original labels will be expanded to joint labels via self-supervision of feature transformations.
arXiv Detail & Related papers (2021-06-09T09:06:33Z)
- Wider Vision: Enriching Convolutional Neural Networks via Alignment to External Knowledge Bases [0.3867363075280543]
We aim to explain and expand CNN models by mirroring or aligning the CNN to an external knowledge base.
This will allow us to give a semantic context or label for each visual feature.
Our results show that in the aligned embedding space, nodes from the knowledge graph are close to the CNN feature nodes that have similar meanings.
arXiv Detail & Related papers (2021-02-22T16:00:03Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- A CNN-based Feature Space for Semi-supervised Incremental Learning in Assisted Living Applications [2.1485350418225244]
We propose using the feature space that results from the training dataset to automatically label problematic images.
The resulting semi-supervised incremental learning process allows improving the classification accuracy of new instances by 40%.
arXiv Detail & Related papers (2020-11-11T12:31:48Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantees learning a good representation.
We prove that a linear layer yields small approximation error even for complex ground-truth function classes.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
- Decoding CNN based Object Classifier Using Visualization [6.666597301197889]
We visualize what types of features are extracted in different convolutional layers of a CNN.
Visualizing heat maps of activations helps us understand how a CNN classifies and localizes different objects in an image.
arXiv Detail & Related papers (2020-07-15T05:01:27Z)
- How Useful is Self-Supervised Pretraining for Visual Tasks? [133.1984299177874]
We evaluate various self-supervised algorithms across a comprehensive array of synthetic datasets and downstream tasks.
Our experiments offer insights into how the utility of self-supervision changes as the number of available labels grows.
arXiv Detail & Related papers (2020-03-31T16:03:22Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
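The feature-smoothing curriculum described in the last entry could look roughly like the sketch below: depthwise Gaussian (low-pass) filtering of feature maps, with the filter strength annealed toward (near) identity as training progresses. The kernel size, schedule, and function names are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch of curriculum-by-smoothing on CNN feature maps.
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma: float, size: int = 5) -> torch.Tensor:
    """Normalized 2-D Gaussian kernel built from a separable 1-D Gaussian."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)

def smooth_features(features: torch.Tensor, sigma: float) -> torch.Tensor:
    """Low-pass filter each channel independently (depthwise convolution)."""
    c = features.shape[1]
    k = gaussian_kernel(sigma).expand(c, 1, -1, -1).contiguous()
    return F.conv2d(features, k, padding=2, groups=c)

def sigma_schedule(epoch: int, total_epochs: int,
                   start: float = 1.0, end: float = 0.01) -> float:
    """Linearly anneal sigma so early epochs see heavily smoothed features
    and later epochs see progressively sharper ones."""
    t = epoch / max(total_epochs - 1, 1)
    return start + t * (end - start)
```

In training, `smooth_features(f, sigma_schedule(epoch, total_epochs))` would be applied after selected convolutional blocks, so the amount of information in the feature maps increases over the course of training.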
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.