Ladder Siamese Network: a Method and Insights for Multi-level
Self-Supervised Learning
- URL: http://arxiv.org/abs/2211.13844v1
- Date: Fri, 25 Nov 2022 00:49:25 GMT
- Title: Ladder Siamese Network: a Method and Insights for Multi-level
Self-Supervised Learning
- Authors: Ryota Yoshihashi, Shuhei Nishimura, Dai Yonebayashi, Yuya Otsuka,
Tomohiro Tanaka, Takashi Miyazaki
- Abstract summary: Siamese-network-based self-supervised learning (SSL) suffers from slow convergence and instability in training.
We propose a framework, called the Ladder Siamese Network, that exploits intermediate self-supervision at each stage of a deep network.
- Score: 9.257121691188008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Siamese-network-based self-supervised learning (SSL) suffers from slow
convergence and instability in training. To alleviate this, we propose a
framework that exploits intermediate self-supervision at each stage of deep
nets, called the Ladder Siamese Network. Our self-supervised losses encourage
the intermediate layers to be consistent across different data augmentations
of the same sample, which facilitates training progress and enhances the
discriminative ability of the intermediate layers themselves. While some
existing work has already utilized multi-level self-supervision in SSL, ours
differs in that 1) we reveal its usefulness with non-contrastive Siamese
frameworks from both theoretical and empirical viewpoints, and 2) ours improves
image-level classification, instance-level detection, and pixel-level
segmentation simultaneously. Experiments show that the proposed framework
improves BYOL baselines by 1.0 percentage points in ImageNet linear
classification, 1.2 points in COCO detection, and 3.1 points in PASCAL VOC
segmentation. Compared with state-of-the-art methods, our Ladder-based model
achieves competitive and balanced performance on all tested benchmarks without
a large degradation on any of them.
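To make the mechanism concrete, the following is a minimal sketch of multi-level self-supervision on a Siamese setup: a projection head and predictor are attached to each backbone stage, and a BYOL-style consistency loss between two augmented views is summed over stages. All names (StageProjector, LadderBranch, ladder_loss), the toy three-stage CNN, and the per-stage loss weights are illustrative assumptions rather than the authors' released code; for brevity, BYOL's momentum target branch is replaced by a stop-gradient on the same branch.

```python
# Illustrative sketch of ladder-style multi-level self-supervision (not the
# authors' implementation). Each backbone stage gets its own projector and
# predictor; per-stage consistency losses between two augmented views are summed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StageProjector(nn.Module):
    """Pools an intermediate feature map and projects it to an embedding."""
    def __init__(self, in_ch, dim=128):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(nn.Linear(in_ch, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.mlp(self.pool(x).flatten(1))

def consistency_loss(p, z):
    """Negative-cosine-style loss between a prediction p and a detached target z."""
    return 2 - 2 * F.cosine_similarity(p, z.detach(), dim=-1).mean()

class LadderBranch(nn.Module):
    """Toy three-stage CNN with a projector and predictor per stage."""
    def __init__(self, dim=128):
        super().__init__()
        chs = [32, 64, 128]
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())
            for c_in, c_out in zip([3] + chs[:-1], chs)
        ])
        self.projectors = nn.ModuleList([StageProjector(c, dim) for c in chs])
        self.predictors = nn.ModuleList([nn.Linear(dim, dim) for _ in chs])

    def forward(self, x):
        outs = []
        for stage, proj, pred in zip(self.stages, self.projectors, self.predictors):
            x = stage(x)
            z = proj(x)
            outs.append((pred(z), z))  # (prediction, projection) for this stage
        return outs

def ladder_loss(outs1, outs2, weights=(0.5, 0.5, 1.0)):
    """Weighted sum of symmetric per-stage consistency losses between two views."""
    loss = 0.0
    for w, (p1, z1), (p2, z2) in zip(weights, outs1, outs2):
        loss = loss + w * 0.5 * (consistency_loss(p1, z2) + consistency_loss(p2, z1))
    return loss

# Usage: two augmented views of the same batch pass through the shared branch.
branch = LadderBranch()
view1, view2 = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
loss = ladder_loss(branch(view1), branch(view2))
loss.backward()
```

How the per-stage losses are weighted (and whether every stage receives a loss) is a design choice; the weights above are placeholders, not values taken from the paper.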
Related papers
- Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation [21.345548821276097]
Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels.
We propose an end-to-end WSSS model incorporating guided CAMs, wherein our segmentation model is trained while concurrently optimizing CAMs online.
CoSA is the first single-stage approach to outperform all existing multi-stage methods including those with additional supervision.
arXiv Detail & Related papers (2024-02-27T21:08:23Z) - Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance.
We develop a framework that permits data-dependent weighted cross-entropy losses.
Our method outperforms both in multiple evaluation metrics on ImageNet-1K.
arXiv Detail & Related papers (2022-11-18T02:00:17Z) - Few-Shot Classification with Contrastive Learning [10.236150550121163]
We propose a novel contrastive learning-based framework that seamlessly integrates contrastive learning into both stages.
In the meta-training stage, we propose a cross-view episodic training mechanism to perform the nearest centroid classification on two different views of the same episode.
These strategies force the model to overcome the bias between views and promote the transferability of representations.
arXiv Detail & Related papers (2022-09-17T02:39:09Z) - Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and
Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision: it simultaneously predicts several attentive low-rank (LR) representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods with a variety of different settings.
arXiv Detail & Related papers (2022-03-19T09:19:55Z) - CoSSL: Co-Learning of Representation and Classifier for Imbalanced
Semi-Supervised Learning [98.89092930354273]
We propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.
To handle the data imbalance, we devise Tail-class Feature Enhancement (TFE) for classifier learning.
In experiments, we show that our approach outperforms other methods over a large range of shifted distributions.
arXiv Detail & Related papers (2021-12-08T20:13:13Z) - Self-Distilled Self-Supervised Representation Learning [35.60243157730165]
State-of-the-art frameworks in self-supervised learning have recently shown that fully utilizing transformer-based models can lead to a performance boost.
In our work, we further exploit this by allowing the intermediate representations to learn from the final layers via the contrastive loss.
Our method, Self-Distilled Self-Supervised Learning (SDSSL), outperforms competitive baselines (SimCLR, BYOL and MoCo v3) using ViT on various tasks and datasets.
arXiv Detail & Related papers (2021-11-25T07:52:36Z) - Weakly Supervised Person Search with Region Siamese Networks [65.76237418040071]
Supervised learning is dominant in person search, but it requires elaborate labeling of bounding boxes and identities.
We present a weakly supervised setting where only bounding box annotations are available.
Our model achieves a rank-1 accuracy of 87.1% and an mAP of 86.0% on the CUHK-SYSU benchmark.
arXiv Detail & Related papers (2021-09-13T16:33:27Z) - Semi-supervised Contrastive Learning with Similarity Co-calibration [72.38187308270135]
We propose a novel training strategy, termed Semi-supervised Contrastive Learning (SsCL).
SsCL combines the well-known contrastive loss in self-supervised learning with the cross entropy loss in semi-supervised learning.
We show that SsCL produces more discriminative representations and is beneficial to few-shot learning.
arXiv Detail & Related papers (2021-05-16T09:13:56Z) - Dense Contrastive Learning for Self-Supervised Visual Pre-Training [102.15325936477362]
We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images (see the sketch after this list).
Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only 1% slower).
arXiv Detail & Related papers (2020-11-18T08:42:32Z)
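As a companion to the Dense Contrastive Learning entry above, here is a minimal sketch of a pixel-level pairwise contrastive loss between two views. Pairing each location in one view with its most similar location in the other, and using the remaining locations as negatives, are simplifications; the function name, shapes, and temperature are illustrative assumptions rather than that paper's implementation.

```python
# Illustrative pixel-level contrastive loss between dense features of two views
# (a simplified stand-in, not the DenseCL implementation).
import torch
import torch.nn.functional as F

def dense_contrastive_loss(f1, f2, temperature=0.2):
    """f1, f2: (B, C, H, W) dense projections of two augmented views of the same images."""
    B, C, H, W = f1.shape
    # Flatten spatial dimensions: queries from view 1, keys from view 2, L2-normalized.
    q = F.normalize(f1.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    k = F.normalize(f2.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    sim = torch.bmm(q, k.transpose(1, 2))                   # (B, HW, HW) cosine similarities
    pos_idx = sim.argmax(dim=-1)                             # most similar key = positive match
    logits = sim / temperature                               # all other keys act as negatives
    return F.cross_entropy(logits.reshape(B * H * W, H * W), pos_idx.reshape(B * H * W))

# Usage with random tensors standing in for dense projector outputs:
f1 = torch.randn(2, 128, 7, 7, requires_grad=True)
f2 = torch.randn(2, 128, 7, 7, requires_grad=True)
dense_contrastive_loss(f1, f2).backward()
```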