Pretext-Contrastive Learning: Toward Good Practices in Self-supervised
Video Representation Leaning
- URL: http://arxiv.org/abs/2010.15464v2
- Date: Sun, 4 Apr 2021 14:42:00 GMT
- Title: Pretext-Contrastive Learning: Toward Good Practices in Self-supervised
Video Representation Leaning
- Authors: Li Tao, Xueting Wang, Toshihiko Yamasaki
- Abstract summary: We propose a joint optimization framework, Pretext-Contrastive Learning (PCL), that boosts both the pretext task and contrastive learning.
PCL can conveniently be treated as a standard training strategy and applied to many other works in self-supervised video feature learning.
- Score: 43.002621928500425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, pretext-task based methods have been proposed one after
another in self-supervised video feature learning. Meanwhile, contrastive
learning methods also yield good performance. New methods are usually claimed
to beat previous ones because they capture "better" temporal information, but
there are setting differences among them and it is hard to conclude which is
actually better. Comparisons would be much more convincing if these methods
had come as close to their performance limits as possible. In this paper, we
start from one pretext-task baseline and explore how far it can go when
combined with contrastive learning, data pre-processing, and data
augmentation. Extensive experiments identify a proper setting with which large
improvements over the baselines can be achieved, indicating that a joint
optimization framework can boost both the pretext task and contrastive
learning. We denote this joint optimization framework as Pretext-Contrastive
Learning (PCL). The other two pretext-task baselines are used to validate the
effectiveness of PCL, and under the same training protocol we easily
outperform current state-of-the-art methods, showing the effectiveness and
generality of our proposal. PCL can conveniently be treated as a standard
training strategy and applied to many other works in self-supervised video
feature learning.
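
At its core, the PCL objective described in the abstract is a combination of a pretext-task loss and a contrastive loss computed on the same backbone. The sketch below illustrates that idea in PyTorch with a generic classification-style pretext head and an InfoNCE contrastive term; the function names, the projection head, and the weighting factor lam are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of a PCL-style joint objective (illustrative assumptions, not the
# paper's exact setup): a video backbone `encoder`, a pretext head trained with
# cross-entropy on transformation labels, and an InfoNCE term over two clips of
# the same video, combined through a weighting factor `lam`.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between two batches of embeddings; matching indices are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                    # (B, B) cosine-similarity logits
    targets = torch.arange(z1.size(0), device=z1.device)  # positive pair is the diagonal
    return F.cross_entropy(logits, targets)

def pcl_loss(encoder, pretext_head, proj_head, clip_a, clip_b, pretext_labels, lam=1.0):
    """Joint objective: pretext-task loss + lam * contrastive loss on shared features."""
    feat_a = encoder(clip_a)   # features of one augmented clip
    feat_b = encoder(clip_b)   # features of a second augmented clip of the same video

    # Pretext branch: predict the label of the applied transformation (e.g. speed, order).
    loss_pretext = F.cross_entropy(pretext_head(feat_a), pretext_labels)

    # Contrastive branch: clips from the same video are positives, others are negatives.
    loss_contrastive = info_nce(proj_head(feat_a), proj_head(feat_b))

    return loss_pretext + lam * loss_contrastive
```

Because the contrastive term only adds a loss on features the pretext baseline already computes, the same wrapper can in principle be applied to other pretext-task baselines unchanged, which is how the abstract frames PCL: a general training strategy rather than a new pretext task.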
Related papers
- When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective [57.05315507519704]
We propose a log-likelihood ratio (LLR) approach to analyze the comparative benefits of visual prompting and linear probing.
Our measure attains up to a 100-fold reduction in run time compared to full training, while achieving prediction accuracies up to 91%.
arXiv Detail & Related papers (2024-09-03T12:03:45Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning [32.18543787821028]
This paper proposes an adaptive technique of batch fusion for self-supervised contrastive learning.
It achieves state-of-the-art performance under equitable comparisons.
We suggest that the proposed method may contribute to the advancement of data-driven self-supervised learning research.
arXiv Detail & Related papers (2023-11-16T15:47:49Z)
- Improved baselines for vision-language pre-training [26.395527650984025]
We propose, implement and evaluate several baselines obtained by combining contrastive learning with self-supervised learning.
We find that these baselines outperform a basic implementation of CLIP, and that a simple CLIP baseline can also be improved substantially, up to a 25% relative improvement on downstream zero-shot tasks.
arXiv Detail & Related papers (2023-05-15T14:31:49Z)
- Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance.
We develop a framework that permits data-dependent weighted cross-entropy losses.
Our method outperforms both in multiple evaluation metrics on ImageNet-1K.
arXiv Detail & Related papers (2022-11-18T02:00:17Z)
- LEAVES: Learning Views for Time-Series Data in Contrastive Learning [16.84326709739788]
We propose a module for automating view generation for time-series data in contrastive learning, named learning views for time-series data (LEAVES).
The proposed method is more effective at finding reasonable views and performs better on downstream tasks than the baselines.
arXiv Detail & Related papers (2022-10-13T20:18:22Z)
- CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning [49.18591896085498]
We propose CUPID to bridge the domain gap between source and target data.
CUPID yields new state-of-the-art performance across multiple video-language and video tasks.
arXiv Detail & Related papers (2021-04-01T06:42:16Z)
- A Survey on Contrastive Self-supervised Learning [0.0]
Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets.
Contrastive learning has recently become a dominant component in self-supervised learning methods for computer vision, natural language processing (NLP), and other domains.
This paper provides an extensive review of self-supervised methods that follow the contrastive approach.
arXiv Detail & Related papers (2020-10-31T21:05:04Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.