Self-supervised Co-training for Video Representation Learning
- URL: http://arxiv.org/abs/2010.09709v2
- Date: Mon, 11 Jan 2021 20:53:18 GMT
- Title: Self-supervised Co-training for Video Representation Learning
- Authors: Tengda Han, Weidi Xie, Andrew Zisserman
- Abstract summary: We investigate the benefit of adding semantic-class positives to instance-based Info Noise Contrastive Estimation training.
We propose a novel self-supervised co-training scheme to improve the popular InfoNCE loss.
We evaluate the quality of the learnt representation on two different downstream tasks: action recognition and video retrieval.
- Score: 103.69904379356413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The objective of this paper is visual-only self-supervised video
representation learning. We make the following contributions: (i) we
investigate the benefit of adding semantic-class positives to instance-based
Info Noise Contrastive Estimation (InfoNCE) training, showing that this form of
supervised contrastive learning leads to a clear improvement in performance;
(ii) we propose a novel self-supervised co-training scheme to improve the
popular InfoNCE loss, exploiting the complementary information from different
views, RGB streams and optical flow, of the same data source by using one view
to obtain positive class samples for the other; (iii) we thoroughly evaluate
the quality of the learnt representation on two different downstream tasks:
action recognition and video retrieval. In both cases, the proposed approach
demonstrates state-of-the-art or comparable performance with other
self-supervised approaches, whilst being significantly more efficient to train,
i.e. requiring far less training data to achieve similar performance.
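
The idea in contributions (i) and (ii) can be illustrated with a small sketch. This is not the authors' implementation; it only assumes what the abstract states: an InfoNCE-style loss that accepts several positives per sample, with the positives for one view (here RGB) chosen using the other view (here optical flow). The function names, the top-k nearest-neighbour mining rule, and the temperature of 0.07 are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def mine_positives(other_view_emb, k):
    """Hypothetical mining rule: for sample i, mark as positives i itself
    plus its k nearest neighbours measured in the OTHER view's embedding.
    Returns a boolean mask of shape (N, N)."""
    z = l2_normalize(other_view_emb)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)             # exclude self from the top-k search
    topk = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar samples
    mask = np.eye(len(z), dtype=bool)          # the instance itself is always a positive
    np.put_along_axis(mask, topk, True, axis=1)
    return mask


def multi_positive_infonce(query, key, pos_mask, temperature=0.07):
    """InfoNCE with a set of positives per query:
    loss_i = -log( sum_{p in P_i} exp(q_i.k_p / t) / sum_j exp(q_i.k_j / t) ).
    With pos_mask equal to the identity this is instance-based InfoNCE."""
    logits = l2_normalize(query) @ l2_normalize(key).T / temperature
    pos_logits = np.where(pos_mask, logits, -np.inf)
    loss = logsumexp(logits, axis=1) - logsumexp(pos_logits, axis=1)
    return float(loss.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flow_emb = rng.normal(size=(8, 4))                    # stand-in optical-flow features
    rgb_q = rng.normal(size=(8, 16))                      # stand-in RGB features, augmentation 1
    rgb_k = rgb_q + 0.1 * rng.normal(size=(8, 16))        # stand-in RGB features, augmentation 2
    mask = mine_positives(flow_emb, k=2)                  # flow decides the RGB positives
    print("instance InfoNCE:", multi_positive_infonce(rgb_q, rgb_k, np.eye(8, dtype=bool)))
    print("co-trained InfoNCE:", multi_positive_infonce(rgb_q, rgb_k, mask))
```

Swapping the roles of the two embeddings gives the symmetric case, where RGB features select positives for the flow network. How the two directions are scheduled during training is not described in the abstract and is left out of this sketch.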
 
      
Related papers
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
 In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
 arXiv (2024-02-02T13:31:17Z)
- Adversarial Augmentation Training Makes Action Recognition Models More
  Robust to Realistic Video Distribution Shifts [13.752169303624147]
 Action recognition models often lack robustness when faced with natural distribution shifts between training and test data.
We propose two novel evaluation methods to assess model resilience to such distribution disparity.
We experimentally demonstrate the superior performance of the proposed adversarial augmentation approach over baselines across three state-of-the-art action recognition models.
 arXiv (2024-01-21T05:50:39Z)
- Revisiting Self-supervised Learning of Speech Representation from a
  Mutual Information Perspective [68.20531518525273]
 We take a closer look into existing self-supervised methods of speech from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
 arXiv (2024-01-16T21:13:22Z)
- From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning [32.18543787821028]
 This paper proposes an adaptive technique of batch fusion for self-supervised contrastive learning.
It achieves state-of-the-art performance under equitable comparisons.
We suggest that the proposed method may contribute to the advancement of data-driven self-supervised learning research.
 arXiv (2023-11-16T15:47:49Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
 We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
 arXiv (2023-06-16T21:51:04Z)
- Additional Positive Enables Better Representation Learning for Medical
  Images [17.787804928943057]
 This paper presents a new way to identify additional positive pairs for BYOL, a state-of-the-art (SOTA) self-supervised learning framework.
For each image, we select the most similar sample from other images as the additional positive and pull their features together with BYOL loss.
 Experimental results on two public medical datasets demonstrate that the proposed method can improve the classification performance.
 arXiv (2023-05-31T18:37:02Z)
- Contrastive Learning from Demonstrations [0.0]
 We show that these representations are applicable for imitating several robotic tasks, including pick and place.
We optimize a recently proposed self-supervised learning algorithm by applying contrastive learning to enhance task-relevant information.
 arXiv (2022-01-30T13:36:07Z)
- Co$^2$L: Contrastive Continual Learning [69.46643497220586]
 Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks.
We propose a rehearsal-based continual learning algorithm that focuses on continually learning and maintaining transferable representations.
 arXiv (2021-06-28T06:14:38Z)
- Memory-augmented Dense Predictive Coding for Video Representation
  Learning [103.69904379356413]
 We propose a new architecture and learning framework, Memory-augmented Dense Predictive Coding (MemDPC), for the task.
We investigate visual-only self-supervised video representation learning from RGB frames, or from unsupervised optical flow, or both.
In all cases, we demonstrate state-of-the-art or comparable performance over other approaches with orders of magnitude fewer training data.
 arXiv (2020-08-03T17:57:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     