Unsupervised Video Representation Learning by Bidirectional Feature
Prediction
- URL: http://arxiv.org/abs/2011.06037v1
- Date: Wed, 11 Nov 2020 19:42:31 GMT
- Title: Unsupervised Video Representation Learning by Bidirectional Feature
Prediction
- Authors: Nadine Behrmann and Juergen Gall and Mehdi Noroozi
- Abstract summary: This paper introduces a novel method for self-supervised video representation learning via feature prediction.
We argue that a supervisory signal arising from unobserved past frames is complementary to one that originates from the future frames.
We empirically show that utilizing both signals enriches the learned representations for the downstream task of action recognition.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel method for self-supervised video representation
learning via feature prediction. In contrast to the previous methods that focus
on future feature prediction, we argue that a supervisory signal arising from
unobserved past frames is complementary to one that originates from the future
frames. The rationale behind our method is to encourage the network to explore
the temporal structure of videos by distinguishing between future and past
given present observations. We train our model in a contrastive learning
framework, where joint encoding of future and past provides us with a
comprehensive set of temporal hard negatives via swapping. We empirically show
that utilizing both signals enriches the learned representations for the
downstream task of action recognition. It outperforms independent prediction of
future and past.
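
The contrastive setup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes an InfoNCE-style loss, cosine similarity, and precomputed embeddings; the function name, argument names, shapes, and the temperature value are all hypothetical. Each present embedding is matched against the jointly encoded (past, future) embedding of the same clip, while the swapped (future, past) encodings serve as temporal hard negatives alongside the other clips in the batch.

```python
import numpy as np


def _l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def contrastive_loss_with_swapped_negatives(present, joint, joint_swapped,
                                            temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    present:       (B, D) embeddings of the observed present clips.
    joint:         (B, D) joint embeddings of (past, future) in correct order.
    joint_swapped: (B, D) joint embeddings with past and future swapped;
                   used here as temporal hard negatives.
    """
    present = _l2_normalize(present)
    # Candidate set: B correctly ordered encodings followed by B swapped ones.
    candidates = _l2_normalize(np.concatenate([joint, joint_swapped], axis=0))

    logits = present @ candidates.T / temperature        # shape (B, 2B)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i is column i (same clip, correct temporal order).
    # All other columns, including the swapped encodings, act as negatives.
    idx = np.arange(present.shape[0])
    return float(-log_prob[idx, idx].mean())


# Toy usage with random vectors standing in for network outputs.
rng = np.random.default_rng(0)
joint = rng.normal(size=(4, 8))
joint_swapped = rng.normal(size=(4, 8))

# Present features aligned with the correctly ordered joint encoding.
loss_aligned = contrastive_loss_with_swapped_negatives(
    joint, joint, joint_swapped)
# Present features aligned with the swapped encoding instead.
loss_confused = contrastive_loss_with_swapped_negatives(
    joint_swapped, joint, joint_swapped)
```

In this toy setting the loss is low when the present features agree with the correctly ordered encoding and higher when they agree with the swapped one, which is the kind of pressure the abstract describes for distinguishing future from past.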