TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
- URL: http://arxiv.org/abs/2004.10141v2
- Date: Sat, 17 Jul 2021 10:54:57 GMT
- Title: TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
- Authors: Rami Ben-Ari, Mor Shpigel, Ophir Azulai, Udi Barzelay and Daniel
Rotman
- Abstract summary: We present the Temporal Aware Embedding Network (TAEN) for few-shot action recognition.
We demonstrate the effectiveness of TAEN on two few-shot tasks: video classification and temporal action detection.
By training just a few fully connected layers, we reach results comparable to prior art on both few-shot video classification and temporal detection tasks.
- Score: 10.07962673311661
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Classification of new class entities requires collecting and
annotating hundreds or thousands of samples, which is often prohibitively
costly. Few-shot learning suggests learning to classify new classes using
just a few examples. Only a small number of studies address the challenge of
few-shot learning on spatio-temporal patterns such as videos. In this paper,
we present the Temporal Aware Embedding Network (TAEN) for few-shot action
recognition, which learns to represent actions in a metric space as
trajectories, conveying both short-term semantics and longer-term
connectivity between action parts. We demonstrate the effectiveness of TAEN
on two few-shot tasks, video classification and temporal action detection,
and evaluate our method on the Kinetics-400 and ActivityNet 1.2 few-shot
benchmarks. By training just a few fully connected layers, we reach results
comparable to prior art on both few-shot video classification and temporal
detection tasks, while reaching state of the art in certain scenarios.
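
To make the trajectory representation described in the abstract concrete, the following is a minimal sketch, assuming pre-extracted clip-level features, a fixed number of temporal parts, support-set averaging, and a plain Euclidean trajectory distance; these choices are illustrative and not TAEN's exact formulation.

    # Minimal illustrative sketch (not the authors' code): a class is represented by a
    # "trajectory" of sub-action points in an embedding space, and a query video is
    # classified by its distance to each class trajectory.
    import numpy as np

    def video_trajectory(clip_features: np.ndarray, n_parts: int = 4) -> np.ndarray:
        """Split a video's clip-level features (T x D) into n_parts temporal segments,
        average each segment, and L2-normalise, giving an ordered (n_parts x D) trajectory."""
        segments = np.array_split(clip_features, n_parts, axis=0)
        traj = np.stack([seg.mean(axis=0) for seg in segments])
        return traj / np.linalg.norm(traj, axis=1, keepdims=True)

    def class_trajectory(support_videos: list, n_parts: int = 4) -> np.ndarray:
        """Average the trajectories of the few support videos of one class."""
        return np.mean([video_trajectory(v, n_parts) for v in support_videos], axis=0)

    def trajectory_distance(q: np.ndarray, c: np.ndarray) -> float:
        """Sum of point-wise distances between two aligned trajectories, so both the
        per-part (short-term) content and its temporal ordering influence the match."""
        return float(np.linalg.norm(q - c, axis=1).sum())

    # Usage: a 5-way 1-shot episode with random stand-in features (16 clips, 256 dims).
    rng = np.random.default_rng(0)
    support = {k: [rng.normal(size=(16, 256))] for k in range(5)}
    query = rng.normal(size=(20, 256))
    prototypes = {k: class_trajectory(v) for k, v in support.items()}
    q_traj = video_trajectory(query)
    print("predicted class:", min(prototypes, key=lambda k: trajectory_distance(q_traj, prototypes[k])))

In this reading, each trajectory point captures a short-term sub-action, while the ordered sequence of points carries the longer-term connectivity between action parts mentioned in the abstract.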
Related papers
- A Comprehensive Review of Few-shot Action Recognition [64.47305887411275]
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data.
It requires accurately classifying human actions in videos using only a few labeled examples per class.
arXiv Detail & Related papers (2024-07-20T03:53:32Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only a few samples given for each class.
Although progress has been made on coarse-grained actions, existing few-shot recognition methods encounter two issues when handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- Few-Shot Action Localization without Knowing Boundaries [9.959844922120523]
We show that it is possible to learn to localize actions in untrimmed videos when only one/few trimmed examples of the target action are available at test time.
We propose a network that learns to estimate Temporal Similarity Matrices (TSMs) that model a fine-grained similarity pattern between pairs of videos.
Our method achieves performance comparable to or better than state-of-the-art fully-supervised few-shot learning methods.
arXiv Detail & Related papers (2021-06-08T07:32:43Z)
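
To make the Temporal Similarity Matrix idea in the entry above concrete, here is a minimal sketch under stated assumptions: frame-level features are taken as given and cosine similarity is used as the pairwise measure; neither choice is confirmed by the summary.

    # Hedged sketch: a Temporal Similarity Matrix between two videos, taken here as the
    # cosine similarity between every pair of their per-frame feature vectors.
    import numpy as np

    def temporal_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        """a: (Ta, D) features of video A, b: (Tb, D) features of video B.
        Returns a (Ta, Tb) matrix of cosine similarities."""
        a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
        b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
        return a_n @ b_n.T

    rng = np.random.default_rng(0)
    tsm = temporal_similarity_matrix(rng.normal(size=(30, 128)), rng.normal(size=(45, 128)))
    print(tsm.shape)  # (30, 45); strong diagonal bands indicate where the two videos align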
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
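
The contrast between the conventional query-centered objective and the prototype-centered loss in the PAL entry above can be sketched briefly. The summary does not give the exact loss, so the negative-distance logits and the reading of "prototype-centered" as a reversed softmax direction are illustrative assumptions.

    # Hedged sketch: query-centered vs. prototype-centered softmax losses over negative
    # Euclidean distances. queries: (Q, D), prototypes: (C, D), labels: (Q,) in [0, C).
    import numpy as np

    def log_softmax(x: np.ndarray, axis: int) -> np.ndarray:
        x = x - x.max(axis=axis, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

    def pairwise_dist(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        # Euclidean distances between rows of a (N, D) and rows of b (M, D) -> (N, M).
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

    def query_centered_loss(queries, prototypes, labels):
        # Conventional direction: each query is classified against the class prototypes.
        log_p = log_softmax(-pairwise_dist(queries, prototypes), axis=1)   # (Q, C)
        return -log_p[np.arange(len(queries)), labels].mean()

    def prototype_centered_loss(queries, prototypes, labels):
        # Reverse direction: each prototype is matched against the pool of queries;
        # the queries of its own class act as the positives.
        log_p = log_softmax(-pairwise_dist(prototypes, queries), axis=1)   # (C, Q)
        return -log_p[labels, np.arange(len(queries))].mean()

    rng = np.random.default_rng(0)
    queries, prototypes = rng.normal(size=(10, 64)), rng.normal(size=(5, 64))
    labels = rng.integers(0, 5, size=10)
    print(query_centered_loss(queries, prototypes, labels)
          + prototype_centered_loss(queries, prototypes, labels))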
- Few-Shot Image Classification via Contrastive Self-Supervised Learning [5.878021051195956]
We propose a new paradigm of unsupervised few-shot learning to address these deficiencies.
We solve the few-shot task in two phases, the first being meta-training a transferable feature extractor via contrastive self-supervised learning.
Our method achieves state-of-the-art performance on a variety of established few-shot tasks on standard few-shot visual classification datasets.
arXiv Detail & Related papers (2020-08-23T02:24:31Z)
- Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation [132.82884193921535]
We argue that previous methods underestimate the importance of video feature learning and propose a two-stage approach.
We show that this simple baseline approach outperforms prior few-shot video classification methods by over 20 points on existing benchmarks.
We present two novel approaches that yield further improvement.
arXiv Detail & Related papers (2020-07-09T13:05:32Z)
- Revisiting Few-shot Activity Detection with Class Similarity Control [107.79338380065286]
We present a framework for few-shot temporal activity detection based on proposal regression.
Our model is end-to-end trainable, takes into account the frame rate differences between few-shot activities and untrimmed test videos, and can benefit from additional few-shot examples.
arXiv Detail & Related papers (2020-03-31T22:02:38Z)
- Any-Shot Object Detection [81.88153407655334]
'Any-shot detection' is the setting in which totally unseen and few-shot categories can co-occur during inference.
We propose a unified any-shot detection model, that can concurrently learn to detect both zero-shot and few-shot object classes.
Our framework can also be used solely for Zero-shot detection and Few-shot detection tasks.
arXiv Detail & Related papers (2020-03-16T03:43:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.