A New Split for Evaluating True Zero-Shot Action Recognition
- URL: http://arxiv.org/abs/2107.13029v1
- Date: Tue, 27 Jul 2021 18:22:39 GMT
- Title: A New Split for Evaluating True Zero-Shot Action Recognition
- Authors: Shreyank N Gowda, Laura Sevilla-Lara, Kiyoon Kim, Frank Keller, and
Marcus Rohrbach
- Abstract summary: We propose a new split for true zero-shot action recognition with no overlap between unseen test classes and training or pre-training classes.
We benchmark several recent approaches on the proposed True Zero-Shot (TruZe) Split for UCF101 and HMDB51.
- Score: 45.815342448662946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot action recognition is the task of classifying action categories
that are not available in the training set. In this setting, the standard
evaluation protocol is to use existing action recognition datasets (e.g.
UCF101) and randomly split the classes into seen and unseen. However, most
recent work builds on representations pre-trained on the Kinetics dataset,
where classes largely overlap with classes in the zero-shot evaluation
datasets. As a result, classes which are supposed to be unseen, are present
during supervised pre-training, invalidating the condition of the zero-shot
setting. A similar concern was previously noted several years ago for
image-based zero-shot recognition, but it has not been considered by the zero-shot
action recognition community. In this paper, we propose a new split for true
zero-shot action recognition with no overlap between unseen test classes and
training or pre-training classes. We benchmark several recent approaches on the
proposed True Zero-Shot (TruZe) Split for UCF101 and HMDB51, with zero-shot and
generalized zero-shot evaluation. In our extensive analysis we find that our
TruZe splits are significantly harder than comparable random splits, as no class
information leaks from pre-training: unseen performance is consistently lower, by up
to 9.4% for zero-shot action recognition. In an additional evaluation we also
find that similar issues exist in the splits used in few-shot action
recognition, where we see differences of up to 14.1%. We publish our splits and
hope that our benchmark analysis will change how the field evaluates zero-
and few-shot action recognition going forward.
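
The split construction described above reduces to an overlap check between the evaluation dataset's class names and the pre-training vocabulary: classes that also occur in pre-training may only serve as seen classes, and the unseen classes are drawn from the remainder. Below is a minimal sketch of this idea; the function names, the normalization, and the substring heuristic are illustrative assumptions, not the paper's actual matching procedure, which also screens for semantically and visually similar classes.

```python
import re

def normalize(name: str) -> str:
    """Map e.g. 'ApplyEyeMakeup' and 'apply eye makeup' to 'applyeyemakeup'."""
    return re.sub(r"[\s_]+", "", name).lower()

def true_zero_shot_split(dataset_classes, pretrain_classes):
    """Partition classes so that unseen classes never overlap with pre-training."""
    pretrain = {normalize(c) for c in pretrain_classes}

    def overlaps(cls: str) -> bool:
        n = normalize(cls)
        # Flag exact matches and crude substring matches against pre-training.
        return n in pretrain or any(n in p or p in n for p in pretrain)

    seen = [c for c in dataset_classes if overlaps(c)]        # safe for training
    unseen = [c for c in dataset_classes if not overlaps(c)]  # truly unseen
    return seen, unseen

# Hypothetical usage with UCF101 and Kinetics class lists:
# seen, unseen = true_zero_shot_split(ucf101_classes, kinetics400_classes)
```

A purely lexical check like this misses paraphrased class names (e.g. 'biking' vs. 'riding a bike'), which is why a manual semantic pass over near-matches is still needed.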
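The two evaluation protocols mentioned above differ in what the classifier faces at test time: zero-shot accuracy is computed over unseen classes only, while generalized zero-shot evaluation scores seen and unseen classes jointly and is conventionally summarized by the harmonic mean of the two accuracies. A minimal sketch of that standard summary metric (not code from the paper):

```python
def gzsl_harmonic_mean(acc_seen: float, acc_unseen: float) -> float:
    """Standard GZSL summary: harmonic mean of seen- and unseen-class accuracy.

    The harmonic mean stays low unless *both* accuracies are high, so a model
    cannot score well by simply ignoring the unseen classes.
    """
    if acc_seen + acc_unseen == 0.0:
        return 0.0
    return 2.0 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

# Example: strong seen-class accuracy cannot mask weak unseen-class accuracy.
print(gzsl_harmonic_mean(0.80, 0.20))  # -> 0.32
```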
Related papers
- CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification [11.225067563482169]
We provide a comprehensive document image classification analysis in Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) settings.
We introduce CICA (pronounced 'ki-ka'), a framework that enhances the zero-shot learning capabilities of CLIP.
Our module improves CLIP's ZSL top-1 accuracy by 6.7% and GZSL harmonic mean by 24% on the RVL-CDIP dataset.
arXiv Detail & Related papers (2024-05-06T17:37:23Z)
- Few-shot Open-set Recognition Using Background as Unknowns [58.04165813493666]
Few-shot open-set recognition aims to classify both seen and novel images given only limited training data of seen classes.
Our proposed method not only outperforms multiple baselines but also sets new results on three popular benchmarks.
arXiv Detail & Related papers (2022-07-19T04:19:29Z)
- Zero-Shot Action Recognition with Transformer-based Video Semantic Embedding [36.24563211765782]
We take a new comprehensive look at the inductive zero-shot action recognition problem from a realistic standpoint.
Specifically, we advocate for a concrete formulation for zero-shot action recognition that avoids an exact overlap between the training and testing classes.
We propose a novel end-to-end trained transformer model that efficiently captures long-range temporal dependencies.
arXiv Detail & Related papers (2022-03-10T05:03:58Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state of the art on all standard datasets.
arXiv Detail & Related papers (2021-01-18T12:46:24Z)
- Unifying Few- and Zero-Shot Egocentric Action Recognition [3.1368611610608848]
We propose a new set of splits derived from the EPIC-KITCHENS dataset that allow evaluation of open-set classification.
We show that adding a metric-learning loss to the conventional direct-alignment baseline can improve zero-shot classification by as much as 10%.
arXiv Detail & Related papers (2020-05-27T02:23:38Z)
- Revisiting Few-shot Activity Detection with Class Similarity Control [107.79338380065286]
We present a framework for few-shot temporal activity detection based on proposal regression.
Our model is end-to-end trainable, takes into account the frame rate differences between few-shot activities and untrimmed test videos, and can benefit from additional few-shot examples.
arXiv Detail & Related papers (2020-03-31T22:02:38Z)
- Any-Shot Object Detection [81.88153407655334]
'Any-shot detection' is the setting where totally unseen and few-shot categories can co-occur during inference.
We propose a unified any-shot detection model, that can concurrently learn to detect both zero-shot and few-shot object classes.
Our framework can also be used solely for zero-shot detection or few-shot detection tasks.
arXiv Detail & Related papers (2020-03-16T03:43:15Z)