Revisiting Few-shot Activity Detection with Class Similarity Control
- URL: http://arxiv.org/abs/2004.00137v1
- Date: Tue, 31 Mar 2020 22:02:38 GMT
- Title: Revisiting Few-shot Activity Detection with Class Similarity Control
- Authors: Huijuan Xu, Ximeng Sun, Eric Tzeng, Abir Das, Kate Saenko, Trevor
Darrell
- Abstract summary: We present a framework for few-shot temporal activity detection based on proposal regression.
Our model is end-to-end trainable, takes into account the frame rate differences between few-shot activities and untrimmed test videos, and can benefit from additional few-shot examples.
- Score: 107.79338380065286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many interesting events in the real world are rare, which makes
pre-annotated, machine-learning-ready videos correspondingly scarce. Temporal
activity detection models that can learn from only a few examples are therefore
desirable. In this paper, we present a conceptually simple, general, yet novel
framework for few-shot temporal activity detection based on proposal
regression, which detects the start and end times of activities in untrimmed
videos. Our model is end-to-end trainable, takes into account the frame rate
differences between few-shot activities and untrimmed test videos, and can
benefit from additional few-shot examples. We experiment on three large-scale
benchmarks for temporal activity detection (the ActivityNet 1.2,
ActivityNet 1.3, and THUMOS14 datasets) in a few-shot setting. We also study
how performance is affected by the amount of overlap between the few-shot
activities and the activities used to pretrain the video classification
backbone, and we propose corrective measures for future work in this domain.
Our code will be made available.
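The abstract names proposal regression as the core mechanism. As a rough illustration of how a regressor can refine coarse temporal proposals into start/end times, and how frame rates might be aligned, here is a minimal PyTorch sketch; the layer sizes, the offset parameterization, and the resampling helper are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ProposalRegressor(nn.Module):
    """Refines coarse temporal proposals by regressing (center, length) offsets.

    A minimal sketch of proposal regression; dimensions are illustrative.
    """
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 2),  # predicts (d_center, d_length)
        )

    def forward(self, proposal_feats, proposals):
        # proposals: (N, 2) tensor of (start, end) times in seconds
        offsets = self.head(proposal_feats)                # (N, 2)
        center = (proposals[:, 0] + proposals[:, 1]) / 2
        length = proposals[:, 1] - proposals[:, 0]
        new_center = center + offsets[:, 0] * length       # shift scaled by length
        new_length = length * torch.exp(offsets[:, 1])     # log-scale length change
        start = new_center - new_length / 2
        end = new_center + new_length / 2
        return torch.stack([start, end], dim=1)            # refined (N, 2) segments

def resample_fps(frames: torch.Tensor, src_fps: float, dst_fps: float) -> torch.Tensor:
    """Nearest-neighbour resampling so few-shot clips and test videos share a frame rate."""
    t = frames.shape[0]
    idx = torch.arange(int(t * dst_fps / src_fps)) * (src_fps / dst_fps)
    return frames[idx.long().clamp(max=t - 1)]
```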
Related papers
- Query by Activity Video in the Wild [52.42177539947216]
In current query-by-activity-video literature, a common assumption is that all activities have sufficient labelled examples when learning an embedding.
We propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval.
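The summary does not specify the loss, but a common way to realize such an embedding is to project video and activity-label features into a shared space, rank by cosine similarity, and reweight rare classes; the sketch below is one assumption-laden realization (the inverse-frequency weighting in particular is a guess at how the imbalance is handled).

```python
import torch.nn as nn
import torch.nn.functional as F

class VisualSemanticEmbedding(nn.Module):
    """Projects video and activity-name features into a shared retrieval space.

    A hedged sketch: the dimensions and the class reweighting below are
    plausible choices, not necessarily the paper's exact mechanism.
    """
    def __init__(self, vid_dim=2048, txt_dim=300, emb_dim=256):
        super().__init__()
        self.vid_proj = nn.Linear(vid_dim, emb_dim)
        self.txt_proj = nn.Linear(txt_dim, emb_dim)

    def forward(self, vid_feats, txt_feats):
        v = F.normalize(self.vid_proj(vid_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v @ t.T  # cosine similarity matrix (videos x activities)

def balanced_retrieval_loss(sim, labels, class_counts):
    # Weight each sample inversely to its class frequency to counter imbalance.
    weights = 1.0 / class_counts[labels].float()
    return F.cross_entropy(sim, labels, reduction='none').mul(weights).mean()
```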
arXiv Detail & Related papers (2023-11-23T10:26:36Z)
- Self-supervised Pretraining with Classification Labels for Temporal Activity Detection [54.366236719520565]
Temporal Activity Detection aims to predict activity classes per frame.
Due to the expensive frame-level annotations required for detection, the scale of detection datasets is limited.
This work proposes a novel self-supervised pretraining method for detection leveraging classification labels.
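The mechanism is only named here, but one plausible reading is to pretrain the frame-level detector by broadcasting each video's classification label to all of its frames; the sketch below encodes that assumption.

```python
import torch
import torch.nn as nn

def pretrain_step(frame_encoder, classifier, clip, video_label, optimizer):
    """One pretraining step: supervise every frame with the video-level label.

    A hedged sketch of 'pretraining detection with classification labels';
    broadcasting the label to all frames is an assumption, not the paper's
    exact recipe.
    """
    feats = frame_encoder(clip)                      # (T, D) per-frame features
    logits = classifier(feats)                       # (T, num_classes)
    targets = torch.full((logits.shape[0],), video_label, dtype=torch.long)
    loss = nn.functional.cross_entropy(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```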
arXiv Detail & Related papers (2021-11-26T18:59:28Z)
- Few-Shot Action Localization without Knowing Boundaries [9.959844922120523]
We show that it is possible to learn to localize actions in untrimmed videos when only one/few trimmed examples of the target action are available at test time.
We propose a network that learns to estimate Temporal Similarity Matrices (TSMs) that model a fine-grained similarity pattern between pairs of videos.
Our method achieves performance comparable to, or better than, state-of-the-art fully-supervised few-shot learning methods.
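A Temporal Similarity Matrix itself is simple to compute: given per-snippet features of a query and a support video, each entry is the similarity of one snippet pair. A minimal sketch, assuming cosine similarity as the metric:

```python
import torch.nn.functional as F

def temporal_similarity_matrix(query_feats, support_feats):
    """Computes a TSM between two videos.

    query_feats:   (Tq, D) per-snippet features of the untrimmed query video
    support_feats: (Ts, D) per-snippet features of a trimmed support example
    Returns a (Tq, Ts) matrix; high-similarity diagonal bands indicate where
    the support action likely occurs in the query (a hedged reading of the
    paper's idea).
    """
    q = F.normalize(query_feats, dim=-1)
    s = F.normalize(support_feats, dim=-1)
    return q @ s.T
```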
arXiv Detail & Related papers (2021-06-08T07:32:43Z)
- Learning to Localize Actions from Moments [153.54638582696128]
We introduce a new transfer learning design for learning action localization across a large set of action categories.
We present Action Herald Networks (AherNet), which integrate this design into a one-stage action localization framework.
arXiv Detail & Related papers (2020-08-31T16:03:47Z)
- Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos [72.50607929306058]
We propose a real-time online system to perform activity detection on untrimmed security videos.
The proposed method consists of three stages: tubelet extraction, activity classification and online tubelet merging.
We demonstrate the effectiveness of the proposed approach both in speed (100 fps) and in detection performance, achieving state-of-the-art results.
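Of the three stages, online tubelet merging is the easiest to picture: detections from successive video chunks are greedily linked by overlap. The sketch below simplifies tubelets to temporal spans (real tubelets also carry spatial boxes) and is not the paper's exact algorithm.

```python
def temporal_iou(a, b):
    """IoU of two temporal segments given as (start, end) pairs."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def merge_tubelets(active, incoming, iou_thresh=0.5):
    """Greedily extends active tubelets with overlapping incoming detections.

    Illustrative sketch of online merging; the threshold and greedy policy
    are assumptions, not the paper's exact procedure.
    """
    for new in incoming:
        best, best_iou = None, iou_thresh
        for tube in active:
            if tube['label'] == new['label']:
                iou = temporal_iou(tube['span'], new['span'])
                if iou >= best_iou:
                    best, best_iou = tube, iou
        if best is not None:
            best['span'] = (best['span'][0], new['span'][1])  # extend in time
        else:
            active.append(new)  # start a new tubelet
    return active
```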
arXiv Detail & Related papers (2020-04-23T22:20:10Z)
- TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition [10.07962673311661]
We present the Temporal Aware Embedding Network (TAEN) for few-shot action recognition.
We demonstrate the effectiveness of TAEN on two few-shot tasks: video classification and temporal action detection.
By training just a few fully connected layers, we reach results comparable to prior art on both few-shot video classification and temporal detection tasks.
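The claim that only a few fully connected layers are trained suggests a frozen backbone with a small embedding head and prototype-based matching; a hedged sketch of that setup (the dimensions and the nearest-prototype rule are assumptions):

```python
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingHead(nn.Module):
    """Small trainable head over frozen backbone features.

    Sketch of the 'train only a few fully connected layers' setup the
    abstract describes; layer sizes are illustrative assumptions.
    """
    def __init__(self, feat_dim=2048, emb_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, emb_dim),
        )

    def forward(self, frozen_feats):
        return self.mlp(frozen_feats)

def nearest_prototype(query_emb, prototypes):
    """Assigns each query to the closest class prototype (mean support embedding)."""
    sims = F.normalize(query_emb, dim=-1) @ F.normalize(prototypes, dim=-1).T
    return sims.argmax(dim=-1)
```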
arXiv Detail & Related papers (2020-04-21T16:32:10Z)
- ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in detecting unseen activities.
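Detecting never-seen activities generally means scoring segments against class descriptions rather than learned class weights. The sketch below assumes word-embedding prototypes on top of R-C3D-style segment features; the projection-based matching is an assumption about the design, not a detail from the abstract.

```python
import torch.nn.functional as F

def zero_shot_scores(segment_feats, class_word_embs, proj):
    """Scores temporal segments against unseen classes via semantic embeddings.

    segment_feats:   (N, D) features of detected segments (e.g. from R-C3D)
    class_word_embs: (C, E) label embeddings of unseen activity names
    proj:            an nn.Linear(D, E) mapping visual features to semantic space
    Returns (N, C) cosine scores; argmax gives the predicted unseen class.
    A hedged sketch; the projection-based matching is an assumption.
    """
    v = F.normalize(proj(segment_feats), dim=-1)
    c = F.normalize(class_word_embs, dim=-1)
    return v @ c.T
```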
arXiv Detail & Related papers (2020-03-12T02:40:36Z)