Related papers: A Comprehensive Review of Few-shot Action Recognition

URL: http://arxiv.org/abs/2407.14744v1
Date: Sat, 20 Jul 2024 03:53:32 GMT
Title: A Comprehensive Review of Few-shot Action Recognition
Authors: Yuyang Wanyan, Xiaoshan Yang, Weiming Dong, Changsheng Xu
Abstract summary: Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data. It requires accurately classifying human actions in videos using only a few labeled examples per class.
Score: 64.47305887411275
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data in action recognition. It requires accurately classifying human actions in videos using only a few labeled examples per class. Compared to few-shot learning in image scenarios, few-shot action recognition is more challenging due to the intrinsic complexity of video data. Recognizing actions involves modeling intricate temporal sequences and extracting rich semantic information, which goes beyond mere human and object identification in each frame. Furthermore, the issue of intra-class variance becomes particularly pronounced with limited video samples, complicating the learning of representative features for novel action categories. To overcome these challenges, numerous approaches have driven significant advancements in few-shot action recognition, which underscores the need for a comprehensive survey. Unlike early surveys that focus on few-shot image or text classification, we deeply consider the unique challenges of few-shot action recognition. In this survey, we review a wide variety of recent methods and summarize the general framework. Additionally, the survey presents the commonly used benchmarks and discusses relevant advanced topics and promising future directions. We hope this survey can serve as a valuable resource for researchers, offering essential guidance to newcomers and stimulating seasoned researchers with fresh insights.

Related papers

About Time: Advances, Challenges, and Outlooks of Action Understanding [57.76390141287026]
This survey comprehensively reviews advances in uni- and multi-modal action understanding across a range of tasks. We focus on prevalent challenges, overview widely adopted datasets, and survey seminal works with an emphasis on recent advances.
arXiv Detail & Related papers (2024-11-22T18:09:27Z)
Classification Matters: Improving Video Action Detection with Class-Specific Attention [61.14469113965433]
Video action detection (VAD) aims to detect actors and classify their actions in a video. We analyze how prevailing methods form features for classification and find that they prioritize actor regions. We propose to reduce the bias toward the actor and encourage attention to the context that is relevant to each action class.
arXiv Detail & Related papers (2024-07-29T04:43:58Z)
A Review of Machine Learning Methods Applied to Video Analysis Systems [3.518774226658318]
The paper provides a survey of the development of machine-learning techniques for video analysis. We provide summaries of the development of self-supervised learning, semi-supervised learning, active learning, and zero-shot learning for applications in video analysis.
arXiv Detail & Related papers (2023-12-08T20:24:03Z)
ActAR: Actor-Driven Pose Embeddings for Video Action Recognition [12.043574473965318]
Human action recognition (HAR) in videos is one of the core tasks of video understanding. We propose a new method that simultaneously learns to efficiently recognize human actions in the infrared spectrum.
arXiv Detail & Related papers (2022-04-19T05:12:24Z)
Few-Shot Object Detection: A Survey [4.266990593059534]
Few-shot object detection aims to learn from few object instances of new categories in the target domain. We categorize approaches according to their training scheme and architectural layout. We introduce commonly used datasets and their evaluation protocols and analyze reported benchmark results.
arXiv Detail & Related papers (2021-12-22T07:08:53Z)
Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand for specific action understanding in real-world applications. We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only a few samples given for each class. Although progress has been made on coarse-grained actions, existing few-shot recognition methods encounter two issues when handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications. Deep learning based approaches have been dedicated to video segmentation and delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
A Comprehensive Study of Deep Video Action Recognition [35.7068977497202]
Video action recognition is one of the representative tasks for video understanding. We provide a comprehensive survey of over 200 existing papers on deep learning for video action recognition.
arXiv Detail & Related papers (2020-12-11T18:54:08Z)
Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is the task of identifying various human actions in a video. Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition [86.31412529187243]
Few-shot video recognition aims at learning new actions with only very few labeled samples. We propose a depth-guided Adaptive Meta-Fusion Network for few-shot video recognition, termed AMeFu-Net.
arXiv Detail & Related papers (2020-10-20T03:06:20Z)
A Grid-based Representation for Human Action Recognition [12.043574473965318]
Human action recognition (HAR) in videos is a fundamental research topic in computer vision. We propose a novel method for action recognition that efficiently encodes the most discriminative appearance information of an action. Our method is tested on several benchmark datasets, demonstrating that our model can accurately recognize human actions.
arXiv Detail & Related papers (2020-10-17T18:25:00Z)
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation [132.82884193921535]
We argue that previous methods underestimate the importance of video feature learning and propose a two-stage approach. We show that this simple baseline approach outperforms prior few-shot video classification methods by over 20 points on existing benchmarks. We present two novel approaches that yield further improvement.
arXiv Detail & Related papers (2020-07-09T13:05:32Z)
Intra- and Inter-Action Understanding via Temporal Action Parsing [118.32912239230272]
We construct a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top of it. Our study shows that a sport activity usually consists of multiple sub-actions and that awareness of such temporal structures is beneficial to action recognition. We also investigate a number of temporal parsing methods, and from these devise an improved method that is capable of mining sub-actions from training data without knowing their labels.
arXiv Detail & Related papers (2020-05-20T17:45:18Z)
TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition [10.07962673311661]
We present the Temporal Aware Embedding Network (TAEN) for few-shot action recognition. We demonstrate the effectiveness of TAEN on two few-shot tasks, video classification and temporal action detection. By training just a few fully connected layers, we reach results comparable to prior art on both few-shot video classification and temporal detection tasks.
arXiv Detail & Related papers (2020-04-21T16:32:10Z)
Delving into 3D Action Anticipation from Streaming Videos [99.0155538452263]
Action anticipation aims to recognize the action with a partial observation. We introduce several complementary evaluation metrics and present a basic model based on frame-wise action classification. We also explore multi-task learning strategies by incorporating auxiliary information from two aspects: the full action representation and the class-agnostic action label.
arXiv Detail & Related papers (2019-06-15T10:30:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.