Delving into 3D Action Anticipation from Streaming Videos
- URL: http://arxiv.org/abs/1906.06521v2
- Date: Thu, 15 Jun 2023 00:09:45 GMT
- Title: Delving into 3D Action Anticipation from Streaming Videos
- Authors: Hongsong Wang and Jiashi Feng
- Abstract summary: Action anticipation aims to recognize an action from a partial observation.
We introduce several complementary evaluation metrics and present a basic model based on frame-wise action classification.
We also explore multi-task learning strategies by incorporating auxiliary information from two aspects: the full action representation and the class-agnostic action label.
- Score: 99.0155538452263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Action anticipation, which aims to recognize an action from a partial
observation, has become increasingly popular due to its wide range of applications.
In this paper, we investigate the problem of 3D action anticipation from
streaming videos with the target of understanding best practices for solving
this problem. We first introduce several complementary evaluation metrics and
present a basic model based on frame-wise action classification. To achieve
better performance, we then investigate two important factors: the training clip
length and the clip sampling method. We also explore multi-task
learning strategies by incorporating auxiliary information from two aspects:
the full action representation and the class-agnostic action label. Our
comprehensive experiments uncover the best practices for 3D action
anticipation, and accordingly we propose a novel method with a multi-task loss.
The proposed method considerably outperforms recent methods and achieves
state-of-the-art performance on standard benchmarks.
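The abstract describes a frame-wise classification baseline extended with a multi-task loss over two auxiliary signals: a representation of the full action and a class-agnostic action label. Below is a minimal PyTorch sketch of how such a loss could be assembled; the head names, feature dimensions, and loss weights are hypothetical placeholders, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnticipationHead(nn.Module):
    """Frame-wise action classifier with two auxiliary heads (names hypothetical):
    (1) regression toward a representation of the full action and
    (2) a class-agnostic action-vs-background label."""
    def __init__(self, feat_dim=256, num_classes=60):
        super().__init__()
        self.action_head = nn.Linear(feat_dim, num_classes)  # per-frame action logits
        self.full_repr_head = nn.Linear(feat_dim, feat_dim)   # predicted full-action feature
        self.agnostic_head = nn.Linear(feat_dim, 2)           # action vs. no-action logits

    def forward(self, frame_feats):
        # frame_feats: (batch, time, feat_dim) features from a skeleton/3D backbone
        return (self.action_head(frame_feats),
                self.full_repr_head(frame_feats),
                self.agnostic_head(frame_feats))

def multi_task_loss(action_logits, full_pred, agn_logits,
                    action_labels, full_target, agn_labels,
                    w_full=0.5, w_agn=0.5):
    # Weighted sum of the frame-wise classification loss and two auxiliary losses;
    # the weights here are placeholders, not values reported in the paper.
    cls_loss = F.cross_entropy(action_logits.flatten(0, 1), action_labels.flatten())
    full_loss = F.mse_loss(full_pred, full_target)
    agn_loss = F.cross_entropy(agn_logits.flatten(0, 1), agn_labels.flatten())
    return cls_loss + w_full * full_loss + w_agn * agn_loss

# Example with random data: 2 streaming sequences of 20 frames each.
head = AnticipationHead()
feats = torch.randn(2, 20, 256)
logits, full_pred, agn_logits = head(feats)
loss = multi_task_loss(logits, full_pred, agn_logits,
                       torch.randint(0, 60, (2, 20)),  # per-frame action labels
                       torch.randn(2, 20, 256),         # target full-action features
                       torch.randint(0, 2, (2, 20)))    # class-agnostic labels
```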
Related papers
- Semi-supervised Active Learning for Video Action Detection [8.110693267550346]
We develop a novel semi-supervised active learning approach that utilizes both labeled and unlabeled data.
We evaluate the proposed approach on three different benchmark datasets: UCF-101-24, JHMDB-21, and Youtube-VOS.
arXiv Detail & Related papers (2023-12-12T11:13:17Z) - Early Action Recognition with Action Prototypes [62.826125870298306]
We propose a novel model that learns a prototypical representation of the full action for each class.
We decompose the video into short clips, where a visual encoder extracts features from each clip independently.
Later, a decoder aggregates features from all the clips in an online fashion for the final class prediction (a minimal sketch of such online aggregation follows the list below).
arXiv Detail & Related papers (2023-12-11T18:31:13Z) - A baseline on continual learning methods for video action recognition [15.157938674002793]
Continual learning aims to address long-standing limitations of classic models trained with full supervision.
We present a benchmark of state-of-the-art continual learning methods on video action recognition.
arXiv Detail & Related papers (2023-04-20T14:20:43Z) - Weakly Supervised Two-Stage Training Scheme for Deep Video Fight
Detection Model [0.0]
Fight detection in videos is an emerging deep learning application, driven by today's prevalence of surveillance systems and streaming media.
Previous work has largely relied on action recognition techniques to tackle this problem.
We design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator.
arXiv Detail & Related papers (2022-09-23T08:29:16Z) - SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video
Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules.
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z) - A Training Method For VideoPose3D With Ideology of Action Recognition [0.9949781365631559]
This research shows a faster and more flexible training method for VideoPose3D based on action recognition.
It can handle both action-oriented and common pose-estimation problems.
arXiv Detail & Related papers (2022-06-13T19:25:27Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is the task of identifying various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z) - On Evaluating Weakly Supervised Action Segmentation Methods [79.42955857919497]
We focus on two aspects of the use and evaluation of weakly supervised action segmentation approaches.
We train each method on the Breakfast dataset 5 times and provide average and standard deviation of the results.
Our experiments show that the standard deviation over these repetitions is between 1% and 2.5%, which significantly affects the comparison between different approaches.
arXiv Detail & Related papers (2020-05-19T20:30:31Z)