Spatio-Temporal Human Action Recognition Model with Flexible-interval
Sampling and Normalization
- URL: http://arxiv.org/abs/2108.05633v1
- Date: Thu, 12 Aug 2021 10:02:20 GMT
- Title: Spatio-Temporal Human Action Recognition Model with Flexible-interval
Sampling and Normalization
- Authors: Yuke Yang
- Abstract summary: We propose a human action recognition system for Red-Green-Blue (RGB) video input with our own designed modules.
We build a novel dataset with a similar background and discriminative actions for both human keypoint prediction and behavior recognition.
Experimental results demonstrate the effectiveness of the proposed model on our own human behavior recognition dataset and some public datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human action recognition is a well-known computer vision and pattern
recognition task of identifying which action a person is performing.
Extracting the keypoint information of a single human, together with the
spatial and temporal features of action sequences, plays an essential role in
accomplishing the task. In this paper, we propose a human action recognition
system for Red-Green-Blue (RGB) input video with our own designed modules.
Building on an efficient Gated Recurrent Unit (GRU) for spatio-temporal
feature extraction, we add a sampling module and a normalization module to
improve the model's recognition performance. Furthermore, we build a novel
dataset with similar backgrounds and discriminative actions for both human
keypoint prediction and behavior recognition, and we retrain the pose model on
this new dataset to improve keypoint prediction. Experimental results
demonstrate the effectiveness of the proposed model on our own human behavior
recognition dataset and on several public datasets.
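The abstract names two added modules: a flexible-interval sampler that draws a fixed number of frames from clips of varying length, and a normalization step applied to the extracted keypoints before the GRU. The paper's abstract gives no implementation details, so the following is a minimal sketch of one plausible reading; the function names, the padding strategy for short clips, and the hip-centred, torso-scaled normalization scheme are all assumptions, not the authors' code:

```python
def sample_flexible_interval(frames, num_samples):
    """Draw num_samples frames at a stride adapted to the clip length,
    so long and short clips yield sequences of the same size."""
    if len(frames) < num_samples:
        # Assumed handling: pad short clips by repeating the last frame.
        frames = frames + [frames[-1]] * (num_samples - len(frames))
    interval = len(frames) / num_samples  # flexible, data-dependent stride
    return [frames[int(i * interval)] for i in range(num_samples)]


def normalize_keypoints(kps, center_idx=0, scale_idx=1):
    """Normalize one frame's 2D keypoints: translate so joint center_idx
    (e.g. the hip) is the origin, then scale so the distance to joint
    scale_idx (e.g. the neck) is 1. The choice of reference joints is an
    assumption for illustration."""
    cx, cy = kps[center_idx]
    sx, sy = kps[scale_idx]
    scale = ((sx - cx) ** 2 + (sy - cy) ** 2) ** 0.5 or 1.0  # avoid /0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in kps]
```

The normalized keypoint sequences produced this way would then be fed, frame by frame, into the GRU-based classifier described in the abstract.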
Related papers
- Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network
We propose a novel Pyramid Graph Convolutional Network (PGCN) to automatically recognize human-object interaction.
The system represents the 2D or 3D spatial relations between humans and objects, taken from detection results in video data, as a graph.
We evaluate our model on two challenging datasets in the field of human-object interaction recognition.
(arXiv, 2024-10-10)
- Texture-Based Input Feature Selection for Action Recognition
We propose a novel method to identify task-irrelevant content in the inputs that increases domain discrepancy.
We show that our proposed model is superior to existing models for action recognition on the HMDB-51 dataset and the Penn Action dataset.
(arXiv, 2023-02-28)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition
Action labels are available only on the source dataset and unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
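The cross-dataset paper above describes pretext tasks built by segmenting and permuting temporal segments or body parts. A toy version of the temporal variant might look like the sketch below; the segment count, the equal-length split (leftover frames dropped), and the function name are assumptions for illustration, not the paper's implementation:

```python
import itertools
import random


def make_permutation_task(sequence, num_segments=3, seed=None):
    """Self-supervised pretext task: cut a frame sequence into equal
    segments, shuffle them, and return (shuffled_sequence, class_id),
    where class_id indexes the applied permutation. A classifier trained
    to recover class_id must learn temporal structure without action
    labels. Leftover frames beyond an equal split are dropped."""
    rng = random.Random(seed)
    seg_len = len(sequence) // num_segments
    segments = [sequence[i * seg_len:(i + 1) * seg_len]
                for i in range(num_segments)]
    perms = list(itertools.permutations(range(num_segments)))
    label = rng.randrange(len(perms))  # which permutation was applied
    shuffled = [frame for idx in perms[label] for frame in segments[idx]]
    return shuffled, label
```

The body-part variant would follow the same pattern, permuting groups of skeleton joints within a frame instead of temporal segments.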
(arXiv, 2022-07-17)
- Revisiting spatio-temporal layouts for compositional action recognition
We take an object-centric approach to action recognition.
The main focus of this paper is compositional/few-shot action recognition.
We demonstrate how to improve the performance of appearance-based models by fusion with layout-based models.
(arXiv, 2021-11-02)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
(arXiv, 2021-10-28)
- STAR: Sparse Transformer-based Action Recognition
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, with high speed in both training and inference.
(arXiv, 2021-07-15)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
(arXiv, 2021-04-08)
- Pose And Joint-Aware Action Recognition
We present a new model for joint-based action recognition, which first extracts motion features from each joint separately through a shared motion encoder.
Our joint selector module re-weights the joint information to select the most discriminative joints for the task.
We show large improvements over the current state-of-the-art joint-based approaches on JHMDB, HMDB, Charades, AVA action recognition datasets.
(arXiv, 2020-10-16)
This list is automatically generated from the titles and abstracts of the papers in this site.