Memory Group Sampling Based Online Action Recognition Using Kinetic
Skeleton Features
- URL: http://arxiv.org/abs/2011.00553v2
- Date: Tue, 3 Nov 2020 05:09:10 GMT
- Title: Memory Group Sampling Based Online Action Recognition Using Kinetic
Skeleton Features
- Authors: Guoliang Liu, Qinghui Zhang, Yichao Cao, Junwei Li, Hao Wu and Guohui
Tian
- Abstract summary: We propose two core ideas to handle the online action recognition problem.
First, we combine the spatial and temporal skeleton features to depict the actions.
Second, we propose a memory group sampling method to combine the previous action frames and current action frames.
Finally, an improved 1D CNN network is employed for training and testing using the features from the sampled frames.
- Score: 4.674689979981502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online action recognition is an important task for human-centered intelligent
services, yet it remains difficult due to the variety and uncertainty of the
spatial and temporal scales of human actions. In this paper,
we propose two core ideas to handle the online action recognition problem.
First, we combine spatial and temporal skeleton features to describe the
actions; these include not only geometric features but also multi-scale
motion features, so that both the spatial and temporal information of the
action is covered. Second, we propose a memory group sampling method to
combine previous action frames with the current action frames, based on the
observation that neighbouring frames are largely redundant; the sampling
mechanism ensures that long-term contextual information is also considered.
Finally, an improved 1D CNN network is employed for training and testing using
the features from the sampled frames. Comparisons with state-of-the-art
methods on public datasets show that the proposed method is fast and
efficient, and achieves competitive performance.
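To make the two ideas above concrete, the following is a minimal Python sketch of how such a pipeline could look: per-frame geometric features plus multi-scale motion differences, a uniform-sampling stand-in for the memory group sampling of stored and current frames, and a small 1D CNN classifier. The joint count, temporal offsets, memory length, sampling strategy, and network layout are illustrative assumptions, not the authors' exact design.

```python
# Sketch of: spatial + multi-scale motion features, memory-based frame
# sampling, and a 1D CNN classifier. All concrete numbers are assumptions.
import numpy as np
import torch
import torch.nn as nn

NUM_JOINTS = 25  # assumed skeleton layout (e.g. Kinect-style)

def frame_features(frames: np.ndarray, offsets=(1, 2, 4)) -> np.ndarray:
    """Combine spatial (joint coordinates) and multi-scale motion features.

    frames: (T, NUM_JOINTS, 3) array of 3D joint positions.
    Returns (T, D) per-frame features: raw coordinates plus temporal
    differences at several scales (the offsets are illustrative).
    """
    T = frames.shape[0]
    flat = frames.reshape(T, -1)
    feats = [flat]
    for k in offsets:
        motion = np.zeros_like(flat)
        motion[k:] = flat[k:] - flat[:-k]  # k-step displacement
        feats.append(motion)
    return np.concatenate(feats, axis=1)

def memory_group_sampling(memory: np.ndarray, current: np.ndarray,
                          num_samples: int = 32) -> np.ndarray:
    """Sample a fixed-length sequence from stored past frames plus the
    current frames. Neighbouring frames are largely redundant, so sparse
    sampling over the concatenated sequence keeps long-term context while
    bounding the input length. (A plain uniform split is used here as a
    stand-in for the paper's grouping strategy.)
    """
    seq = np.concatenate([memory, current], axis=0)      # (T, D)
    idx = np.linspace(0, len(seq) - 1, num_samples).astype(int)
    return seq[idx]                                      # (num_samples, D)

class Simple1DCNN(nn.Module):
    """Generic stand-in for the paper's improved 1D CNN classifier."""
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):              # x: (B, num_samples, D)
        x = x.transpose(1, 2)          # -> (B, D, num_samples) for Conv1d
        return self.fc(self.net(x).squeeze(-1))

if __name__ == "__main__":
    past = frame_features(np.random.randn(120, NUM_JOINTS, 3))  # stored memory
    cur = frame_features(np.random.randn(30, NUM_JOINTS, 3))    # current window
    clip = memory_group_sampling(past, cur, num_samples=32)
    model = Simple1DCNN(in_dim=clip.shape[1], num_classes=60)
    logits = model(torch.from_numpy(clip).float().unsqueeze(0))
    print(logits.shape)                # torch.Size([1, 60])
```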
Related papers
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are available only on a source dataset and unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Efficient Global-Local Memory for Real-time Instrument Segmentation of
Robotic Surgical Video [53.14186293442669]
We identify two important clues for surgical instrument perception, including local temporal dependency from adjacent frames and global semantic correlation in long-range duration.
We propose a novel dual-memory network (DMNet) to relate both global and local temporal knowledge.
Our method largely outperforms the state-of-the-art works on segmentation accuracy while maintaining a real-time speed.
arXiv Detail & Related papers (2021-09-28T10:10:14Z) - Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based
Action Recognition [49.163326827954656]
We propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification.
We develop a dual-head graph network consisting of two interleaved branches, which enables us to extract features at two spatio-temporal resolutions.
We conduct extensive experiments on three large-scale datasets.
arXiv Detail & Related papers (2021-08-10T09:25:07Z) - Modeling long-term interactions to enhance action recognition [81.09859029964323]
We propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels.
We use a region-based approach that takes as input a primary region roughly corresponding to the user's hands and a set of secondary regions potentially corresponding to the interacting objects.
The proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks.
arXiv Detail & Related papers (2021-04-23T10:08:15Z) - Finding Action Tubes with a Sparse-to-Dense Framework [62.60742627484788]
We propose a framework that generates action tube proposals from video streams with a single forward pass in a sparse-to-dense manner.
We evaluate the efficacy of our model on the UCF101-24, JHMDB-21 and UCFSports benchmark datasets.
arXiv Detail & Related papers (2020-08-30T15:38:44Z) - Gesture Recognition from Skeleton Data for Intuitive Human-Machine
Interaction [0.6875312133832077]
We propose an approach for segmentation and classification of dynamic gestures based on a set of handcrafted features.
The method for gesture recognition applies a sliding window, which extracts information from both the spatial and temporal dimensions (a generic sketch of this windowing idea appears after this list).
At the end, the recognized gestures are used to interact with a collaborative robot.
arXiv Detail & Related papers (2020-08-26T11:28:50Z) - Complex Human Action Recognition in Live Videos Using Hybrid FR-DL
Method [1.027974860479791]
We address the challenges of the preprocessing phase through an automated selection of representative frames from the input sequences.
We propose a hybrid technique using background subtraction and HOG, followed by a deep neural network and a skeletal modelling method.
We name our model the Feature Reduction & Deep Learning based action recognition method, or FR-DL for short.
arXiv Detail & Related papers (2020-07-06T15:12:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.