Stronger, Faster and More Explainable: A Graph Convolutional Baseline
for Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2010.09978v1
- Date: Tue, 20 Oct 2020 02:56:58 GMT
- Title: Stronger, Faster and More Explainable: A Graph Convolutional Baseline
for Skeleton-based Action Recognition
- Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan and Liang Wang
- Abstract summary: We propose an efficient but strong baseline based on Graph Convolutional Network (GCN).
Inspired by the success of the ResNet architecture in Convolutional Neural Networks (CNNs), a ResGCN module is introduced into the GCN.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
- Score: 22.90127409366107
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One essential problem in skeleton-based action recognition is how to extract
discriminative features over all skeleton joints. However, the State-Of-The-Art
(SOTA) models for this task tend to be exceedingly sophisticated and
over-parameterized, and their low efficiency in model training and inference has
obstructed progress in the field, especially on large-scale action datasets. In
this work, we propose an efficient but strong baseline based on Graph
Convolutional Network (GCN), which aggregates three main improvements: early
fused Multiple Input Branches (MIB), Residual GCN (ResGCN) with a bottleneck
structure, and a Part-wise Attention (PartAtt) block. Firstly, the MIB is
designed to enrich informative skeleton features while retaining compact
representations at an early fusion stage. Then, inspired by the success of the
ResNet architecture in Convolutional Neural Networks (CNNs), a ResGCN module is
introduced to alleviate computational costs and reduce learning difficulties
during model training while maintaining model accuracy. Finally, a PartAtt block
is proposed to discover the most essential body parts over a whole action
sequence and to obtain more explainable representations for different skeleton
action sequences. Extensive experiments on two large-scale datasets, i.e., NTU
RGB+D 60 and 120, validate that the proposed baseline slightly outperforms other
SOTA models while requiring far fewer parameters during training and inference,
e.g., at most 34 times fewer than DGNN, one of the best SOTA methods.
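As a concrete illustration of the two new modules described in the abstract, the sketch below is a minimal PyTorch-style rendering, not the authors' released code; the exact form of the graph convolution, the joint-to-part partition, the channel sizes, and the reduction ratio are illustrative assumptions rather than the paper's precise design.

```python
import torch
import torch.nn as nn


class BottleneckResGCN(nn.Module):
    """Residual GCN unit with a ResNet-style bottleneck: reduce channels with a
    1x1 convolution, apply a spatial graph convolution, expand back, add the skip."""

    def __init__(self, channels, A, reduction=4):
        super().__init__()
        hidden = channels // reduction
        self.register_buffer("A", A)                  # (V, V) normalized adjacency
        self.down = nn.Conv2d(channels, hidden, 1)    # 1x1 reduce
        self.gcn = nn.Conv2d(hidden, hidden, 1)       # per-joint feature transform
        self.up = nn.Conv2d(hidden, channels, 1)      # 1x1 expand
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                             # x: (N, C, T, V)
        res = x
        x = self.relu(self.down(x))
        x = torch.einsum("nctv,vw->nctw", self.gcn(x), self.A)  # spatial graph conv
        x = self.bn(self.up(x))
        return self.relu(x + res)


class PartAtt(nn.Module):
    """Part-wise attention: pool each body part over the whole sequence, score the
    parts with a small bottleneck MLP, softmax over parts, and reweight every joint
    by the attention of the part it belongs to."""

    def __init__(self, channels, parts, reduction=4):
        super().__init__()
        self.parts = parts                            # list of joint-index lists
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                             # x: (N, C, T, V)
        N, C, T, V = x.shape
        # (N, P, C): average each part over time and over its joints
        part_feat = torch.stack(
            [x[:, :, :, p].mean(dim=(2, 3)) for p in self.parts], dim=1)
        att = torch.softmax(self.fc(part_feat), dim=1)  # attention over parts
        joint_att = x.new_ones(N, C, 1, V)
        for i, p in enumerate(self.parts):            # broadcast back to joints
            joint_att[:, :, 0, p] = att[:, i].unsqueeze(-1)
        return x * joint_att
```

With the 25-joint NTU RGB+D skeleton, for instance, one might group the joints into five parts (torso, left/right arms, left/right legs), stack several BottleneckResGCN units per stage, and append a PartAtt block; the resulting part-wise attention weights can then be inspected to see which body parts dominate a given action, which is where the claimed explainability comes from.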
Related papers
- Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action
Recognition through Redefined Skeletal Topology Awareness [24.83836008577395]
Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
arXiv Detail & Related papers (2023-05-19T06:40:12Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble training scheme, named EnGCN, to address these issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z) - Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural
Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z) - Multi-Scale Semantics-Guided Neural Networks for Efficient
Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition.
MS-SGN achieves the state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z) - Tackling Oversmoothing of GNNs with Contrastive Learning [35.88575306925201]
Graph neural networks (GNNs) integrate the relational structure of graph data with representation learning capability.
Oversmoothing makes the final representations of nodes indiscriminative, thus deteriorating the node classification and link prediction performance.
We propose the Topology-guided Graph Contrastive Layer, named TGCL, which is the first de-oversmoothing method maintaining all three mentioned metrics.
arXiv Detail & Related papers (2021-10-26T15:56:16Z) - Constructing Stronger and Faster Baselines for Skeleton-based Action
Recognition [19.905455701387194]
We present an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition.
On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other State-Of-The-Art (SOTA) methods.
arXiv Detail & Related papers (2021-06-29T07:09:11Z) - Spatio-Temporal Inception Graph Convolutional Networks for
Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z) - Temporal Attention-Augmented Graph Convolutional Network for Efficient
Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.