Constructing Stronger and Faster Baselines for Skeleton-based Action
Recognition
- URL: http://arxiv.org/abs/2106.15125v1
- Date: Tue, 29 Jun 2021 07:09:11 GMT
- Title: Constructing Stronger and Faster Baselines for Skeleton-based Action
Recognition
- Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang
- Abstract summary: We present an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition.
On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other State-Of-The-Art (SOTA) methods.
- Score: 19.905455701387194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One essential problem in skeleton-based action recognition is how to extract
discriminative features over all skeleton joints. However, recent
State-Of-The-Art (SOTA) models for this task tend to be exceedingly
sophisticated and over-parameterized. The low efficiency in model training and
inference has increased the validation costs of model architectures on
large-scale datasets. To address the above issue, recent advanced separable
convolutional layers are embedded into an early fused Multiple Input Branches
(MIB) network, constructing an efficient Graph Convolutional Network (GCN)
baseline for skeleton-based action recognition. In addition, based on this
baseline, we design a compound scaling strategy to expand the model's width and
depth synchronously, and eventually obtain a family of efficient GCN baselines
with high accuracies and small amounts of trainable parameters, termed
EfficientGCN-Bx, where "x" denotes the scaling coefficient. On two
large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4
baseline outperforms other SOTA methods, e.g., achieving 91.7% accuracy on the
cross-subject benchmark of the NTU 60 dataset, while being 3.15x smaller and 3.21x
faster than MS-G3D, one of the best SOTA methods. The PyTorch source code
and the pretrained models are available at
https://github.com/yfsong0709/EfficientGCNv1.
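The abstract does not state the compound scaling rule itself. As a minimal sketch, assuming an EfficientNet-style rule in which a single coefficient x grows depth and width together, the idea could look like the following. The function name `compound_scale`, the growth factors `alpha` and `beta`, and the rounding of width to a multiple of 16 are all hypothetical choices for illustration, not values taken from the paper or its repository.

```python
import math


def compound_scale(base_width, base_depth, x, alpha=1.2, beta=1.35):
    """Scale a block's depth and width together from one coefficient x.

    alpha and beta are hypothetical growth factors (assumptions, not the
    paper's values). Width is rounded up to a multiple of 16, which is
    also an assumption made here for hardware-friendly channel counts.
    """
    # Depth (number of layers in the block) grows as alpha ** x.
    depth = math.ceil(base_depth * alpha ** x)
    # Width (number of channels) grows as beta ** x, rounded up to 16s.
    width = math.ceil(base_width * beta ** x / 16) * 16
    return width, depth


if __name__ == "__main__":
    # x plays the role of the "x" in EfficientGCN-Bx.
    for x in range(5):
        width, depth = compound_scale(64, 2, x)
        print(f"B{x}: width={width}, depth={depth}")
```

With x = 0 the base configuration is returned unchanged, and larger x yields a wider and deeper variant in a single step, which is what "synchronously" suggests in the abstract.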
Related papers
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN)
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- DRGCN: Dynamic Evolving Initial Residual for Deep Graph Convolutional Networks [19.483662490506646]
We propose a novel model called Dynamic evolving initial Residual Graph Convolutional Network (DRGCN)
Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs.
Our model reaches new SOTA results on the large-scale ogbn-arxiv dataset of Open Graph Benchmark (OGB)
arXiv Detail & Related papers (2023-02-10T06:57:12Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs)
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices [3.591566487849146]
Binary neural networks (BNNs) tackle the issue with extreme compression and speed-up gains compared to real-valued models.
We propose a simple but effective method to accelerate inference through unifying BNNs with an early-exiting strategy.
Our approach allows simple instances to exit early based on a decision threshold and utilizes output layers added to different intermediate layers to avoid executing the entire binary model.
arXiv Detail & Related papers (2022-06-17T22:11:11Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
- Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient but strong baseline based on Graph Convolutional Network (GCN)
Inspired by the success of the ResNet architecture in Convolutional Neural Network (CNN), a ResGCN module is introduced in GCN.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
- Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model irregular data.
We present a novel spatial-temporal GCN architecture which is defined via the Poincaré geometry.
We evaluate our method on the two largest current 3D datasets.
arXiv Detail & Related papers (2020-07-30T18:23:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.