Related papers: Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition

URL: http://arxiv.org/abs/2106.15125v1
Date: Tue, 29 Jun 2021 07:09:11 GMT
Title: Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition
Authors: Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang
Abstract summary: We present an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition. On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other State-Of-The-Art (SOTA) methods.
Score: 19.905455701387194
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, recent State-Of-The-Art (SOTA) models for this task tend to be exceedingly sophisticated and over-parameterized. The low efficiency in model training and inference has increased the validation costs of model architectures in large-scale datasets. To address the above issue, recent advanced separable convolutional layers are embedded into an early-fused Multiple Input Branches (MIB) network, constructing an efficient Graph Convolutional Network (GCN) baseline for skeleton-based action recognition. In addition, based on this baseline, we design a compound scaling strategy to expand the model's width and depth synchronously, and eventually obtain a family of efficient GCN baselines with high accuracies and small amounts of trainable parameters, termed EfficientGCN-Bx, where "x" denotes the scaling coefficient. On two large-scale datasets, i.e., NTU RGB+D 60 and 120, the proposed EfficientGCN-B4 baseline outperforms other SOTA methods, e.g., achieving 91.7% accuracy on the cross-subject benchmark of the NTU 60 dataset, while being 3.15x smaller and 3.21x faster than MS-G3D, which is one of the best SOTA methods. The PyTorch source code and the pretrained models are available at https://github.com/yfsong0709/EfficientGCNv1.
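
The compound scaling idea mentioned in the abstract (growing width and depth together from a single coefficient x) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name compound_scale, the base width and depth, and the growth factors width_factor and depth_factor are all hypothetical values chosen for illustration, and the abstract does not state the actual constants.

```python
def compound_scale(base_width, base_depth, x, width_factor=1.35, depth_factor=1.2):
    """Scale channel width and layer depth together from one coefficient x.

    All default factors are hypothetical placeholders, not values taken
    from the paper. With x = 0 the base configuration is returned unchanged.
    """
    # Both dimensions grow exponentially in the same coefficient x,
    # so a single number controls the overall model size.
    width = max(1, round(base_width * width_factor ** x))
    depth = max(1, round(base_depth * depth_factor ** x))
    return width, depth


if __name__ == "__main__":
    # Print a hypothetical family B0..B4 derived from one base configuration.
    for x in range(5):
        width, depth = compound_scale(base_width=64, base_depth=2, x=x)
        print(f"B{x}: width={width}, depth={depth}")
```

Under these assumptions, a larger x yields a wider and deeper network from the same base design, which is the sense in which the "Bx" suffix indexes a model family.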

Related papers

Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs. Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors. We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
DRGCN: Dynamic Evolving Initial Residual for Deep Graph Convolutional Networks [19.483662490506646]
We propose a novel model called Dynamic evolving initial Residual Graph Convolutional Network (DRGCN). Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs. Our model reaches new SOTA results on the large-scale ogbn-arxiv dataset of Open Graph Benchmark (OGB).
arXiv Detail & Related papers (2023-02-10T06:57:12Z)
A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs). We present a new ensembling training manner, named EnGCN, to address the existing issues. Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices [3.591566487849146]
Binary neural networks (BNNs) tackle the issue with extreme compression and speed-up gains compared to real-valued models. We propose a simple but effective method to accelerate inference through unifying BNNs with an early-exiting strategy. Our approach allows simple instances to exit early based on a decision threshold and utilizes output layers added to different intermediate layers to avoid executing the entire binary model.
arXiv Detail & Related papers (2022-06-17T22:11:11Z)
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition. Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures. Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action. We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient but strong baseline based on Graph Convolutional Network (GCN). Inspired by the success of the ResNet architecture in Convolutional Neural Network (CNN), a ResGCN module is introduced in GCN. A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model irregular data. We present a novel spatial-temporal GCN architecture which is defined via the Poincaré geometry. We evaluate our method on the two largest-scale 3D datasets currently available.
arXiv Detail & Related papers (2020-07-30T18:23:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.