AdaSGN: Adapting Joint Number and Model Size for Efficient
Skeleton-Based Action Recognition
- URL: http://arxiv.org/abs/2103.11770v1
- Date: Mon, 22 Mar 2021 12:36:39 GMT
- Title: AdaSGN: Adapting Joint Number and Model Size for Efficient
Skeleton-Based Action Recognition
- Authors: Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu
- Abstract summary: Existing methods for skeleton-based action recognition mainly focus on improving the recognition accuracy.
A novel approach, called AdaSGN, is proposed in this paper, which can reduce the computational cost of the inference process.
- Score: 45.6728814296272
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing methods for skeleton-based action recognition mainly focus on
improving the recognition accuracy, whereas the efficiency of the model is
rarely considered. Recently, there are some works trying to speed up the
skeleton modeling by designing light-weight modules. However, in addition to
the model size, the amount of the data involved in the calculation is also an
important factor for the running speed, especially for the skeleton data where
most of the joints are redundant or non-informative to identify a specific
skeleton. Besides, previous works usually employ one fix-sized model for all
the samples regardless of the difficulty of recognition, which wastes
computations for easy samples. To address these limitations, a novel approach,
called AdaSGN, is proposed in this paper, which can reduce the computational
cost of the inference process by adaptively controlling the input number of the
joints of the skeleton on-the-fly. Moreover, it can also adaptively select the
optimal model size for each sample to achieve a better trade-off between
accuracy and efficiency. We conduct extensive experiments on three challenging
datasets, namely, NTU-60, NTU-120 and SHREC, to verify the superiority of the
proposed approach, where AdaSGN achieves comparable or even higher performance
with much lower GFLOPs compared with the baseline method.
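The trade-off described in the abstract (fewer joints and a smaller model for easy samples, more of both for hard ones) can be illustrated with a small sketch. This is not the paper's selection mechanism, which the abstract does not specify; it swaps in a plain confidence-threshold cascade. The joint counts, channel widths, cost formula, and the names `toy_cost`, `configs_by_cost`, and `select_config` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical options, not taken from the paper.
JOINT_OPTIONS = (5, 13, 25)   # number of skeleton joints fed to the model
WIDTH_OPTIONS = (32, 64)      # channel width, standing in for "model size"


def toy_cost(joints, width):
    """Toy cost proxy: linear in joint count, quadratic in channel width."""
    return joints * width * width


def configs_by_cost():
    """All (joints, width) pairs, cheapest first."""
    return sorted(product(JOINT_OPTIONS, WIDTH_OPTIONS),
                  key=lambda cfg: toy_cost(*cfg))


def select_config(confidence_fn, threshold=0.8):
    """Try configurations from cheapest to most expensive and stop at the
    first one whose confidence reaches the threshold.

    confidence_fn(joints, width) is a caller-supplied stand-in for running a
    model and reading its prediction confidence. Returns the chosen
    configuration and the total toy cost spent, including rejected attempts.
    """
    configs = configs_by_cost()
    spent = 0
    for joints, width in configs:
        spent += toy_cost(joints, width)
        if confidence_fn(joints, width) >= threshold:
            return (joints, width), spent
    # No configuration was confident enough: keep the largest one.
    return configs[-1], spent


if __name__ == "__main__":
    easy = select_config(lambda joints, width: 0.9)   # confident immediately
    hard = select_config(lambda joints, width: 0.1)   # never confident
    print("easy sample:", easy)
    print("hard sample:", hard)
```

In this sketch an easy sample stops at the cheapest configuration, while a hard sample pays for every attempt. A learned policy that picks a configuration directly would avoid that overhead.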
Related papers
- SkeletonX: Data-Efficient Skeleton-based Action Recognition via Cross-sample Feature Aggregation [34.65359766672547]
This paper studies one-shot and limited-scale learning settings to enable efficient adaptation with minimal data.
We present SkeletonX, a lightweight training pipeline that integrates seamlessly with existing GCN-based skeleton action recognizers.
It surpasses previous state-of-the-art methods in the one-shot setting, with only 1/10 of the parameters and far fewer FLOPs.
arXiv Detail & Related papers (2025-04-16T04:01:42Z)
- USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation [24.90512145836643]
We introduce a Unified Skeleton-based Dense Representation Learning framework based on feature decorrelation.
We show that our approach significantly outperforms the current state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2024-12-12T12:20:27Z)
- AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning [22.950914612765494]
Fine-tuning large language models (LLMs) has achieved remarkable performance across various natural language processing tasks.
Memory-efficient Zeroth-order (MeZO) methods attempt to fine-tune LLMs using only forward passes, thereby avoiding the need for a backpropagation graph.
We propose the Adaptive Zeroth-order Tensor-Train Adaption (AdaZeta) framework, specifically designed to improve the performance and convergence of ZO methods.
arXiv Detail & Related papers (2024-06-26T04:33:13Z)
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs a frame attention mechanism to identify the most significant frames and a skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- Model-agnostic Body Part Relevance Assessment for Pedestrian Detection [4.405053430046726]
We present a framework for using sampling-based explanation models in a computer vision context by body part relevance assessment for pedestrian detection.
We introduce a novel sampling-based method similar to KernelSHAP that shows more robustness for lower sampling sizes and, thus, is more efficient for explainability analyses on large-scale datasets.
arXiv Detail & Related papers (2023-11-27T10:10:25Z)
- Stabilizing Subject Transfer in EEG Classification with Divergence Estimation [17.924276728038304]
We propose several graphical models to describe an EEG classification task.
We identify statistical relationships that should hold true in an idealized training scenario.
We design regularization penalties to enforce these relationships in two stages.
arXiv Detail & Related papers (2023-10-12T23:06:52Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning [2.5782420501870296]
This paper proposes a voice conversion method that works with limited data.
It is based on stochastic variational deep kernel learning (SVDKL).
It is possible to estimate non-smooth and more complex functions.
arXiv Detail & Related papers (2023-09-08T16:32:47Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to large parameter capacity, but also lead to huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)
- Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
- Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition [22.90127409366107]
We propose an efficient but strong baseline based on Graph Convolutional Network (GCN).
Inspired by the success of the ResNet architecture in Convolutional Neural Network (CNN), a ResGCN module is introduced in GCN.
A PartAtt block is proposed to discover the most essential body parts over a whole action sequence.
arXiv Detail & Related papers (2020-10-20T02:56:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.