Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2209.10073v1
- Date: Wed, 21 Sep 2022 02:33:07 GMT
- Title: Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition
- Authors: Anqi Zhu, Qiuhong Ke, Mingming Gong and James Bailey
- Abstract summary: We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
- Score: 54.23513799338309
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Skeleton-based action recognition receives increasing attention because the
skeleton representations reduce the amount of training data by eliminating
visual information irrelevant to actions. To further improve the sample
efficiency, meta-learning-based one-shot learning solutions were developed for
skeleton-based action recognition. These methods find the nearest neighbor
according to the similarity between instance-level global average embedding.
However, such measurement holds unstable representativity due to inadequate
generalized learning on local invariant and noisy features, while intuitively,
more fine-grained recognition usually relies on determining key local body
movements. To address this limitation, we present the Adaptive
Local-Component-aware Graph Convolutional Network, which replaces the
comparison metric with a focused sum of similarity measurements on aligned
local embedding of action-critical spatial/temporal segments. Comprehensive
one-shot experiments on the public benchmark of NTU-RGB+D 120 indicate that our
method provides a stronger representation than the global embedding and helps
our model reach state-of-the-art.
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z) - Decom--CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map [23.71680014689873]
Class Activation Map (CAM) is widely used to interpret deep model predictions by highlighting object location.
This paper proposes a new two-stage interpretability method called the Decomposition Class Activation Map (Decom-CAM)
Our experiments demonstrate that the proposed Decom-CAM outperforms current state-of-the-art methods significantly.
arXiv Detail & Related papers (2023-05-27T14:33:01Z) - Part Aware Contrastive Learning for Self-Supervised Action Recognition [18.423841093299135]
This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR.
Our proposed SkeAttnCLR outperforms state-of-the-art methods on NTURGB+D, NTU120-RGB+D, and PKU-MMD datasets.
arXiv Detail & Related papers (2023-05-01T05:31:48Z) - FV-UPatches: Enhancing Universality in Finger Vein Recognition [0.6299766708197883]
We propose a universal learning-based framework, which achieves generalization while training with limited data.
The proposed framework shows application potential in other vein-based biometric recognition as well.
arXiv Detail & Related papers (2022-06-02T14:20:22Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised
Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z) - DenseGAP: Graph-Structured Dense Correspondence Learning with Anchor
Points [15.953570826460869]
Establishing dense correspondence between two images is a fundamental computer vision problem.
We introduce DenseGAP, a new solution for efficient Dense correspondence learning with a Graph-structured neural network conditioned on Anchor Points.
Our method advances the state-of-the-art of correspondence learning on most benchmarks.
arXiv Detail & Related papers (2021-12-13T18:59:30Z) - Video Salient Object Detection via Adaptive Local-Global Refinement [7.723369608197167]
Video salient object detection (VSOD) is an important task in many vision applications.
We propose an adaptive local-global refinement framework for VSOD.
We show that our weighting methodology can further exploit the feature correlations, thus driving the network to learn more discriminative feature representation.
arXiv Detail & Related papers (2021-04-29T14:14:11Z) - Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z) - Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimize a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z) - Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.