Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2208.09150v1
- Date: Fri, 19 Aug 2022 04:54:56 GMT
- Title: Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition
- Authors: Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He
- Abstract summary: One-shot skeleton-based action recognition poses unique challenges in learning transferable representation from base classes to novel classes.
We propose a part-aware prototypical representation for one-shot skeleton-based action recognition.
We demonstrate the effectiveness of our method on two public skeleton-based action recognition datasets.
- Score: 57.86960990337986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of one-shot skeleton-based action
recognition, which poses unique challenges in learning transferable
representation from base classes to novel classes, particularly for
fine-grained actions. Existing meta-learning frameworks typically rely on
body-level representations in the spatial dimension, which limits their ability
to generalise and capture subtle visual differences in the fine-grained label
space. To overcome this limitation, we propose a part-aware prototypical
representation for one-shot skeleton-based action recognition. Our method
captures skeleton motion patterns at two distinct spatial levels: one models
global context among all body joints, referred to as the body level, and the
other attends to local spatial regions of body parts, referred to as the part level.
We also devise a class-agnostic attention mechanism to highlight important
parts for each action class. Specifically, we develop a part-aware prototypical
graph network consisting of three modules: a cascaded embedding module for our
dual-level modelling, an attention-based part fusion module to fuse parts and
generate part-aware prototypes, and a matching module to perform classification
with the part-aware representations. We demonstrate the effectiveness of our
method on two public skeleton-based action recognition datasets: NTU RGB+D 120
and NW-UCLA.
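As a rough, illustrative sketch only (not the authors' implementation), the code below shows how a part-aware prototype pipeline of this kind might be wired together: a placeholder joint-wise encoder stands in for the cascaded embedding module, a hypothetical five-part grouping of the 25 NTU RGB+D joints provides the part level, a small class-agnostic attention head weights and fuses the parts into a part-aware representation, and queries are matched to class prototypes by Euclidean distance. All module names, the part grouping, and the distance metric are assumptions, not details taken from the paper.

```python
# Minimal sketch of a part-aware prototypical pipeline (assumptions throughout;
# the paper uses a spatio-temporal graph backbone and its own matching module).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical grouping of the 25 NTU RGB+D joints into 5 body parts (assumption).
PARTS = {
    "torso":     [0, 1, 2, 3, 20],
    "left_arm":  [4, 5, 6, 7, 21, 22],
    "right_arm": [8, 9, 10, 11, 23, 24],
    "left_leg":  [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}

class PartAwarePrototypeSketch(nn.Module):
    """Dual-level embedding + attention-based part fusion + metric matching."""

    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        # Toy joint-wise encoder standing in for the cascaded embedding module.
        self.joint_encoder = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim)
        )
        # Class-agnostic attention: scores each part from its own feature only.
        self.part_attention = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2), nn.ReLU(), nn.Linear(feat_dim // 2, 1)
        )

    def embed(self, skeleton):
        # skeleton: (batch, frames, joints, in_dim)
        joint_feat = self.joint_encoder(skeleton).mean(dim=1)   # temporal pooling -> (B, J, D)
        body_feat = joint_feat.mean(dim=1)                      # body level: (B, D)
        part_feat = torch.stack(
            [joint_feat[:, idx, :].mean(dim=1) for idx in PARTS.values()], dim=1
        )                                                       # part level: (B, P, D)
        attn = F.softmax(self.part_attention(part_feat).squeeze(-1), dim=-1)  # (B, P)
        fused_parts = (attn.unsqueeze(-1) * part_feat).sum(dim=1)             # (B, D)
        return torch.cat([body_feat, fused_parts], dim=-1)      # part-aware representation

    def forward(self, support, support_labels, query, n_way):
        # One-shot episode: one prototype per class, queries matched by
        # negative Euclidean distance in the part-aware embedding space.
        s_emb, q_emb = self.embed(support), self.embed(query)
        protos = torch.stack(
            [s_emb[support_labels == c].mean(dim=0) for c in range(n_way)]
        )
        return -torch.cdist(q_emb, protos)                      # logits: (Q, n_way)

# Toy 5-way 1-shot episode with random skeletons (32 frames, 25 joints, 3D coords).
model = PartAwarePrototypeSketch()
support = torch.randn(5, 32, 25, 3)
labels = torch.arange(5)
query = torch.randn(10, 32, 25, 3)
print(model(support, labels, query, n_way=5).shape)  # torch.Size([10, 5])
```

The sketch only fixes the overall data flow (dual-level embedding, attention-based part fusion, prototype matching); the actual cascaded embedding, fusion, and matching modules are described in the paper itself.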
Related papers
- Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning [13.68867780184022]
Few-shot learning aims to recognize new concepts using a limited number of visual samples.
Our framework incorporates both the abstract class semantics and the concrete class entities extracted from Large Language Models (LLMs).
For the challenging one-shot setting, our approach, utilizing the ResNet-12 backbone, achieves an average improvement of 1.95% over the second-best competitor.
arXiv Detail & Related papers (2024-08-22T15:10:20Z)
- Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition [57.97930719585095]
We introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales.
Our approach is evaluated on various skeleton/language backbones and three large-scale datasets.
The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains.
arXiv Detail & Related papers (2024-06-19T08:22:32Z)
- Fine-Grained Side Information Guided Dual-Prompts for Zero-Shot Skeleton Action Recognition [18.012159340628557]
We propose a novel method using side information and dual-prompt learning for fine-grained skeleton-based zero-shot action recognition (STAR).
Our method achieves state-of-the-art performance in the ZSL and GZSL settings on benchmark datasets.
arXiv Detail & Related papers (2024-04-11T05:51:06Z)
- Multi-Semantic Fusion Model for Generalized Zero-Shot Skeleton-Based Action Recognition [32.291333054680855]
Generalized zero-shot skeleton-based action recognition (GZSSAR) is a new and challenging problem in the computer vision community.
We propose a multi-semantic fusion (MSF) model for improving the performance of GZSSAR.
arXiv Detail & Related papers (2023-09-18T09:00:25Z)
- Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework that learns discriminative part features and explores their correlations with a feature transformation module.
Our approach does not rely on additional part branches and reaches state-of-the-art performance on fine-grained object recognition benchmarks.
arXiv Detail & Related papers (2022-12-28T03:45:56Z)
- Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework [108.70949305791201]
Part-level Action Parsing (PAP) aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.
In particular, our framework first predicts the video-level class of the input video, then localizes the body parts and predicts the part-level action.
Our framework achieves state-of-the-art performance and outperforms existing methods, with a 31.10% ROC score.
arXiv Detail & Related papers (2022-03-09T01:30:57Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences of its use.