Action Recognition with Domain Invariant Features of Skeleton Image
- URL: http://arxiv.org/abs/2111.11250v1
- Date: Fri, 19 Nov 2021 08:05:54 GMT
- Title: Action Recognition with Domain Invariant Features of Skeleton Image
- Authors: Han Chen and Yifan Jiang and Hanseok Ko
- Abstract summary: We propose a novel CNN-based method with adversarial training for action recognition.
We introduce two-level domain adversarial learning to align the features of skeleton images from different view angles or subjects.
It achieves competitive results compared with state-of-the-art methods.
- Score: 25.519217340328442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Owing to its fast processing speed and robustness,
skeleton-based action recognition has recently received attention from the
computer vision community. Recent Convolutional Neural Network (CNN)-based
methods, which take a skeleton image as input to a CNN, have shown commendable
performance in learning spatio-temporal representations of skeleton sequences.
Because these methods mainly encode the temporal dimension and the skeleton
joints simply as rows and columns, respectively, the latent correlation among
all joints may be lost in the 2D convolution. To solve this problem,
we propose a novel CNN-based method with adversarial training for action
recognition. We introduce two-level domain adversarial learning to align the
features of skeleton images from different view angles or subjects,
respectively, thus further improving generalization. We evaluated the
proposed method on NTU RGB+D. It achieves competitive results compared with
state-of-the-art methods, with accuracy gains of 2.4% and 1.9% over the baseline
for cross-subject and cross-view evaluation, respectively.
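The abstract describes skeleton images in which the temporal dimension forms the rows and the skeleton joints form the columns. The following is a minimal sketch of one way to build such an image, not the authors' implementation. The function name `skeleton_to_image`, the use of x/y/z coordinates as the three colour channels, and the per-channel min-max scaling to 0..255 are all assumptions made for illustration; the input here is random data with 25 joints per frame, matching the NTU RGB+D skeleton layout.

```python
import numpy as np


def skeleton_to_image(seq):
    """Map a skeleton sequence of shape (frames, joints, 3) to a uint8 image.

    Rows index frames, columns index joints, and the three channels hold the
    x/y/z coordinates. The min-max scaling is an illustrative assumption.
    """
    seq = np.asarray(seq, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected shape (frames, joints, 3)")
    # Per-channel minimum and maximum over all frames and joints.
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)
    # Avoid division by zero when a channel is constant.
    span = np.where(hi > lo, hi - lo, 1.0)
    img = np.round(255.0 * (seq - lo) / span)
    return img.astype(np.uint8)


# Hypothetical input: 40 frames of 25 joints with random coordinates.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(40, 25, 3))
image = skeleton_to_image(sequence)
print(image.shape, image.dtype)  # (40, 25, 3) uint8
```

Because neighbouring columns are simply neighbouring joint indices, a 2D convolution over this image only mixes joints that happen to be adjacent in the ordering, which is the limitation the abstract points to.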
Related papers
- STEP CATFormer: Spatial-Temporal Effective Body-Part Cross Attention
Transformer for Skeleton-based Action Recognition [0.0]
We focus on how Graph Convolutional Networks learn different topologies and effectively aggregate joint features across global and local temporal contexts.
We propose three Channel-wise Topology Graph Convolutions based on Channel-wise Topology Refinement Graph Convolution (CTR-GCN).
We develop a powerful graph convolutional network named Spatial Temporal Effective Body-part Cross Attention Transformer, which achieves notably high performance on the NTU RGB+D and NTU RGB+D 120 datasets.
arXiv Detail & Related papers (2023-12-06T04:36:58Z) - Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition.
Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models.
Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - Combining the Silhouette and Skeleton Data for Gait Recognition [13.345465199699]
The two dominant approaches to gait recognition are appearance-based and model-based, which extract features from silhouettes and skeletons, respectively.
This paper proposes a CNN-based branch taking silhouettes as input and a GCN-based branch taking skeletons as input.
For better gait representation in the GCN-based branch, we present a fully connected graph convolution operator to integrate multi-scale graph convolutions.
arXiv Detail & Related papers (2022-02-22T03:21:51Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised
Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z) - Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based
Action Recognition [49.163326827954656]
We propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification.
We develop a dual-head graph network consisting of two interleaved branches, which enables us to extract features at two or more spatio-temporal resolutions.
We conduct extensive experiments on three large-scale datasets.
arXiv Detail & Related papers (2021-08-10T09:25:07Z) - Spatio-Temporal Inception Graph Convolutional Networks for
Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z) - JOLO-GCN: Mining Joint-Centered Light-Weight Information for
Skeleton-Based Action Recognition [47.47099206295254]
We propose a novel framework for employing human pose skeleton and joint-centered light-weight information jointly in a two-stream graph convolutional network.
Compared to the pure skeleton-based baseline, this hybrid scheme effectively boosts performance, while keeping the computational and memory overheads low.
arXiv Detail & Related papers (2020-11-16T08:39:22Z) - Progressive Spatio-Temporal Graph Convolutional Network for
Skeleton-Based Human Action Recognition [97.14064057840089]
We propose a method to automatically find a compact and problem-specific network for graph convolutional networks in a progressive manner.
Experimental results on two datasets for skeleton-based human action recognition indicate that the proposed method has competitive or even better classification performance.
arXiv Detail & Related papers (2020-11-11T09:57:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.