Multi-Dimensional Refinement Graph Convolutional Network with Robust
Decouple Loss for Fine-Grained Skeleton-Based Action Recognition
- URL: http://arxiv.org/abs/2306.15321v1
- Date: Tue, 27 Jun 2023 09:23:36 GMT
- Title: Multi-Dimensional Refinement Graph Convolutional Network with Robust
Decouple Loss for Fine-Grained Skeleton-Based Action Recognition
- Authors: Sheng-Lan Liu, Yu-Ning Ding, Jin-Rong Zhang, Kai-Yuan Liu, Si-Fan
Zhang, Fei-Long Wang, and Gao Huang
- Abstract summary: We propose a flexible attention block called Channel-Variable Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of spatial-temporal joints.
Based on CVSTA, we construct a Multi-Dimensional Refinement Graph Convolutional Network (MDR-GCN), which can improve the discrimination among channel-, joint- and frame-level features.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly boosts the effect of the CVSTA and reduces the impact of noise.
- Score: 19.031036881780107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph convolutional networks have been widely used in skeleton-based action
recognition. However, existing approaches are limited in fine-grained action
recognition due to the similarity of inter-class data. Moreover, the noisy data
from pose extraction increases the challenge of fine-grained recognition. In
this work, we propose a flexible attention block called Channel-Variable
Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of
spatial-temporal joints and obtain a more compact intra-class feature
distribution. Based on CVSTA, we construct a Multi-Dimensional Refinement Graph
Convolutional Network (MDR-GCN), which can improve the discrimination among
channel-, joint- and frame-level features for fine-grained actions.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly
boosts the effect of the CVSTA and reduces the impact of noise. The proposed
method combining MDR-GCN with RDL outperforms the known state-of-the-art
skeleton-based approaches on fine-grained datasets, FineGym99 and FSD-10, and
also on the X-view benchmark of the coarse-grained NTU-RGB+D dataset.
Related papers
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs the frame attention mechanism to identify the most significant frames and the skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
- Improved Dual Correlation Reduction Network [40.792587861237166]
We propose a novel deep graph clustering algorithm termed Improved Dual Correlation Reduction Network (IDCRN).
By approximating the cross-view feature correlation matrix to an identity matrix, we reduce the redundancy between different dimensions of features.
We also avoid the collapsed representation caused by the over-smoothing issue in Graph Convolutional Networks (GCNs) through an introduced propagation regularization term.
arXiv Detail & Related papers (2022-02-25T07:48:32Z)
- Deep Graph Clustering via Dual Correlation Reduction [37.973072977988494]
We propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN).
In our method, we first design a siamese network to encode samples. Then, by forcing the cross-view sample correlation matrix and the cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation at both the sample and feature levels.
In order to alleviate representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term to enable the network to gain long-distance information.
arXiv Detail & Related papers (2021-12-29T04:05:38Z)
- Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, adaptive neighbor assignment is introduced in the construction of the anchor graph to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- SFANet: A Spectrum-aware Feature Augmentation Network for Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
Learning with grayscale-spectrum images, our model can reduce modality discrepancy and detect inner structure relations.
At the feature level, we improve the conventional two-stream network by balancing the number of specific and sharable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
- Richly Activated Graph Convolutional Network for Robust Skeleton-based Action Recognition [22.90127409366107]
A richly activated graph convolutional network (RA-GCN) is proposed to explore sufficient discriminative features spread over all skeleton joints.
The RA-GCN achieves comparable performance on the standard NTU RGB+D 60 and 120 datasets.
arXiv Detail & Related papers (2020-08-09T19:06:29Z)
- Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.