Language Knowledge-Assisted Representation Learning for Skeleton-Based
Action Recognition
- URL: http://arxiv.org/abs/2305.12398v1
- Date: Sun, 21 May 2023 08:29:16 GMT
- Title: Language Knowledge-Assisted Representation Learning for Skeleton-Based
Action Recognition
- Authors: Haojun Xu, Yan Gao, Zheng Hui, Jie Li, and Xinbo Gao
- Abstract summary: How humans understand and recognize the actions of others is a complex neuroscientific problem.
LA-GCN is a graph convolution network assisted by knowledge from large-scale language models (LLMs).
- Score: 71.35205097460124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How humans understand and recognize the actions of others is a complex
neuroscientific problem that involves a combination of cognitive mechanisms and
neural networks. Research has shown that humans have brain areas that recognize
actions and process top-down attentional information, such as the
temporoparietal association area. Also, humans have brain regions dedicated to
understanding the minds of others and analyzing their intentions, such as the
medial prefrontal cortex of the temporal lobe. Skeleton-based action
recognition creates mappings for the complex connections between the human
skeleton movement patterns and behaviors. Although existing studies encoded
meaningful node relationships and synthesized action representations for
classification with good results, few of them considered incorporating a priori
knowledge to aid potential representation learning for better performance.
LA-GCN is a graph convolution network assisted by knowledge from large-scale
language models (LLMs). First, the LLM knowledge is mapped into an a priori
global relationship (GPR) topology and an a priori category relationship (CPR)
topology between nodes. The GPR guides the generation of new "bone"
representations, aiming to emphasize essential node information from the data
level. The CPR mapping simulates category prior knowledge in human brain
regions, encoded by the PC-AC module and used to add extra
supervision, forcing the model to learn class-distinguishable features. In
addition, to improve information transfer efficiency in topology modeling, we
propose multi-hop attention graph convolution. It aggregates each node's
k-order neighbors simultaneously to speed up model convergence. LA-GCN reaches
state-of-the-art results on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
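The multi-hop aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the LA-GCN implementation: the abstract does not give the formulation, so the fixed geometric `decay` weights below stand in for learned attention, and the function names (`multi_hop_weights`, `row_normalize`, `matmul`) and the 4-joint chain skeleton are hypothetical. The sketch only shows how blending powers of the adjacency matrix lets one layer reach k-order neighbors at once.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in m]


def multi_hop_weights(adj, max_hop, decay=0.5):
    """Blend A^0 .. A^max_hop of the row-normalized adjacency.

    Assumption: hop k gets weight decay**k (normalized to sum to 1).
    A learned attention score would replace this fixed schedule.
    """
    n = len(adj)
    a = row_normalize(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    coeffs = [decay ** k for k in range(max_hop + 1)]
    total = sum(coeffs)
    blended = [[0.0] * n for _ in range(n)]
    for c in coeffs:
        for i in range(n):
            for j in range(n):
                blended[i][j] += (c / total) * power[i][j]
        power = matmul(power, a)  # move on to the next hop
    return blended


# Hypothetical 4-joint chain skeleton: 0 - 1 - 2 - 3
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
features = [[1.0], [0.0], [0.0], [0.0]]  # only joint 0 carries a signal

w1 = multi_hop_weights(adj, max_hop=1)   # 1-hop neighbors only
w3 = multi_hop_weights(adj, max_hop=3)   # up to 3-hop neighbors
out1 = matmul(w1, features)
out3 = matmul(w3, features)
```

With `max_hop=1`, joint 3 receives nothing from joint 0 in a single pass, whereas with `max_hop=3` the signal arrives in one aggregation step. That is the sense in which multi-hop aggregation shortens the path information has to travel.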
Related papers
- Skeleton-Based Action Recognition with Spatial-Structural Graph Convolution [0.7373617024876725]
We study the representation of skeleton data and the issue of over-smoothing in Graph Convolutional Network (GCN) based methods.
We propose a two-stream graph convolution method called Spatial-Structural GCN (SpSt-GCN).
We evaluate our method on two large-scale datasets, i.e., NTU RGB+D and NTU RGB+D 120.
arXiv Detail & Related papers (2024-07-31T11:04:41Z)
- Knowledge-Guided Prompt Learning for Lifespan Brain MR Image Segmentation [53.70131202548981]
We present a two-step segmentation framework employing Knowledge-Guided Prompt Learning (KGPL) for brain MRI.
Specifically, we first pre-train segmentation models on large-scale datasets with sub-optimal labels.
The introduction of knowledge-wise prompts captures semantic relationships between anatomical variability and biological processes.
arXiv Detail & Related papers (2024-07-31T04:32:43Z)
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z)
- DBGDGM: Dynamic Brain Graph Deep Generative Model [63.23390833353625]
Graphs are a natural representation of brain activity derived from functional magnetic resonance imaging (fMRI) data.
It is well known that clusters of anatomical brain regions, known as functional connectivity networks (FCNs), encode temporal relationships which can serve as useful biomarkers for understanding brain function and dysfunction.
Previous works, however, ignore the temporal dynamics of the brain and focus on static graphs.
We propose a dynamic brain graph deep generative model (DBGDGM) which simultaneously clusters brain regions into temporally evolving communities and learns dynamic unsupervised node embeddings.
arXiv Detail & Related papers (2023-01-26T20:45:30Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
- Learning to Model the Relationship Between Brain Structural and Functional Connectomes [16.096428756895918]
We develop a graph representation learning framework to model the relationship between brain structural connectivity (SC) and functional connectivity (FC).
A trainable graph convolutional encoder captures interactions between brain regions-of-interest that mimic actual neural communications.
Experiments demonstrate that the learnt representations capture valuable information from the intrinsic properties of the subject's brain networks.
arXiv Detail & Related papers (2021-12-18T11:23:55Z)
- Learning Dynamic Graph Representation of Brain Connectome with Spatio-Temporal Attention [33.049423523704824]
We propose STAGIN, a method for learning dynamic graph representation of the brain connectome with spatio-temporal attention.
Experiments on the HCP-Rest and the HCP-Task datasets demonstrate exceptional performance of our proposed method.
arXiv Detail & Related papers (2021-05-27T23:06:50Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Spatio-Temporal Graph Convolution for Resting-State fMRI Analysis [11.85489505372321]
We train a spatio-temporal graph convolutional network (ST-GCN) on short sub-sequences of the BOLD time series to model the non-stationary nature of functional connectivity.
ST-GCN is significantly more accurate than common approaches in predicting gender and age based on BOLD signals.
arXiv Detail & Related papers (2020-03-24T01:56:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.