Structure-Aware Human-Action Generation
- URL: http://arxiv.org/abs/2007.01971v3
- Date: Sun, 16 Aug 2020 20:05:43 GMT
- Title: Structure-Aware Human-Action Generation
- Authors: Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen
- Abstract summary: Graph convolutional networks (GCNs) are a promising way to leverage structure information to learn structure representations.
We propose a variant of GCNs to leverage the powerful self-attention mechanism to adaptively sparsify a complete action graph in the temporal space.
Our method could dynamically attend to important past frames and construct a sparse graph to apply in the GCN framework, well-capturing the structure information in action sequences.
- Score: 126.05874420893092
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Generating long-range skeleton-based human actions has been a challenging
problem since small deviations of one frame can cause a malformed action
sequence. Most existing methods borrow ideas from video generation, which
naively treat skeleton nodes/joints as pixels of images without considering the
rich inter-frame and intra-frame structure information, leading to potentially
distorted actions. Graph convolutional networks (GCNs) are a promising way to
leverage structure information to learn structure representations. However,
directly adopting GCNs to tackle such continuous action sequences both in
spatial and temporal spaces is challenging as the action graph could be huge.
To overcome this issue, we propose a variant of GCNs to leverage the powerful
self-attention mechanism to adaptively sparsify a complete action graph in the
temporal space. Our method could dynamically attend to important past frames
and construct a sparse graph to apply in the GCN framework, well-capturing the
structure information in action sequences. Extensive experimental results
demonstrate the superiority of our method on two standard human action datasets
compared with existing methods.
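The abstract's central idea, attending to a few important past frames and building a sparse temporal graph for a GCN layer, can be illustrated with a small sketch. This is not the authors' implementation. It assumes raw per-frame feature vectors serve as both queries and keys (no learned projections), uses top-k selection as a stand-in for the paper's sparsification step, and omits the GCN weight matrix. The names `sparse_temporal_adjacency`, `temporal_graph_conv`, and the parameter `k` are hypothetical.

```python
import math


def sparse_temporal_adjacency(frames, k=2):
    """Build a T x T causal adjacency in which each frame keeps only its
    k highest-scoring frames among itself and its past (assumed top-k rule)."""
    num_frames = len(frames)
    dim = len(frames[0])
    adj = [[0.0] * num_frames for _ in range(num_frames)]
    for t in range(num_frames):
        # Scaled dot-product attention scores against frames 0..t only.
        scores = [
            sum(a * b for a, b in zip(frames[t], frames[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        # Keep the k best-scoring frames; all other edges stay at zero.
        keep = sorted(range(t + 1), key=lambda s: scores[s], reverse=True)[:k]
        # Softmax over the kept scores so each row sums to one.
        top = max(scores[s] for s in keep)
        exps = {s: math.exp(scores[s] - top) for s in keep}
        total = sum(exps.values())
        for s in keep:
            adj[t][s] = exps[s] / total
    return adj


def temporal_graph_conv(frames, adj):
    """One aggregation step: each output frame is the adjacency-weighted
    sum of the frames it is connected to (weight matrix omitted)."""
    num_frames = len(frames)
    dim = len(frames[0])
    return [
        [sum(adj[t][s] * frames[s][j] for s in range(num_frames)) for j in range(dim)]
        for t in range(num_frames)
    ]


# Toy sequence: four frames, each a 2-dimensional feature vector.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
adj = sparse_temporal_adjacency(frames, k=2)
out = temporal_graph_conv(frames, adj)
```

With `k` much smaller than the number of frames, each row of the adjacency has at most `k` non-zero entries, which is the property that keeps the temporal graph small in this sketch.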
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
- Dynamic Dense Graph Convolutional Network for Skeleton-based Human Motion Prediction [14.825185477750479]
This paper presents a Dynamic Dense Graph Convolutional Network (DD-GCN) which constructs a dense graph and implements an integrated dynamic message passing.
Based on the dense graph, we propose a dynamic message passing framework that learns dynamically from data to generate distinctive messages.
Experiments on benchmark Human 3.6M and CMU Mocap datasets verify the effectiveness of our DD-GCN.
arXiv Detail & Related papers (2023-11-29T07:25:49Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces the local interaction information and global human-action interaction information.
Long-temporal human actions supervise the model to generate multiple scene graphs that conform to the global constraints and avoid the model being unable to learn the tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness [24.83836008577395]
Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
arXiv Detail & Related papers (2023-05-19T06:40:12Z)
- Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model the human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea of this module is to utilize a trainable graph to aggregate features from the skeleton stream with that of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z)
- SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition.
We argue that the performance of recently proposed skeleton-based action recognition methods is limited by the following factors.
Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture the action associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z)
- Towards Unsupervised Deep Graph Structure Learning [67.58720734177325]
We propose an unsupervised graph structure learning paradigm, where the learned graph topology is optimized by data itself without any external guidance.
Specifically, we generate a learning target from the original data as an "anchor graph", and use a contrastive loss to maximize the agreement between the anchor graph and the learned graph.
arXiv Detail & Related papers (2022-01-17T11:57:29Z)
- Multi Scale Temporal Graph Networks For Skeleton-based Action Recognition [5.970574258839858]
Graph convolutional networks (GCNs) can effectively capture the features of related nodes and improve the performance of the model.
Existing methods based on GCNs have two problems. First, the consistency of temporal and spatial features is ignored for extracting features node by node and frame by frame.
We propose a novel model called Temporal Graph Networks (TGN) for action recognition.
arXiv Detail & Related papers (2020-12-05T08:08:25Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.