Related papers: Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition

URL: http://arxiv.org/abs/2010.07367v2
Date: Mon, 18 Jan 2021 16:15:31 GMT
Title: Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition
Authors: Shijie Li, Jinhui Yi, Yazan Abu Farha and Juergen Gall
Abstract summary: We propose a highly efficient graph convolutional network for action recognition. Compared to other graph convolutional networks, our network requires 86%-93% fewer parameters and reduces the floating point operations by 89%-96%. It provides a much better trade-off between accuracy, memory footprint and processing time, which makes it suitable for robotics applications.
Score: 21.720764076798904
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the advances in capturing 2D or 3D skeleton data, skeleton-based action recognition has received increasing interest over the last years. As skeleton data is commonly represented by graphs, graph convolutional networks have been proposed for this task. While current graph convolutional networks accurately recognize actions, they are too expensive for robotics applications where limited computational resources are available. In this paper, we therefore propose a highly efficient graph convolutional network that addresses the limitations of previous works. This is achieved by a parallel structure that gradually fuses motion and spatial information and by reducing the temporal resolution as early as possible. Furthermore, we explicitly address the issue that human poses can contain errors. To this end, the network first refines the poses before they are further processed to recognize the action. We therefore call the network Pose Refinement Graph Convolutional Network. Compared to other graph convolutional networks, our network requires 86%-93% fewer parameters and reduces the floating point operations by 89%-96% while achieving a comparable accuracy. It therefore provides a much better trade-off between accuracy, memory footprint and processing time, which makes it suitable for robotics applications.
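
Illustrative sketch (not the authors' implementation): the abstract mentions three ingredients, namely spatial information on a skeleton graph, motion information, and an early reduction of the temporal resolution. The pure-Python snippet below shows one generic way each could look on a toy 3-joint skeleton. All function names, the row-normalized neighbour averaging (a common simplification with no learned weights), and the stride of 2 are assumptions made for illustration; none of them are taken from the paper.

```python
def normalized_adjacency(num_joints, bones):
    """Adjacency with self-loops, row-normalized so every row sums to 1."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)]
           for i in range(num_joints)]
    for i, j in bones:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def spatial_aggregate(frame, adj):
    """Replace each joint by the weighted average of itself and its neighbours."""
    dims = len(frame[0])
    joints = range(len(frame))
    return [[sum(adj[i][k] * frame[k][d] for k in joints) for d in range(dims)]
            for i in joints]


def motion(frames):
    """Per-joint displacement between consecutive frames."""
    return [[[c1 - c0 for c0, c1 in zip(j0, j1)] for j0, j1 in zip(f0, f1)]
            for f0, f1 in zip(frames, frames[1:])]


def downsample(frames, stride=2):
    """Reduce the temporal resolution by keeping every `stride`-th frame."""
    return frames[::stride]


# Toy sequence: 4 frames, 3 joints in a chain (0-1-2), 2D coordinates.
bones = [(0, 1), (1, 2)]
frames = [[[0.0 + t, 0.0], [3.0 + t, 0.0], [6.0 + t, 0.0]] for t in range(4)]

adj = normalized_adjacency(3, bones)
spatial = [spatial_aggregate(f, adj) for f in downsample(frames)]
displacements = motion(frames)
```

In this toy setup every joint moves one unit along x per frame, so each displacement is (1, 0), and downsampling with stride 2 halves the number of frames that the spatial step has to process.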

Related papers

Multi-Graph Convolution Network for Pose Forecasting [0.8057006406834467]
We propose a novel approach called the multi-graph convolution network (MGCN) for 3D human pose forecasting. MGCN simultaneously captures spatial and temporal information by introducing an augmented graph for pose sequences. In our evaluation, MGCN outperforms the state-of-the-art in pose prediction.
arXiv Detail & Related papers (2023-04-11T03:59:43Z)
Pushing the Limits of Asynchronous Graph-based Object Detection with Event Cameras [62.70541164894224]
We introduce several architecture choices which allow us to scale the depth and complexity of such models while maintaining low computation. Our method runs 3.7 times faster than a dense graph neural network, taking only 8.4 ms per forward pass.
arXiv Detail & Related papers (2022-11-22T15:14:20Z)
MultiScale MeshGraphNets [65.26373813797409]
We propose two complementary approaches to improve the framework from MeshGraphNets. First, we demonstrate that it is possible to learn accurate surrogate dynamics of a high-resolution system on a much coarser mesh. Second, we introduce a hierarchical approach (MultiScale MeshGraphNets) which passes messages on two different resolutions.
arXiv Detail & Related papers (2022-10-02T20:16:20Z)
Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision. A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive. We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras. Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement. In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
3D Human Pose Regression using Graph Convolutional Network [68.8204255655161]
We propose a graph convolutional network named PoseGraphNet for 3D human pose regression from 2D poses. Our model's performance is close to the state-of-the-art, but with far fewer parameters.
arXiv Detail & Related papers (2021-05-21T14:41:31Z)
ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs. We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z)
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition. Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks [8.460826851547294]
Efficient graph analysis using modern machine learning is receiving a growing level of attention. Deep learning approaches often operate over the entire adjacency matrix. It is desirable to identify efficient measures to reduce both run-time and memory requirements.
arXiv Detail & Related papers (2020-10-23T19:47:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
