Fusion-GCN: Multimodal Action Recognition using Graph Convolutional
Networks
- URL: http://arxiv.org/abs/2109.12946v1
- Date: Mon, 27 Sep 2021 10:52:33 GMT
- Title: Fusion-GCN: Multimodal Action Recognition using Graph Convolutional
Networks
- Authors: Michael Duhme, Raphael Memmesheimer, Dietrich Paulus
- Abstract summary: Fusion-GCN is an approach for multimodal action recognition using Graph Convolutional Networks (GCNs).
We integrate various sensor data modalities into a graph that is trained using a GCN model for multi-modal action recognition.
- Score: 0.5801044612920815
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we present Fusion-GCN, an approach for multimodal action
recognition using Graph Convolutional Networks (GCNs). Action recognition
methods based around GCNs recently yielded state-of-the-art performance for
skeleton-based action recognition. With Fusion-GCN, we propose to integrate
various sensor data modalities into a graph that is trained using a GCN model
for multi-modal action recognition. Additional sensor measurements are
incorporated into the graph representation, either on a channel dimension
(introducing additional node attributes) or spatial dimension (introducing new
nodes). Fusion-GCN was evaluated on two publicly available datasets,
UTD-MHAD and MMACT, and demonstrates flexible fusion of RGB
sequences, inertial measurements, and skeleton sequences. Our approach achieves
comparable results on the UTD-MHAD dataset and improves the baseline on the
large-scale MMACT dataset by a significant margin of up to 12.37% (F1-Measure)
with the fusion of skeleton estimates and accelerometer measurements.
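
The two fusion strategies described above can be made concrete with a short sketch. The following Python/NumPy snippet is a minimal illustration under assumed tensor shapes (frames T, joints V, channels C) and is not the authors' implementation; in particular, how the extra sensor node would be wired into the adjacency matrix of a GCN is an assumption for the example.

```python
# Illustrative sketch (not the paper's code): two ways to fuse accelerometer
# data into a skeleton graph, following the fusion options described above.
import numpy as np

T, V, C = 64, 20, 3          # frames, skeleton joints, coordinate channels (assumed)
acc_dim = 3                  # accelerometer axes (x, y, z)

skeleton = np.random.randn(T, V, C)     # skeleton joint positions per frame
accel = np.random.randn(T, acc_dim)     # one IMU reading per frame

# Channel-dimension fusion: append sensor values as extra node attributes.
# Every joint receives the same accelerometer reading for that frame,
# growing the feature dimension from C to C + acc_dim.
accel_per_joint = np.repeat(accel[:, None, :], V, axis=1)   # (T, V, acc_dim)
fused_channels = np.concatenate([skeleton, accel_per_joint], axis=-1)
print(fused_channels.shape)   # (64, 20, 6)

# Spatial-dimension fusion: add the IMU as a new graph node.
# The sensor becomes an extra vertex (V + 1); its features are zero-padded
# to match the joint feature size.
accel_node = np.zeros((T, 1, C))
accel_node[:, 0, :acc_dim] = accel
fused_nodes = np.concatenate([skeleton, accel_node], axis=1)
print(fused_nodes.shape)      # (64, 21, 3)
```

Channel-dimension fusion keeps the graph topology fixed and only widens the node features, whereas spatial-dimension fusion changes the graph structure and therefore also requires extending the adjacency matrix used by the GCN (for example, linking the sensor node to a central joint; that wiring choice is an assumption, not taken from the paper).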
Related papers
- MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition [0.6442618560991484]
We propose an innovative Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation (MK-SGN) to address this issue.
By merging the energy efficiency of Spiking Neural Networks (SNNs) with the graph representation capability of GCNs, the proposed MK-SGN reduces energy consumption while maintaining recognition accuracy.
arXiv Detail & Related papers (2024-04-16T01:41:22Z) - DGNN: Decoupled Graph Neural Networks with Structural Consistency
between Attribute and Graph Embedding Representations [62.04558318166396]
Graph neural networks (GNNs) demonstrate a robust capability for representation learning on graphs with complex structures.
A novel GNN framework, dubbed Decoupled Graph Neural Networks (DGNN), is introduced to obtain a more comprehensive embedding representation of nodes.
Experimental results on several graph benchmark datasets verify DGNN's superiority in the node classification task.
arXiv Detail & Related papers (2024-01-28T06:43:13Z) - Interweaved Graph and Attention Network for 3D Human Pose Estimation [15.699524854176644]
We propose a novel Interweaved Graph and Attention Network (IGANet).
IGANet allows bidirectional communication between graph convolutional networks (GCNs) and attention mechanisms.
We introduce an IGA module, where attention mechanisms are provided with local information from GCNs and GCNs are injected with global information from attention.
arXiv Detail & Related papers (2023-04-27T09:21:15Z) - Pose-Guided Graph Convolutional Networks for Skeleton-Based Action
Recognition [32.07659338674024]
Graph convolutional networks (GCNs) can model human body skeletons as spatial and temporal graphs.
In this work, we propose pose-guided GCN (PG-GCN), a multi-modal framework for high-performance human action recognition.
The core idea is to utilize a trainable graph to aggregate features from the skeleton stream with those of the pose stream, which leads to a network with more robust feature representation ability.
arXiv Detail & Related papers (2022-10-10T02:08:49Z) - Mixed Graph Contrastive Network for Semi-Supervised Node Classification [63.924129159538076]
We propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN).
In our method, we improve the discriminative capability of the latent embeddings by an unperturbed augmentation strategy and a correlation reduction mechanism.
By combining the two settings, we extract rich supervision information from both the abundant nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised
Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GCN can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z) - PR-GCN: A Deep Graph Convolutional Network with Point Refinement for 6D
Pose Estimation [24.06845422193827]
RGB-D based 6D pose estimation has recently achieved remarkable progress, but still suffers from two major limitations.
This paper proposes a novel deep learning approach, namely Graph Convolutional Network with Point Refinement (PR-GCN).
It first introduces the Point Refinement Network (PRN) to polish 3D point clouds, recovering missing parts with noise removed.
Subsequently, the Multi-Modal Fusion Graph Convolutional Network (MMF-GCN) is presented to strengthen RGB-D combination.
arXiv Detail & Related papers (2021-08-23T03:53:34Z) - Graph Convolutional Embeddings for Recommender Systems [67.5973695167534]
We propose a graph convolutional embedding layer for N-partite graphs that processes user-item-context interactions.
arXiv Detail & Related papers (2021-03-05T10:46:16Z) - Multi Scale Temporal Graph Networks For Skeleton-based Action
Recognition [5.970574258839858]
Graph convolutional networks (GCNs) can effectively capture the features of related nodes and improve the performance of the model.
Existing GCN-based methods have two problems. First, the consistency of temporal and spatial features is ignored because features are extracted node by node and frame by frame.
We propose a novel model called Temporal Graph Networks (TGN) for action recognition.
arXiv Detail & Related papers (2020-12-05T08:08:25Z) - Spatio-Temporal Inception Graph Convolutional Networks for
Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z) - Temporal Attention-Augmented Graph Convolutional Network for Efficient
Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z) - Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action
Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model the irregular data.
We present a novel spatial-temporal GCN architecture defined via Poincaré geometry.
We evaluate our method on two of the current largest-scale 3D datasets.
arXiv Detail & Related papers (2020-07-30T18:23:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.