Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction
- URL: http://arxiv.org/abs/2304.03532v2
- Date: Mon, 7 Aug 2023 07:25:34 GMT
- Title: Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction
- Authors: Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu
- Abstract summary: Graph Convolutional Networks (GCNs) have been widely used in human motion prediction, but their performance remains unsatisfactory.
MLP-Mixer has been leveraged for human motion prediction as a promising alternative to GCNs.
By incorporating graph guidance, our Graph-Guided Mixer can effectively capture and utilize the specific connectivity patterns within the human skeleton's graph representation.
- Score: 14.988322340164391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, Graph Convolutional Networks (GCNs) have been widely used in
human motion prediction, but their performance remains unsatisfactory.
Recently, MLP-Mixer, initially developed for vision tasks, has been leveraged
for human motion prediction as a promising alternative to GCNs, achieving both
better performance and better efficiency than GCNs. Unlike GCNs, which can
explicitly capture the human skeleton's bone-joint structure by representing it
as a graph with edges and nodes, MLP-Mixer relies on fully connected layers and
thus cannot explicitly model such graph structure. To overcome this limitation
of MLP-Mixer, we propose \textit{Graph-Guided Mixer}, a novel
approach that equips the original MLP-Mixer architecture with the capability to
model graph structure. By incorporating graph guidance, our
\textit{Graph-Guided Mixer} can effectively capture and utilize the specific
connectivity patterns within the human skeleton's graph representation. In this
paper, first we uncover a theoretical connection between MLP-Mixer and GCN that
is unexplored in existing research. Building on this theoretical connection,
next we present our proposed \textit{Graph-Guided Mixer}, explaining how the
original MLP-Mixer architecture is reinvented to incorporate guidance from
graph structure. Then we conduct an extensive evaluation on the Human3.6M,
AMASS, and 3DPW datasets, which shows that our method achieves state-of-the-art
performance.
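The core idea can be illustrated with a short sketch: an MLP-Mixer style block whose token-mixing step over joints is gated by the skeleton's adjacency matrix, so that joint-to-joint mixing follows the bone-joint graph. This is a minimal illustration under assumptions, not the authors' implementation; the module name, the fixed normalized adjacency, and the element-wise gating of the mixing weights are all assumed for exposition.

```python
# Minimal sketch (not the paper's code): graph-guided token mixing for a
# skeleton with `num_joints` joints. The skeleton adjacency gates a learnable
# joint-to-joint mixing matrix; channel mixing stays a plain per-joint MLP.
import torch
import torch.nn as nn


class GraphGuidedMixerBlock(nn.Module):
    def __init__(self, num_joints: int, channels: int, adjacency: torch.Tensor):
        super().__init__()
        # Symmetrically normalized adjacency with self-loops (GCN-style), fixed per skeleton.
        a = adjacency + torch.eye(num_joints)
        d = a.sum(dim=-1).clamp(min=1e-6).pow(-0.5)
        self.register_buffer("a_hat", d[:, None] * a * d[None, :])

        self.norm1 = nn.LayerNorm(channels)
        self.norm2 = nn.LayerNorm(channels)
        # Token (joint) mixing: learnable joint-to-joint weights, masked by the graph.
        self.token_weight = nn.Parameter(0.02 * torch.randn(num_joints, num_joints))
        # Channel mixing: the usual Mixer MLP applied to each joint independently.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, 2 * channels),
            nn.GELU(),
            nn.Linear(2 * channels, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_joints, channels), e.g. per-joint features of a pose sequence.
        mix = self.a_hat * self.token_weight                      # graph-guided mixing matrix
        x = x + torch.einsum("jk,bkc->bjc", mix, self.norm1(x))   # token mixing over joints
        x = x + self.channel_mlp(self.norm2(x))                   # channel mixing per joint
        return x
```

If the learnable joint-to-joint weights are frozen to all ones, the token-mixing step collapses to a single normalized graph convolution, which hints at the kind of MLP-Mixer/GCN connection the abstract refers to; conversely, dropping the adjacency gate recovers an unconstrained Mixer-style token MLP.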
Related papers
- Graph Neural Machine: A New Model for Learning with Tabular Data [25.339493426758903]
Graph neural networks (GNNs) have recently become the standard tool for performing machine learning tasks on graphs.
In this work, we show that an MLP is equivalent to an asynchronous message passing GNN model.
We then propose a new machine learning model for tabular data, the so-called Graph Neural Machine (GNM).
arXiv Detail & Related papers (2024-02-05T10:22:15Z) - Graph Transformer GANs with Graph Masked Modeling for Architectural
Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z) - M3C: A Framework towards Convergent, Flexible, and Unsupervised Learning
of Mixture Graph Matching and Clustering [57.947071423091415]
We introduce Minorize-Maximization Matching and Clustering (M3C), a learning-free algorithm that guarantees theoretical convergence.
We develop UM3C, an unsupervised model that incorporates novel edge-wise affinity learning and pseudo label selection.
Our method outperforms state-of-the-art graph matching and mixture graph matching and clustering approaches in both accuracy and efficiency.
arXiv Detail & Related papers (2023-10-27T19:40:34Z) - Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation
from GNNs to MLPs [22.541655587228203]
Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic.
We propose a Prototype-Guided Knowledge Distillation (PGKD) method, which does not require graph edges (edge-free) yet learns structure-aware MLPs.
arXiv Detail & Related papers (2023-03-24T02:28:55Z) - Graph Mixer Networks [0.0]
We propose the Graph Mixer Network, also referred to as Graph Nasreddin Nets (GNasNets), a framework that incorporates the principles of MLP-Mixers for graph-structured data.
Using a PNA model with multiple aggregators, our proposed GMN has demonstrated improved performance compared to Graph Transformers.
arXiv Detail & Related papers (2023-01-29T17:03:00Z) - A Generalization of ViT/MLP-Mixer to Graphs [32.86160915431453]
We introduce a new class of GNNs called Graph ViT/MLP-Mixer.
They capture long-range dependency and mitigate the issue of over-squashing.
They offer better speed and memory efficiency, with complexity linear in the number of nodes and edges.
arXiv Detail & Related papers (2022-12-27T03:27:46Z) - MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP
Initialization [51.76758674012744]
Training graph neural networks (GNNs) on large graphs is complex and extremely time consuming.
We propose an embarrassingly simple, yet hugely effective method for GNN training acceleration, called MLPInit.
arXiv Detail & Related papers (2022-09-30T21:33:51Z) - Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, consisting in forecasting future body poses from historically observed sequences.
We show that the performance of these approaches can be surpassed by a light-weight and purely MLP-based architecture with only 0.14M parameters.
An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z) - GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation [68.65764751482774]
GraphMLP is a global-local-graphical unified architecture for 3D human pose estimation.
It incorporates the graph structure of human bodies into a model to meet the domain-specific demand of the 3D human pose.
It can be extended to model complex temporal dynamics in a simple way with negligible computational cost gains in the sequence length.
arXiv Detail & Related papers (2022-06-13T18:59:31Z) - MLP-Mixer: An all-MLP Architecture for Vision [93.16118698071993]
We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs).
Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models.
arXiv Detail & Related papers (2021-05-04T16:17:21Z)
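For reference, the Mixer layer introduced in that paper alternates a token-mixing MLP (applied across patches) with a channel-mixing MLP (applied per patch), each wrapped in LayerNorm and a residual connection; a minimal sketch, with hidden widths and names chosen for illustration, looks like this:

```python
# Minimal sketch of a single Mixer layer: token mixing across patches,
# then channel mixing per patch, each with LayerNorm and a residual.
# Hidden widths and class/parameter names are illustrative.
import torch
import torch.nn as nn


class MixerLayer(nn.Module):
    def __init__(self, num_patches: int, channels: int,
                 tokens_hidden: int, channels_hidden: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, tokens_hidden),
            nn.GELU(),
            nn.Linear(tokens_hidden, num_patches),
        )
        self.norm2 = nn.LayerNorm(channels)
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels_hidden),
            nn.GELU(),
            nn.Linear(channels_hidden, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, channels)
        y = self.norm1(x).transpose(1, 2)           # (batch, channels, num_patches)
        x = x + self.token_mlp(y).transpose(1, 2)   # token mixing across patches
        x = x + self.channel_mlp(self.norm2(x))     # channel mixing per patch
        return x
```

The only ingredients are linear layers, GELU, LayerNorm, and a transpose between the patch and channel axes, which is what makes the architecture attractive as a lightweight alternative to attention or graph convolution.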