An Attractor-Guided Neural Networks for Skeleton-Based Human Motion Prediction
- URL: http://arxiv.org/abs/2105.09711v1
- Date: Thu, 20 May 2021 12:51:39 GMT
- Title: An Attractor-Guided Neural Networks for Skeleton-Based Human Motion Prediction
- Authors: Pengxiang Ding and Jianqin Yin
- Abstract summary: Joint relation modeling is a crucial component in human motion prediction.
We learn a medium, called the balance attractor (BA), from spatiotemporal features to characterize the global motion features.
Through the BA, all joints are related synchronously, and thus the global coordination of all joints can be better learned.
- Score: 0.4568777157687961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint relation modeling is a crucial component in human motion prediction.
Most existing methods tend to design skeletal-based graphs to build the
relations among joints, where local interactions between joint pairs are well
learned. However, the global coordination of all joints, which reflects human
motion's balance property, is usually weakened because it is learned from part
to whole progressively and asynchronously. Thus, the final predicted motions
are sometimes unnatural. To tackle this issue, we learn a medium, called
balance attractor (BA), from the spatiotemporal features of motion to
characterize the global motion features, which is subsequently used to build
new joint relations. Through the BA, all joints are related synchronously, and
thus the global coordination of all joints can be better learned. Based on the
BA, we propose our framework, referred to as the Attractor-Guided Neural Network,
mainly including Attractor-Based Joint Relation Extractor (AJRE) and
Multi-timescale Dynamics Extractor (MTDE). The AJRE mainly includes Global
Coordination Extractor (GCE) and Local Interaction Extractor (LIE). The former
presents the global coordination of all joints, and the latter encodes local
interactions between joint pairs. The MTDE is designed to extract dynamic
information from raw position information for effective prediction. Extensive
experiments show that the proposed framework outperforms state-of-the-art
methods in both short- and long-term prediction on H3.6M, CMU-Mocap, and 3DPW.
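The abstract names the framework's components without their internals. Below is a minimal PyTorch sketch of how the described pieces could fit together; the layer sizes, the mean-pooling used to form the balance attractor, the fixed chain adjacency, and the additive AJRE-style fusion are all assumptions of this sketch, not the paper's actual design.

```python
# Hedged sketch only: module shapes, pooling, and adjacency are assumed.
import torch
import torch.nn as nn

class GCE(nn.Module):
    """Global Coordination Extractor: pool all joints into a single
    balance-attractor (BA) vector, then relate every joint to it
    synchronously (an assumed realisation of the paper's description)."""
    def __init__(self, dim):
        super().__init__()
        self.to_ba = nn.Linear(dim, dim)
        self.relate = nn.Linear(2 * dim, dim)

    def forward(self, feats):                        # feats: (B, J, D)
        ba = self.to_ba(feats.mean(dim=1, keepdim=True))    # (B, 1, D)
        ba = ba.expand(-1, feats.size(1), -1)               # (B, J, D)
        return self.relate(torch.cat([feats, ba], dim=-1))

class LIE(nn.Module):
    """Local Interaction Extractor: one graph-conv step over a fixed,
    row-normalised skeletal adjacency (again, an assumed stand-in)."""
    def __init__(self, dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)             # adj: (J, J)
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats):                        # feats: (B, J, D)
        return self.proj(self.adj @ feats)

class MTDE(nn.Module):
    """Multi-timescale Dynamics Extractor: parallel temporal convolutions
    with different kernel sizes over raw per-joint trajectories."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernels)

    def forward(self, x):                            # x: (B*J, C, T)
        return torch.cat([b(x) for b in self.branches], dim=1)

# Toy usage: 22 joints, 64-dim features, a simple chain adjacency.
J, D = 22, 64
adj = torch.eye(J) + torch.diag(torch.ones(J - 1), 1) + torch.diag(torch.ones(J - 1), -1)
adj = adj / adj.sum(dim=1, keepdim=True)
feats = torch.randn(8, J, D)
fused = GCE(D)(feats) + LIE(D, adj)(feats)           # AJRE-style fusion (assumed)
```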
Related papers
- Text-Derived Relational Graph-Enhanced Network for Skeleton-Based Action Segmentation [14.707224594220264]
We propose a Text-Derived Relational Graph-Enhanced Network (TRG-Net) to enhance both modeling and supervision.
For modeling, the Dynamic Spatio-Temporal Fusion Modeling (DSFM) method incorporates Text-Derived Joint Graphs (JGT) with channel adaptation.
For supervision, the Absolute-Relative Inter-Class Supervision (ARIS) method employs contrastive learning between action features and text embeddings to regularize the absolute classes.
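As a rough illustration of the supervision described above, the sketch below pulls each pooled action feature toward the text embedding of its class with an InfoNCE-style loss. The temperature value and the cross-entropy formulation are assumptions for illustration, not TRG-Net's exact absolute/relative objective.

```python
# Illustrative only: the InfoNCE form and temperature are assumed.
import torch
import torch.nn.functional as F

def class_text_contrastive(action_feats, text_embeds, labels, tau=0.07):
    """action_feats: (B, D) pooled action features
    text_embeds:  (K, D) one text embedding per action class
    labels:       (B,) ground-truth class indices"""
    a = F.normalize(action_feats, dim=-1)
    t = F.normalize(text_embeds, dim=-1)
    logits = a @ t.T / tau        # (B, K) similarity to every class text
    return F.cross_entropy(logits, labels)
```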
arXiv Detail & Related papers (2025-03-19T11:38:14Z)
- ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation [25.777159581915658]
ChainHOI is a novel approach for text-driven human-object interaction generation.
It explicitly models interactions at both the joint and kinematic chain levels.
arXiv Detail & Related papers (2025-03-17T12:55:34Z)
- Relation Learning and Aggregate-attention for Multi-person Motion Prediction [13.052342503276936]
Multi-person motion prediction considers not just the skeleton structures and human trajectories but also the interactions between individuals.
Previous methods often overlook that the joint relations within an individual (intra-relation) and the interactions among groups (inter-relation) are distinct types of representations.
We introduce a new collaborative framework for multi-person motion prediction that explicitly models these relations.
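One plausible reading of the intra-/inter-relation split, sketched with off-the-shelf attention (self-attention over a person's joints for intra-relation, attention over per-person summaries for inter-relation); the paper's actual collaborative framework will differ.

```python
# Hedged sketch: the attention layout and fusion are assumptions.
import torch
import torch.nn as nn

class IntraInterBlock(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                 # x: (B, P, J, D) people x joints
        B, P, J, D = x.shape
        h = x.reshape(B * P, J, D)
        h, _ = self.intra(h, h, h)        # joint relations within a person
        person = h.mean(dim=1).reshape(B, P, D)
        person, _ = self.inter(person, person, person)   # between people
        return h.reshape(B, P, J, D) + person[:, :, None, :]

x = torch.randn(2, 3, 22, 64)     # 2 scenes, 3 people, 22 joints each
out = IntraInterBlock(64)(x)      # (2, 3, 22, 64)
```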
arXiv Detail & Related papers (2024-11-06T07:48:30Z)
- Towards more realistic human motion prediction with attention to motion coordination [7.243632426715939]
We propose a novel joint relation modeling module, the Comprehensive Joint Relation Extractor (CJRE), to combine motion coordination with the local interactions between joint pairs in a unified manner.
The proposed framework outperforms state-of-the-art methods in both short- and long-term predictions on H3.6M, CMU-Mocap, and 3DPW.
arXiv Detail & Related papers (2024-04-04T16:48:40Z)
- Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition [22.538114033191313]
We propose a mutual excitation graph convolutional network (me-GCN) by stacking mutual excitation graph convolution layers.
The me-GC layers learn mutual information at each layer and each stage of the graph convolution operations.
The proposed me-GCN outperforms state-of-the-art GCN-based and Transformer-based methods.
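A hedged sketch of what "mutual excitation" could mean in code: two entity streams (e.g., the two hands) run parallel graph convolutions and gate each other's features. The sigmoid gating over a mean-pooled summary is this sketch's assumption, not the paper's me-GC definition.

```python
# Hedged sketch: the gating design is assumed, not the paper's me-GC.
import torch
import torch.nn as nn

class MutualExcitationGC(nn.Module):
    def __init__(self, dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)     # (J, J) skeletal adjacency
        self.gc_a = nn.Linear(dim, dim)
        self.gc_b = nn.Linear(dim, dim)
        self.gate_a = nn.Linear(dim, dim)
        self.gate_b = nn.Linear(dim, dim)

    def forward(self, xa, xb):               # each: (B, J, D)
        ha = self.gc_a(self.adj @ xa)        # graph conv per stream
        hb = self.gc_b(self.adj @ xb)
        # each stream is excited (gated) by the other's global summary
        ga = torch.sigmoid(self.gate_a(hb.mean(dim=1, keepdim=True)))
        gb = torch.sigmoid(self.gate_b(ha.mean(dim=1, keepdim=True)))
        return ha * ga, hb * gb
```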
arXiv Detail & Related papers (2024-02-04T10:00:00Z)
- APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation [86.29366168836141]
We propose a graph-driven framework, named Adaptive and Personalized Graph Learning for Sequential Recommendation (APGL4SR).
APGL4SR incorporates adaptive and personalized global collaborative information into sequential recommendation systems.
As a generic framework, APGL4SR outperforms other baselines by significant margins.
arXiv Detail & Related papers (2023-11-06T01:33:24Z)
- Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction [121.65152276851619]
We show that semantic correlations between relations are inherently edge-level and entity-independent.
We propose a novel subgraph-based method, namely TACO, to model Topology-Aware COrrelations between relations.
To further exploit the potential of RCN, we propose the Complete Common Neighbor induced subgraph.
arXiv Detail & Related papers (2023-09-20T08:11:58Z)
- Joint-Relation Transformer for Multi-Person Motion Prediction [79.08243886832601]
We propose the Joint-Relation Transformer to enhance interaction modeling.
Our method achieves a 13.4% improvement in 900ms VIM on 3DPW-SoMoF/RC and 17.8%/12.0% improvements in 3s MPJPE.
arXiv Detail & Related papers (2023-08-09T09:02:47Z)
- Kinematics Modeling Network for Video-based Human Pose Estimation [9.506011491028891]
Estimating human poses from videos is critical in human-computer interaction.
Joints cooperate rather than move independently during human movement.
We propose a plug-and-play kinematics modeling module (KMM) to explicitly model temporal correlations between joints.
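A minimal sketch of a plug-and-play module that "explicitly models temporal correlations between joints": the current frame's joints attend to the previous frame's joints. The single-frame lookback and single-head attention are simplifications of this sketch, not KMM's actual design.

```python
# Hedged sketch: lookback depth and attention form are assumed.
import torch
import torch.nn as nn

class KinematicsModeling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, cur, prev):          # each: (B, J, D)
        attn = torch.softmax(
            self.q(cur) @ self.k(prev).transpose(1, 2) / cur.size(-1) ** 0.5,
            dim=-1,
        )                                  # (B, J, J) joint-to-joint weights
        return cur + attn @ self.v(prev)   # fuse previous-frame context
```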
arXiv Detail & Related papers (2022-07-22T09:37:48Z)
- Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning [23.051184131833292]
We propose a new transformer model for the task of unsupervised learning of skeleton motion sequences.
The proposed model successfully learns local dynamics of the joints and captures global context from the motion sequences.
arXiv Detail & Related papers (2022-07-13T10:18:07Z)
- Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes global correspondence modeling (GCM) and local correspondence modeling (LCM).
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z)
- SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition.
We argue that the performance of recently proposed skeleton-based action recognition methods is limited by several factors.
Inspired by recent attention mechanisms, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z)
- Pose And Joint-Aware Action Recognition [87.4780883700755]
We present a new model for joint-based action recognition, which first extracts motion features from each joint separately through a shared motion encoder.
Our joint selector module re-weights the joint information to select the most discriminative joints for the task.
We show large improvements over the current state-of-the-art joint-based approaches on the JHMDB, HMDB, Charades, and AVA action recognition datasets.
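The joint-selector step might be realised as a score-and-pool operation like the sketch below; the two-layer scorer and the softmax over joints are assumptions of the sketch, not the paper's exact module.

```python
# Hedged sketch: scorer architecture and normalisation are assumed.
import torch
import torch.nn as nn

class JointSelector(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim // 2),
                                   nn.ReLU(),
                                   nn.Linear(dim // 2, 1))

    def forward(self, joint_feats):        # (B, J, D) per-joint features
        w = torch.softmax(self.score(joint_feats), dim=1)   # (B, J, 1)
        return (w * joint_feats).sum(dim=1)                 # (B, D) pooled
```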
arXiv Detail & Related papers (2020-10-16T04:43:34Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.