Spatio-temporal Tendency Reasoning for Human Body Pose and Shape
Estimation from Videos
- URL: http://arxiv.org/abs/2210.03659v2
- Date: Mon, 10 Oct 2022 03:24:48 GMT
- Title: Spatio-temporal Tendency Reasoning for Human Body Pose and Shape
Estimation from Videos
- Authors: Boyang Zhang, SuPing Wu, Hu Cao, Kehua Ma, Pan Li, Lei Lin
- Abstract summary: We present a spatio-temporal tendency reasoning (STR)
network for recovering human body pose and shape from videos.
Our STR aims to learn accurate and natural motion sequences in an
unconstrained environment.
Our STR remains competitive with the state-of-the-art on three datasets.
- Score: 10.50306784245168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a spatio-temporal tendency reasoning (STR) network
for recovering human body pose and shape from videos. Previous approaches have
focused on extending 3D human datasets and on temporal-based learning to
improve accuracy and temporal smoothness. In contrast, our STR aims to learn
accurate and natural motion sequences in an unconstrained environment through
temporal and spatial tendencies, and to fully exploit the spatio-temporal
features of existing video data. To this end, our STR learns feature
representations in the temporal and spatial dimensions separately, yielding a
more robust representation of spatio-temporal features. More specifically,
for efficient temporal modeling, we first propose a temporal tendency reasoning
(TTR) module. TTR constructs a hierarchical residual-connection representation
along the time dimension of a video sequence to reason about the tendencies of
temporal sequences and to retain effective propagation of human information.
Meanwhile, to enhance the spatial representation, we design a spatial tendency
enhancing (STE) module that learns to excite spatially sensitive
time-frequency-domain features in human motion representations. Finally, we
introduce integration strategies to fuse and refine the spatio-temporal feature
representations. Extensive experiments on large-scale publicly available
datasets show that our STR remains competitive with the state-of-the-art on
three datasets. Our code is available at
https://github.com/Changboyang/STR.git.
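
To make the abstract's two modules concrete, below is a minimal, hypothetical
PyTorch sketch of the two ideas it describes: a hierarchical residual chain
along the time dimension in the spirit of TTR, and a time-frequency channel
excitation in the spirit of STE. All class names, shapes, and hyper-parameters
are illustrative assumptions, not the authors' implementation (see the linked
repository for that).

```python
# Hypothetical sketch of the two ideas in the abstract; names, shapes, and
# hyper-parameters are assumptions, not the authors' code.
import torch
import torch.nn as nn


class TemporalTendencyReasoning(nn.Module):
    """Split sequence features into channel groups and chain them with
    residual connections, so later groups reason over earlier tendencies."""

    def __init__(self, dim: int, groups: int = 4):
        super().__init__()
        assert dim % groups == 0
        self.groups = groups
        gd = dim // groups
        # One temporal conv per group; kernel size 3 mixes neighboring frames.
        self.convs = nn.ModuleList(
            nn.Conv1d(gd, gd, kernel_size=3, padding=1) for _ in range(groups)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, dim) -> (batch, dim, frames) for Conv1d.
        x = x.transpose(1, 2)
        chunks = x.chunk(self.groups, dim=1)
        out, prev = [], 0
        for conv, chunk in zip(self.convs, chunks):
            prev = conv(chunk + prev)  # hierarchical residual chain
            out.append(prev)
        return torch.cat(out, dim=1).transpose(1, 2)


class SpatialTendencyEnhancing(nn.Module):
    """Gate channels by their temporal frequency content (squeeze-excite
    style), emphasizing motion-sensitive features."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, dim); FFT magnitude along time summarizes how
        # strongly each channel varies across frames.
        freq = torch.fft.rfft(x, dim=1).abs().mean(dim=1)  # (batch, dim)
        return x * self.gate(freq).unsqueeze(1)


if __name__ == "__main__":
    feats = torch.randn(2, 16, 256)  # a batch of 16-frame clips
    fused = SpatialTendencyEnhancing(256)(TemporalTendencyReasoning(256)(feats))
    print(fused.shape)  # torch.Size([2, 16, 256])
```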
Related papers
- STGFormer: Spatio-Temporal GraphFormer for 3D Human Pose Estimation in Video [7.345621536750547]
This paper presents a graph-based framework for 3D human pose estimation in video.
Specifically, we develop a graph-based attention mechanism, integrating graph information directly into the respective attention layers.
We demonstrate that our method achieves state-of-the-art performance in 3D human pose estimation.
arXiv Detail & Related papers (2024-07-14T06:45:27Z) - Jointly spatial-temporal representation learning for individual
trajectories [30.318791393724524]
This paper proposes a spatial-temporal joint representation learning method (ST-GraphRL) to formalize learnable spatial-temporal dependencies into trajectory representations.
Tested on three real-world human mobility datasets, the proposed ST-GraphRL outperformed all the baseline models in predicting movement spatial-temporal distributions and preserving trajectory similarity with high spatial-temporal correlations.
arXiv Detail & Related papers (2023-12-07T05:27:24Z) - Triplet Attention Transformer for Spatiotemporal Predictive Learning [9.059462850026216]
We propose an innovative triplet attention transformer designed to capture both inter-frame dynamics and intra-frame static features.
The model incorporates the Triplet Attention Module (TAM), which replaces traditional recurrent units by exploring self-attention mechanisms in temporal, spatial, and channel dimensions.
arXiv Detail & Related papers (2023-10-28T12:49:33Z) - Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z) - Deeply-Coupled Convolution-Transformer with Spatial-temporal
Complementary Learning for Video-based Person Re-identification [91.56939957189505]
We propose a novel spatial-temporal complementary learning framework named Deeply-Coupled Convolution-Transformer (DCCT) for high-performance video-based person Re-ID.
Our framework attains better performance than most state-of-the-art methods.
arXiv Detail & Related papers (2023-04-27T12:16:44Z) - STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond [78.129039340528]
We propose a spatiotemporal-aware unit (STAU) for video prediction and beyond.
Our STAU can outperform other methods on all tasks in terms of performance and efficiency.
arXiv Detail & Related papers (2022-04-20T13:42:51Z) - Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-point estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - TPCN: Temporal Point Cloud Networks for Motion Forecasting [47.829152433166016]
We propose a novel framework with joint spatial and temporal learning for trajectory prediction.
In the spatial dimension, agents can be viewed as an unordered point set, and thus it is straightforward to apply point cloud learning techniques to model agents' locations.
Experiments on the Argoverse motion forecasting benchmark show that our approach achieves state-of-the-art results.
arXiv Detail & Related papers (2021-03-04T14:44:32Z) - A Spatial-Temporal Attentive Network with Spatial Continuity for
Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we build a joint feature sequence from the sequence and instant state information so that the generated trajectories maintain spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z) - A Graph Attention Spatio-temporal Convolutional Network for 3D Human
Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in the human skeleton by modeling local and global spatial information via attention mechanisms.
Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper-body estimation, and achieves competitive performance on 2D-to-3D video pose estimation; a toy sketch of the local-global attention idea follows this list.
arXiv Detail & Related papers (2020-03-11T14:54:40Z)
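
As an illustration of the local-global skeleton attention referenced in the
last entry, here is a small, hypothetical PyTorch sketch: one attention layer
over joints that averages an adjacency-masked "local" branch with an
unconstrained "global" branch. The class name, toy skeleton, and design
choices are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical local-global attention over skeleton joints; everything here
# is an illustrative assumption, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalGlobalJointAttention(nn.Module):
    """One attention layer over joints: a 'local' branch restricted to
    physically connected joints and a 'global' branch over all joints."""

    def __init__(self, dim: int, adjacency: torch.Tensor):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Boolean (joints, joints) mask; self-loops keep every row non-empty.
        eye = torch.eye(adjacency.size(0), dtype=torch.bool)
        self.register_buffer("adj", adjacency.bool() | eye)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, joints, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        local = scores.masked_fill(~self.adj, float("-inf"))
        # Average the adjacency-masked and unconstrained attention maps.
        out = 0.5 * (F.softmax(local, dim=-1) + F.softmax(scores, dim=-1)) @ v
        return x + self.proj(out)  # residual connection


if __name__ == "__main__":
    # Toy 5-joint skeleton: hip-spine-neck-head plus one arm off the neck.
    edges = [(0, 1), (1, 2), (2, 3), (2, 4)]
    adj = torch.zeros(5, 5)
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1
    layer = LocalGlobalJointAttention(dim=32, adjacency=adj)
    print(layer(torch.randn(2, 5, 32)).shape)  # torch.Size([2, 5, 32])
```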