Forecasting Characteristic 3D Poses of Human Actions
- URL: http://arxiv.org/abs/2011.15079v2
- Date: Wed, 7 Apr 2021 17:58:08 GMT
- Title: Forecasting Characteristic 3D Poses of Human Actions
- Authors: Christian Diller, Thomas Funkhouser, Angela Dai
- Abstract summary: We propose the task of forecasting characteristic 3D poses: from a monocular video observation of a person, predict a future 3D pose of that person in a likely action-defining, characteristic pose.
We define a semantically meaningful pose prediction task that decouples the predicted pose from time, taking inspiration from goal-directed behavior.
Our experiments with this dataset suggest that our proposed probabilistic approach outperforms state-of-the-art methods by 22% on average.
- Score: 24.186058965796157
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose the task of forecasting characteristic 3D poses: from a monocular
video observation of a person, to predict a future 3D pose of that person in a
likely action-defining, characteristic pose - for instance, from observing a
person reaching for a banana, predict the pose of the person eating the banana.
Prior work on human motion prediction estimates future poses at fixed time
intervals. Although easy to define, this frame-by-frame formulation confounds
temporal and intentional aspects of human action. Instead, we define a
semantically meaningful pose prediction task that decouples the predicted pose
from time, taking inspiration from goal-directed behavior. To predict
characteristic poses, we propose a probabilistic approach that first models the
possible multi-modality in the distribution of likely characteristic poses. It
then samples future pose hypotheses from the predicted distribution in an
autoregressive fashion to model dependencies between joints and finally
optimizes the resulting pose with bone length and angle constraints. To
evaluate our method, we construct a dataset of manually annotated
characteristic 3D poses. Our experiments with this dataset suggest that our
proposed probabilistic approach outperforms state-of-the-art methods by 22% on
average.
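To make the three-step recipe described in the abstract concrete, below is a minimal, self-contained Python sketch (not the authors' implementation) of the general idea: sample a characteristic pose joint by joint in an autoregressive fashion from a conditional distribution, then project the sampled joints onto fixed bone lengths. The toy kinematic tree, the bone lengths, and the simple Gaussian used in place of the learned multi-modal distribution are illustrative assumptions, and the angle constraints are omitted for brevity.

```python
# Hedged sketch of autoregressive pose sampling + bone-length projection.
# All numbers and the skeleton layout below are illustrative assumptions.
import numpy as np

# Toy kinematic chain: joint index -> parent index (-1 = root) and nominal bone lengths (meters).
PARENTS = [-1, 0, 1, 2, 1, 4]            # e.g. pelvis, spine, neck, head, shoulder, elbow
BONE_LEN = [0.0, 0.25, 0.25, 0.15, 0.20, 0.30]

def sample_joint(joint, sampled, rng):
    """Stand-in for a learned conditional p(x_j | x_<j): a Gaussian centered on
    the parent joint plus a nominal offset (purely illustrative)."""
    if PARENTS[joint] < 0:
        return rng.normal(0.0, 0.05, size=3)                 # root near the origin
    mean = sampled[PARENTS[joint]] + np.array([0.0, BONE_LEN[joint], 0.0])
    return rng.normal(mean, 0.05)

def enforce_bone_lengths(pose):
    """Project each joint onto a sphere of the target bone length around its parent."""
    fixed = pose.copy()
    for j, p in enumerate(PARENTS):
        if p < 0:
            continue
        direction = fixed[j] - fixed[p]
        fixed[j] = fixed[p] + direction / (np.linalg.norm(direction) + 1e-8) * BONE_LEN[j]
    return fixed

def sample_characteristic_pose(rng):
    """Autoregressive sampling from root to leaves, followed by constraint projection."""
    sampled = np.zeros((len(PARENTS), 3))
    for j in range(len(PARENTS)):                            # parents are sampled before children
        sampled[j] = sample_joint(j, sampled, rng)
    return enforce_bone_lengths(sampled)

rng = np.random.default_rng(0)
hypotheses = [sample_characteristic_pose(rng) for _ in range(5)]  # several pose hypotheses
print(hypotheses[0].round(3))
```

Drawing several hypotheses, as in the last lines above, mirrors the multi-modal framing of the task: more than one characteristic pose can plausibly follow the same observation.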
Related papers
- TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting [27.3359362364858]
We present an efficient multi-view pose estimation model that learns a robust temporal representation.
Our model is able to generalize across datasets without fine-tuning.
arXiv Detail & Related papers (2023-09-14T17:56:30Z) - Self-supervised 3D Human Pose Estimation from a Single Image [1.0878040851638]
We propose a new self-supervised method for predicting 3D human body pose from a single image.
The prediction network is trained from a dataset of unlabelled images depicting people in typical poses and a set of unpaired 2D poses.
arXiv Detail & Related papers (2023-04-05T10:26:21Z) - A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (whether in the prediction or in the observation) are treated as noise, and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We evaluate our approach on four standard datasets and obtain significant improvements over the state of the art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z) - Live Stream Temporally Embedded 3D Human Body Pose and Shape Estimation [13.40702053084305]
We present a temporally embedded 3D human body pose and shape estimation (TePose) method to improve the accuracy and temporal consistency of pose estimation in live stream videos.
A multi-scale convolutional network is presented as the motion discriminator for adversarial training using datasets without any 3D labeling.
arXiv Detail & Related papers (2022-07-25T21:21:59Z) - Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study of various pose representations, with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms state-of-the-art methods in short-term prediction and achieves substantially better long-term prediction.
arXiv Detail & Related papers (2021-12-30T10:45:22Z) - Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z) - Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatio-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z) - Self-Attentive 3D Human Pose and Shape Estimation from Videos [82.63503361008607]
We present a video-based learning algorithm for 3D human pose and shape estimation.
We exploit temporal information in videos and propose a self-attention module.
We evaluate our method on the 3DPW, MPI-INF-3DHP, and Human3.6M datasets.
arXiv Detail & Related papers (2021-03-26T00:02:19Z) - Long Term Motion Prediction Using Keyposes [122.22758311506588]
We argue that, to achieve long-term forecasting, predicting human pose at every time instant is unnecessary.
Instead, we predict only a sparse set of poses, which we call "keyposes", and approximate complex motions by linearly interpolating between subsequent keyposes.
We show that learning the sequence of such keyposes allows us to predict very long-term motion, up to 5 seconds into the future.
arXiv Detail & Related papers (2020-12-08T20:45:51Z)
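As a toy illustration of the keypose idea in the last entry above, the sketch below reconstructs a dense motion by linearly interpolating between a sparse sequence of keyposes; the joint count, timestamps, and randomly generated poses are placeholders, not data or code from that paper.

```python
# Hedged sketch: approximate a long motion by linear interpolation between
# sparse "keyposes". All inputs below are made-up placeholders.
import numpy as np

def interpolate_keyposes(keypose_times, keyposes, fps=30):
    """keypose_times: (K,) seconds, increasing; keyposes: (K, J, 3) joint positions.
    Returns a dense motion sampled at `fps`, interpolating each coordinate linearly."""
    t_dense = np.arange(keypose_times[0], keypose_times[-1], 1.0 / fps)
    K, J, _ = keyposes.shape
    flat = keyposes.reshape(K, J * 3)
    dense = np.stack(
        [np.interp(t_dense, keypose_times, flat[:, d]) for d in range(J * 3)], axis=1
    )
    return dense.reshape(len(t_dense), J, 3)

# Example: two made-up keyposes for a 17-joint skeleton, 5 seconds apart.
times = np.array([0.0, 5.0])
keyposes = np.random.default_rng(1).normal(size=(2, 17, 3))
motion = interpolate_keyposes(times, keyposes)
print(motion.shape)   # (150, 17, 3): a dense 5-second motion at 30 fps
```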
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.