MotionMap: Representing Multimodality in Human Pose Forecasting
- URL: http://arxiv.org/abs/2412.18883v1
- Date: Wed, 25 Dec 2024 11:47:26 GMT
- Title: MotionMap: Representing Multimodality in Human Pose Forecasting
- Authors: Reyhaneh Hosseininejad, Megh Shukla, Saeed Saadatnejad, Mathieu Salzmann, Alexandre Alahi
- Abstract summary: We propose an alternative paradigm to make the task well-posed.
While state-of-the-art methods predict multimodality, this requires oversampling a large volume of predictions.
We address these questions with MotionMap, a simple yet effective heatmap based representation for multimodality.
- Score: 98.26350593416674
- License:
- Abstract: Human pose forecasting is inherently multimodal since multiple futures exist for an observed pose sequence. However, evaluating multimodality is challenging since the task is ill-posed. Therefore, we first propose an alternative paradigm to make the task well-posed. Next, while state-of-the-art methods predict multimodality, this requires oversampling a large volume of predictions. This raises key questions: (1) Can we capture multimodality by efficiently sampling a smaller number of predictions? (2) Subsequently, which of the predicted futures is more likely for an observed pose sequence? We address these questions with MotionMap, a simple yet effective heatmap based representation for multimodality. We extend heatmaps to represent a spatial distribution over the space of all possible motions, where different local maxima correspond to different forecasts for a given observation. MotionMap can capture a variable number of modes per observation and provide confidence measures for different modes. Further, MotionMap allows us to introduce the notion of uncertainty and controllability over the forecasted pose sequence. Finally, MotionMap captures rare modes that are non-trivial to evaluate yet critical for safety. We support our claims through multiple qualitative and quantitative experiments using popular 3D human pose datasets: Human3.6M and AMASS, highlighting the strengths and limitations of our proposed method. Project Page: https://www.epfl.ch/labs/vita/research/prediction/motionmap/
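As a rough illustration of the heatmap idea in the abstract (not the authors' code), the sketch below treats a 2D MotionMap as a grid over a discretised motion space and reads each local maximum as one forecast mode, with its value as a confidence score. The grid shape, threshold, and function name are assumptions for the example.

```python
import numpy as np

def extract_modes(heatmap: np.ndarray, threshold: float = 0.1, k: int = 1):
    """Return (row, col, score) for every local maximum above `threshold`.

    `heatmap` is a 2D array whose cells index a discretised motion space;
    each local maximum is read as one candidate future and its value as a
    confidence. Purely illustrative of the idea described in the abstract.
    """
    H, W = heatmap.shape
    modes = []
    for i in range(H):
        for j in range(W):
            v = heatmap[i, j]
            if v < threshold:
                continue
            # Compare against the surrounding (2k+1)x(2k+1) neighbourhood.
            nb = heatmap[max(0, i - k):i + k + 1, max(0, j - k):j + k + 1]
            if v >= nb.max():
                modes.append((i, j, float(v)))
    # Most confident modes first; a variable number per observation.
    return sorted(modes, key=lambda m: -m[2])

# Toy usage: a heatmap with two peaks yields two modes with confidences.
hm = np.zeros((16, 16))
hm[4, 4], hm[10, 12] = 0.9, 0.6
print(extract_modes(hm))  # [(4, 4, 0.9), (10, 12, 0.6)]
```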
Related papers
- Learning Snippet-to-Motion Progression for Skeleton-based Human Motion Prediction [14.988322340164391]
Existing Graph Convolutional Networks for human motion prediction largely adopt a one-step scheme.
We observe that human motions have transitional patterns and can be split into snippets representative of each transition.
We propose a snippet-to-motion multi-stage framework that breaks motion prediction into sub-tasks easier to accomplish.
arXiv Detail & Related papers (2023-07-26T07:36:38Z)
- Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors [21.915057426589744]
We propose a simple yet effective approach that disentangles randomly sampled codes with a deterministic learnable component named anchors to promote sample precision and diversity.
In principle, our spatial-temporal anchor-based sampling (STARS) can be applied to different motion predictors.
arXiv Detail & Related papers (2023-02-09T18:58:07Z)
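The anchor idea above pairs each randomly sampled code with a deterministic, learnable component so that different anchors steer samples toward different modes. Below is a minimal PyTorch-style sketch of that sampling step under assumed dimensions, module names, and an additive combination; it is not the STARS implementation.

```python
import torch
import torch.nn as nn

class AnchoredSampler(nn.Module):
    """Toy sketch: combine K learnable anchors with random latent codes.

    Each anchor deterministically shifts the sampled latent, so the K
    decoded futures tend to land in K distinct modes (diversity) while
    the small noise term keeps per-mode precision.
    """
    def __init__(self, num_anchors: int = 8, latent_dim: int = 64, out_dim: int = 48):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(num_anchors, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, out_dim))

    def forward(self, obs_feat: torch.Tensor, noise_scale: float = 0.1):
        # obs_feat: (B, latent_dim) encoding of the observed pose sequence.
        B = obs_feat.shape[0]
        K, D = self.anchors.shape
        noise = noise_scale * torch.randn(B, K, D)
        z = obs_feat[:, None, :] + self.anchors[None, :, :] + noise  # (B, K, D)
        return self.decoder(z)  # (B, K, out_dim): K diverse futures per input

# Usage: one forward pass yields K candidate futures per observation.
futures = AnchoredSampler()(torch.randn(2, 64))
print(futures.shape)  # torch.Size([2, 8, 48])
```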
- PREF: Predictability Regularized Neural Motion Fields [68.60019434498703]
Knowing 3D motions in a dynamic scene is essential to many vision applications.
We leverage a neural motion field for estimating the motion of all points in a multiview setting.
We propose to regularize the estimated motion to be predictable.
arXiv Detail & Related papers (2022-09-21T22:32:37Z)
- Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet [24.852728097115744]
Multi-person pose understanding from RGB involves three complex tasks: pose estimation, tracking and motion forecasting.
Most existing works either focus on a single task or employ multi-stage approaches to solving multiple tasks separately.
We propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage.
arXiv Detail & Related papers (2022-07-09T18:42:14Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
- Long Term Motion Prediction Using Keyposes [122.22758311506588]
We argue that, to achieve long term forecasting, predicting human pose at every time instant is unnecessary.
We call such poses "keyposes", and approximate complex motions by linearly interpolating between subsequent keyposes.
We show that learning the sequence of such keyposes allows us to predict very long term motion, up to 5 seconds in the future.
arXiv Detail & Related papers (2020-12-08T20:45:51Z)
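Since the keypose idea reduces long-term forecasting to predicting a sparse set of poses and linearly interpolating between them, the sketch below reconstructs a dense motion from keyposes and their timestamps. The array layout, frame count, and function name are illustrative assumptions, not the paper's code.

```python
import numpy as np

def interpolate_keyposes(keyposes: np.ndarray, key_times: np.ndarray,
                         num_frames: int) -> np.ndarray:
    """Linearly interpolate K keyposes into a dense (num_frames, J, 3) motion.

    keyposes: (K, J, 3) joint positions at the K key instants.
    key_times: (K,) monotonically increasing times.
    Illustrates approximating complex motion by interpolating between
    subsequent keyposes; not the paper's implementation.
    """
    t = np.linspace(key_times[0], key_times[-1], num_frames)
    K, J, _ = keyposes.shape
    flat = keyposes.reshape(K, -1)                                   # (K, J*3)
    dense = np.stack([np.interp(t, key_times, flat[:, d])
                      for d in range(flat.shape[1])], axis=1)        # (num_frames, J*3)
    return dense.reshape(num_frames, J, 3)

# Usage: 3 keyposes of a 17-joint skeleton expanded to a 125-frame motion.
motion = interpolate_keyposes(np.random.randn(3, 17, 3), np.array([0.0, 0.4, 1.0]), 125)
print(motion.shape)  # (125, 17, 3)
```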
- Forecasting Characteristic 3D Poses of Human Actions [24.186058965796157]
We propose the task of forecasting characteristic 3D poses: from a monocular video observation of a person, predict a future 3D pose of that person in a likely action-defining, characteristic pose.
We define a semantically meaningful pose prediction task that decouples the predicted pose from time, taking inspiration from goal-directed behavior.
Our experiments with this dataset suggest that our proposed probabilistic approach outperforms state-of-the-art methods by 22% on average.
arXiv Detail & Related papers (2020-11-30T18:20:17Z)
- Motion Prediction Using Temporal Inception Module [96.76721173517895]
We propose a Temporal Inception Module (TIM) to encode human motion.
Our framework produces input embeddings with convolutional layers, using different kernel sizes for different input lengths.
The experimental results on the standard motion prediction benchmarks, Human3.6M and the CMU motion capture dataset, show that our approach consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-06T20:26:01Z)
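The TIM entry above embeds the observed motion with convolutions of several kernel sizes so that different temporal receptive fields are mixed. A small PyTorch sketch of that idea follows; the channel counts, kernel sizes, and class name are assumptions rather than the published architecture.

```python
import torch
import torch.nn as nn

class MultiKernelTemporalEmbed(nn.Module):
    """Toy inception-style temporal encoder: parallel Conv1d branches with
    different kernel sizes, concatenated along the channel axis."""
    def __init__(self, in_channels: int = 66, out_channels: int = 32,
                 kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_channels, out_channels, k, padding=k // 2)
            for k in kernel_sizes
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, joints*dims, T) observed motion; each branch sees a
        # different temporal receptive field before the features are fused.
        return torch.cat([b(x) for b in self.branches], dim=1)  # (B, 3*out, T)

# Usage on a batch of 10-frame observations of a 22-joint (x, y, z) skeleton.
emb = MultiKernelTemporalEmbed()(torch.randn(4, 66, 10))
print(emb.shape)  # torch.Size([4, 96, 10])
```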
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.