Comparison of Spatio-Temporal Models for Human Motion and Pose
Forecasting in Face-to-Face Interaction Scenarios
- URL: http://arxiv.org/abs/2203.03245v1
- Date: Mon, 7 Mar 2022 09:59:30 GMT
- Title: Comparison of Spatio-Temporal Models for Human Motion and Pose
Forecasting in Face-to-Face Interaction Scenarios
- Authors: German Barquero and Johnny Núñez and Zhen Xu and Sergio Escalera
and Wei-Wei Tu and Isabelle Guyon and Cristina Palmero
- Abstract summary: We present the first systematic comparison of state-of-the-art approaches for behavior forecasting.
Our best attention-based approaches achieve state-of-the-art performance in UDIVA v0.5.
We show that by autoregressively predicting the future with methods trained for the short-term future, we outperform the baselines even for a considerably longer-term future.
- Score: 47.99589136455976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human behavior forecasting during human-human interactions is of utmost
importance to provide robotic or virtual agents with social intelligence. This
problem is especially challenging for scenarios that are highly driven by
interpersonal dynamics. In this work, we present the first systematic
comparison of state-of-the-art approaches for behavior forecasting. To do so,
we leverage whole-body annotations (face, body, and hands) from the very
recently released UDIVA v0.5, which features face-to-face dyadic interactions.
Our best attention-based approaches achieve state-of-the-art performance in
UDIVA v0.5. We show that by autoregressively predicting the future with methods
trained for the short-term future (<400ms), we outperform the baselines even
for a considerably longer-term future (up to 2s). We also show that this
finding holds when highly noisy annotations are used, which opens new horizons
towards the use of weakly-supervised learning. Combined with large-scale
datasets, this may help boost advances in this field.
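The autoregressive strategy described in the abstract can be sketched as follows: a model trained to predict only the next short-term step (<400ms) is rolled out repeatedly, feeding each prediction back into its own input window to reach a longer horizon (up to 2s). This is a minimal illustration with a hypothetical `model` interface (a callable mapping a context window of poses to the next pose), not the paper's actual implementation.

```python
import numpy as np

def autoregressive_rollout(model, context, n_steps):
    """Extend a short-horizon forecaster to a longer horizon by feeding
    its own predictions back as input. `model` is assumed to map a
    (window, joints) array to the next (joints,) pose."""
    context = list(context)
    preds = []
    for _ in range(n_steps):
        next_pose = model(np.stack(context))   # one short-term prediction step
        preds.append(next_pose)
        context = context[1:] + [next_pose]    # slide window: drop oldest, append prediction
    return np.stack(preds)

# Toy stand-in for a trained network: predicts the mean of the context window.
toy_model = lambda ctx: ctx.mean(axis=0)
history = [np.zeros(3), np.ones(3)]            # two observed "poses" with 3 values each
future = autoregressive_rollout(toy_model, history, n_steps=5)
```

The key design point is that the model only ever sees inputs of the same short window length it was trained on; errors can accumulate over the rollout, which is why the paper's finding that short-term models still beat long-term baselines is notable.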
Related papers
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
Experimental results demonstrate that MPI achieves improvements of 10% to 64% over the previous state of the art on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- Qualitative Prediction of Multi-Agent Spatial Interactions [5.742409080817885]
We present and benchmark three new approaches to model and predict multi-agent interactions in dense scenes.
The proposed solutions take into account static and dynamic context to predict individual interactions.
They exploit an input- and a temporal-attention mechanism, and are tested on medium and long-term time horizons.
arXiv Detail & Related papers (2023-06-30T18:08:25Z)
- Multi-Timescale Modeling of Human Behavior [0.18199355648379031]
We propose an LSTM network architecture that processes behavioral information at multiple timescales to predict future behavior.
We evaluate our architecture on data collected in an urban search and rescue scenario simulated in a virtual Minecraft-based testbed.
arXiv Detail & Related papers (2022-11-16T15:58:57Z)
- Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time [2.955419572714387]
We present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time.
Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously.
arXiv Detail & Related papers (2022-11-12T18:41:33Z)
- Data-Efficient Reinforcement Learning with Self-Predictive Representations [21.223069189953037]
We train an agent to predict its own latent state representations multiple steps into the future.
On its own, this future prediction objective outperforms prior methods for sample-efficient deep RL from pixels.
Our full self-supervised objective, which combines future prediction and data augmentation, achieves a median human-normalized score of 0.415 on Atari.
arXiv Detail & Related papers (2020-07-12T07:38:15Z)
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop TrajNet++, a large-scale, interaction-centric benchmark that fills a significant gap in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)