TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D
Environments
- URL: http://arxiv.org/abs/2306.02850v2
- Date: Mon, 20 Nov 2023 13:04:59 GMT
- Title: TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D
Environments
- Authors: Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black
- Abstract summary: Current methods can't reliably estimate moving humans in global coordinates.
TRACE is the first one-stage method to jointly recover and track 3D humans in global coordinates from dynamic cameras.
It achieves state-of-the-art performance on tracking and HPS benchmarks.
- Score: 106.80978555346958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the estimation of 3D human pose and shape (HPS) is rapidly
progressing, current methods still cannot reliably estimate moving humans in
global coordinates, which is critical for many applications. This is
particularly challenging when the camera is also moving, entangling human and
camera motion. To address these issues, we adopt a novel 5D representation
(space, time, and identity) that enables end-to-end reasoning about people in
scenes. Our method, called TRACE, introduces several novel architectural
components. Most importantly, it uses two new "maps" to reason about the 3D
trajectory of people over time in camera, and world, coordinates. An additional
memory unit enables persistent tracking of people even during long occlusions.
TRACE is the first one-stage method to jointly recover and track 3D humans in
global coordinates from dynamic cameras. By training it end-to-end, and using
full image information, TRACE achieves state-of-the-art performance on tracking
and HPS benchmarks. The code and dataset are released for research purposes.
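The entanglement of human and camera motion mentioned in the abstract can be illustrated with a small sketch. This is not TRACE's method, which the abstract describes as learned end-to-end; the sketch only assumes that a camera-to-world pose (a rotation `R_wc` and a translation `t_wc`, both hypothetical names) is already known for each frame. The helper functions and the toy numbers are likewise made up for illustration.

```python
import math

def yaw_rotation(theta):
    """3x3 rotation about the vertical (y) axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def camera_to_world(p_cam, R_wc, t_wc):
    """Map a camera-frame point to world coordinates: R_wc * p_cam + t_wc."""
    return [sum(R_wc[i][j] * p_cam[j] for j in range(3)) + t_wc[i]
            for i in range(3)]

# Toy data (assumed): the camera slides 0.5 units per frame along x
# without rotating, while the person stands still in the world.
camera_positions = [[0.5 * k, 0.0, 0.0] for k in range(4)]

# Seen from the moving camera, the person appears to drift the other way.
person_in_camera = [[-0.5 * k, 0.0, 5.0] for k in range(4)]

# Undoing the camera motion shows that the person never moved.
no_rotation = yaw_rotation(0.0)
world_track = [camera_to_world(p, no_rotation, t)
               for p, t in zip(person_in_camera, camera_positions)]

for frame, p in enumerate(world_track):
    print(frame, p)
```

In camera coordinates the person's x position changes every frame, yet the world-coordinate track is constant. With a single moving camera and no known poses, the two explanations (person moving, camera moving) cannot be separated from the camera-frame track alone, which is the difficulty the abstract refers to.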
Related papers
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
(2023-12-26T18:56:49Z)
- WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion [43.95997922499137]
WHAM (World-grounded Humans with Accurate Motion) reconstructs 3D human motion in a global coordinate system from video.
It uses camera angular velocity estimated from a SLAM method, together with human motion, to estimate the body's global trajectory.
WHAM outperforms all existing 3D human motion recovery methods across multiple in-the-wild benchmarks.
(2023-12-12T18:57:46Z)
- Decoupling Human and Camera Motion from Videos in the Wild [67.39432972193929]
We propose a method to reconstruct global human trajectories from videos in the wild.
Our method decouples the camera and human motion, which allows us to place people in the same world coordinate frame.
(2023-02-24T18:59:15Z)
- HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR [51.9200422793806]
Using only body-mounted IMUs and LiDAR, HSC4D is space-free without any external devices' constraints and map-free without pre-built maps.
Relationships between humans and environments are also explored to make their interaction more realistic.
(2022-03-17T10:05:55Z)
- Human-Aware Object Placement for Visual Environment Reconstruction [63.14733166375534]
We show that human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.
Our key idea is that, as a person moves through a scene and interacts with it, we accumulate HSIs across multiple input images.
We show that our scene reconstruction can be used to refine the initial 3D human pose and shape estimation.
(2022-03-07T18:59:02Z)
- Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
(2021-03-31T17:58:31Z)
- Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D.
A simple yet effective localization approach is also applied to transform the normalized pose to the global trajectory.
Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
(2020-10-31T04:35:24Z)
- AnimePose: Multi-person 3D pose estimation and animation [9.323689681059504]
3D animation of humans in action is quite challenging as it involves using a huge setup with several motion trackers all over the person's body to track the movements of every limb.
This is time-consuming and may cause the person discomfort in wearing exoskeleton body suits with motion sensors.
We present a solution to generate 3D animation of multiple persons from a 2D video using deep learning.
(2020-02-06T11:11:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.