Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery
- URL: http://arxiv.org/abs/2407.00574v2
- Date: Thu, 12 Dec 2024 12:37:32 GMT
- Title: Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery
- Authors: Fengyuan Yang, Kerui Gu, Ha Linh Nguyen, Tze Ho Elden Tse, Angela Yao
- Abstract summary: This paper presents an optimization-free scale calibration framework, Human as Checkerboard (HAC).
HAC innovatively leverages the human body predicted by a human mesh recovery model as a calibration reference.
Our method sets a new state-of-the-art performance for global human mesh estimation tasks.
- Score: 32.379298416414436
- Abstract: Accurate camera motion estimation is essential for recovering global human motion in world coordinates from RGB video inputs. SLAM is widely used for estimating the camera trajectory and a scene point cloud, but monocular SLAM does so only up to an unknown scale factor. Previous works estimate the scale factor through optimization, but this is unreliable and time-consuming. This paper presents an optimization-free scale calibration framework, Human as Checkerboard (HAC). HAC innovatively leverages the human body predicted by a human mesh recovery model as a calibration reference. Specifically, it uses the absolute depth of human-scene contact joints as references to calibrate the corresponding relative scene depth from SLAM. HAC benefits from geometric priors encoded in human mesh recovery models to estimate the SLAM scale and achieves precise global human motion estimation. Simple yet powerful, our method sets a new state of the art for global human mesh estimation tasks, reducing motion errors by 50% over prior local-to-global methods while using 100$\times$ less inference time than optimization-based methods. Project page: https://martayang.github.io/HAC.
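The calibration idea in the abstract (absolute depths of human-scene contact joints used as references for the corresponding relative SLAM depths) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a median over per-joint depth ratios, and the sample numbers are all assumptions made for illustration only.

```python
from statistics import median


def estimate_slam_scale(human_depths_m, slam_depths, min_depth=1e-6):
    """Estimate one scale s so that s * slam_depth approximates human_depth.

    human_depths_m: metric depths of contact joints from a human mesh
                    recovery model (hypothetical input).
    slam_depths:    up-to-scale SLAM depths at the same image locations
                    (hypothetical input).
    The median of per-joint ratios is an assumed robust estimator,
    not necessarily what the paper uses.
    """
    ratios = [
        h / d
        for h, d in zip(human_depths_m, slam_depths)
        if h > min_depth and d > min_depth  # skip invalid or missing depths
    ]
    if not ratios:
        raise ValueError("no valid contact-joint depth pairs")
    return median(ratios)


def rescale_translations(translations, scale):
    """Apply the estimated scale to up-to-scale camera translations."""
    return [tuple(scale * c for c in t) for t in translations]


# Made-up example: three consistent contact joints and one outlier.
human_depths_m = [3.0, 3.2, 2.9, 3.1]
slam_depths = [1.5, 1.6, 1.45, 3.1]
scale = estimate_slam_scale(human_depths_m, slam_depths)
metric_translations = rescale_translations([(0.5, 0.0, 1.0)], scale)
```

In this toy input the three consistent joints each give a ratio of about 2 and the outlier gives 1, so the median stays near 2 and the camera translation is doubled. A closed-form estimate like this needs no iterative optimization, which is the property the abstract emphasizes.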
Related papers
- Reconstructing People, Places, and Cameras [57.81696692335401]
"Humans and Structure from Motion" (HSfM) is a method for jointly reconstructing multiple human meshes, scene point clouds, and camera parameters in a metric world coordinate system.
Our results show that incorporating human data into the SfM pipeline improves camera pose estimation.
arXiv Detail & Related papers (2024-12-23T18:58:34Z)
- Estimating Body and Hand Motion in an Ego-sensed World [62.61989004520802]
We present EgoAllo, a system for human motion estimation from a head-mounted device.
Using only egocentric SLAM poses and images, EgoAllo guides sampling from a conditional diffusion model to estimate 3D body pose, height, and hand parameters.
arXiv Detail & Related papers (2024-10-04T17:59:57Z)
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation [98.05046790227561]
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- Aligning Human Motion Generation with Human Perceptions [51.831338643012444]
We propose a data-driven approach to bridge the gap by introducing a large-scale human perceptual evaluation dataset, MotionPercept, and a human motion critic model, MotionCritic.
Our critic model offers a more accurate metric for assessing motion quality and could be readily integrated into the motion generation pipeline.
arXiv Detail & Related papers (2024-07-02T14:01:59Z)
- WHAC: World-grounded Humans and Cameras [37.877565981937586]
We aim to recover expressive parametric human models (i.e., SMPL-X) and corresponding camera poses jointly.
We introduce a novel framework, referred to as WHAC, to facilitate world-grounded expressive human pose and shape estimation.
We present a new synthetic dataset, WHAC-A-Mole, which includes accurately annotated humans and cameras.
arXiv Detail & Related papers (2024-03-19T17:58:02Z)
- WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion [43.95997922499137]
WHAM (World-grounded Humans with Accurate Motion) reconstructs 3D human motion in a global coordinate system from video.
WHAM uses camera angular velocity estimated from a SLAM method together with human motion to estimate the body's global trajectory.
WHAM outperforms all existing 3D human motion recovery methods across multiple in-the-wild benchmarks.
arXiv Detail & Related papers (2023-12-12T18:57:46Z)
- W-HMR: Monocular Human Mesh Recovery in World Space with Weak-Supervised Calibration [57.37135310143126]
Previous methods for 3D motion recovery from monocular images often fall short due to reliance on camera coordinates.
We introduce W-HMR, a weak-supervised calibration method that predicts "reasonable" focal lengths based on body distortion information.
We also present the OrientCorrect module, which corrects body orientation for plausible reconstructions in world space.
arXiv Detail & Related papers (2023-11-29T09:02:07Z)
- A Simple Method to Boost Human Pose Estimation Accuracy by Correcting the Joint Regressor for the Human3.6m Dataset [21.096409769550387]
We show that the most widely used SMPL-to-joint linear layer (joint regressor) is inaccurate.
To achieve a more accurate joint regressor, we propose a method to create pseudo-ground-truth SMPL poses.
We show that our regressor leads to improved pose estimation results on the test set without any need for retraining.
arXiv Detail & Related papers (2022-04-29T20:42:48Z)
- GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras [99.07219478953982]
We present an approach for 3D global human mesh recovery from monocular videos recorded with dynamic cameras.
We first propose a deep generative motion infiller, which autoregressively infills the body motions of occluded humans based on visible motions.
In contrast to prior work, our approach reconstructs human meshes in consistent global coordinates even with dynamic cameras.
arXiv Detail & Related papers (2021-12-02T18:59:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.