Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- URL: http://arxiv.org/abs/2401.00847v2
- Date: Mon, 6 May 2024 08:14:01 GMT
- Title: Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- Authors: Jiye Lee, Hanbyul Joo
- Abstract summary: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera.
Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments.
- Score: 10.055317239956423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera. In contrast to existing approaches that use six or more expert-level IMU devices, our approach is much more cost-effective and convenient. Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments. As a key idea to overcome the extreme sparsity and ambiguities of sensor inputs with different modalities, we integrate 6D head poses obtained from the head-mounted cameras for motion estimation. To enable capture in expansive indoor and outdoor scenes, we propose an algorithm to track and update floor level changes to define head poses, coupled with a multi-stage Transformer-based regression module. We also introduce novel strategies leveraging visual cues of egocentric images to further enhance the motion capture quality while reducing ambiguities. We demonstrate the performance of our method on various challenging scenarios, including complex outdoor environments and everyday motions such as object interactions and social interactions among multiple individuals.
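The abstract sketches a concrete pipeline: a floor-level tracker normalizes the 6D head pose, and a multi-stage Transformer regresses full-body motion from the head pose plus the two smartwatch IMU streams. Below is a minimal, hypothetical PyTorch sketch of that idea; the feature layout (9-D head pose, 12-D per wrist), the two-stage split (root translation first, then per-joint 6D rotations), the floor-update heuristic, and all names are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of floor-level tracking + multi-stage Transformer
# regression from sparse sensors, loosely following the abstract.
import torch
import torch.nn as nn

def update_floor_level(floor_z, head_z, nominal_height=1.6, thresh=0.3):
    """Toy floor tracker (assumed heuristic): if the head's height above the
    current floor estimate drifts far from a nominal standing height (e.g.,
    on stairs), shift the floor estimate by that drift."""
    drift = (head_z - floor_z) - nominal_height
    return floor_z + drift if abs(drift) > thresh else floor_z

class StageRegressor(nn.Module):
    """One Transformer stage: encode a window of per-frame sensor tokens,
    regress a per-frame target vector."""
    def __init__(self, in_dim, out_dim, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=512, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, out_dim)

    def forward(self, x):                      # x: (batch, frames, in_dim)
        return self.head(self.encoder(self.embed(x)))

class SparseMocapNet(nn.Module):
    """Two stages: root (pelvis) translation first, then 6D rotations for
    all body joints conditioned on that root estimate."""
    def __init__(self, n_joints=22):
        super().__init__()
        in_dim = 9 + 2 * 12  # head: 6D rot + 3D pos; per wrist: 6D rot + acc + gyro (assumed)
        self.stage1 = StageRegressor(in_dim, 3)
        self.stage2 = StageRegressor(in_dim + 3, n_joints * 6)

    def forward(self, head_pose, wrist_imu, floor_z):
        head_pose = head_pose.clone()
        head_pose[..., -1] = head_pose[..., -1] - floor_z  # floor-relative head height
        x = torch.cat([head_pose, wrist_imu], dim=-1)
        root = self.stage1(x)
        joints6d = self.stage2(torch.cat([x, root], dim=-1))
        return root, joints6d

# Usage on a 60-frame window of dummy sensor data.
net = SparseMocapNet()
root, pose = net(torch.randn(1, 60, 9), torch.randn(1, 60, 24), torch.zeros(1, 60))
print(root.shape, pose.shape)  # torch.Size([1, 60, 3]) torch.Size([1, 60, 132])
```

The staged split mirrors a common pattern in sparse-input mocap: estimating a global root trajectory first gives the joint-rotation stage a stable frame of reference, which helps disambiguate the extremely sparse three-sensor input.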
Related papers
- UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control [17.039951897703645]
We introduce UniAvatar, a method that provides extensive control over a wide range of motion and illumination conditions.
Specifically, we use the FLAME model to render all motion information onto a single image, maintaining the integrity of 3D motion details.
We design independent modules to manage both 3D motion and illumination, permitting separate and combined control.
arXiv Detail & Related papers (2024-12-26T07:39:08Z)
- A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions [56.709280823844374]
We introduce a mask-based motion correction module (MCM) that leverages motion context and video masks to repair flawed motions.
We also propose a physics-based motion transfer module (PTM), which employs a pretrain-and-adapt approach for motion imitation.
Our approach is designed as a plug-and-play module to physically refine the video motion capture results, including high-difficulty in-the-wild motions.
arXiv Detail & Related papers (2024-12-23T08:26:00Z)
- Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera [49.82535393220003]
Dyn-HaMR is the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild.
We show that our approach significantly outperforms state-of-the-art methods in terms of 4D global mesh recovery.
This establishes a new benchmark for hand motion reconstruction from monocular video with moving cameras.
arXiv Detail & Related papers (2024-12-17T12:43:10Z)
- Redundancy-Aware Camera Selection for Indoor Scene Neural Rendering [54.468355408388675]
We build a similarity matrix that incorporates both the spatial diversity of the cameras and the semantic variation of the images.
We apply a diversity-based sampling algorithm to optimize the camera selection (a toy sketch of this kind of selection appears after this list).
We also develop a new dataset, IndoorTraj, which includes long and complex camera movements captured by humans in virtual indoor environments.
arXiv Detail & Related papers (2024-09-11T08:36:49Z)
- Motion Capture from Inertial and Vision Sensors [60.5190090684795]
MINIONS is a large-scale Motion capture dataset collected from INertial and visION Sensors.
We conduct experiments on multi-modal motion capture using a monocular camera and very few IMUs.
arXiv Detail & Related papers (2024-07-23T09:41:10Z)
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- PACE: Human and Camera Motion Estimation from in-the-wild Videos [113.76041632912577]
We present a method to estimate human motion in a global scene from moving cameras.
This is a highly challenging task due to the coupling of human and camera motions in the video.
We propose a joint optimization framework that disentangles human and camera motions using both foreground human motion priors and background scene features.
arXiv Detail & Related papers (2023-10-20T19:04:14Z)
- HybridCap: Inertia-aid Monocular Capture of Challenging Human Motions [41.56735523771541]
We present a lightweight, hybrid mocap technique called HybridCap.
It augments the camera with only 4 Inertial Measurement Units (IMUs) in a learning-and-optimization framework.
It can robustly handle challenging movements ranging from fitness actions to Latin dance.
arXiv Detail & Related papers (2022-03-17T12:30:17Z)
- Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting [44.97191206895915]
We present a cascaded two-level multi-model fitting method for identifying independently moving objects with a monocular event camera.
Experiments demonstrate the effectiveness and versatility of our method in real-world scenes with different motion patterns and an unknown number of moving objects.
arXiv Detail & Related papers (2021-11-05T12:59:41Z)
- Lightweight Multi-person Total Motion Capture Using Sparse Multi-view Cameras [35.67288909201899]
We propose a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras.
Our method is capable of efficient localization and accurate association of hands and faces even under severe occlusions.
Overall, we propose the first lightweight total capture system, achieving fast, robust, and accurate multi-person total motion capture.
arXiv Detail & Related papers (2021-08-23T19:23:35Z)
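As referenced in the Redundancy-Aware Camera Selection entry above, that method builds a similarity matrix and applies diversity-based sampling. Below is a toy, hypothetical sketch of one such strategy (greedy farthest-point selection); the matrix construction, the seeding rule, and the greedy criterion are assumptions for illustration, not the paper's algorithm.

```python
# Toy diversity-based camera selection over a precomputed similarity matrix.
import numpy as np

def select_cameras(sim: np.ndarray, k: int) -> list[int]:
    """Greedily pick k cameras so each new pick is least similar to the set
    chosen so far. sim[i, j] in [0, 1] would combine spatial and semantic
    similarity between cameras i and j (construction assumed)."""
    chosen = [int(np.argmin(sim.sum(axis=1)))]  # seed: globally least redundant camera
    while len(chosen) < k:
        # A candidate's redundancy is its max similarity to the chosen set.
        redundancy = sim[:, chosen].max(axis=1)
        redundancy[chosen] = np.inf             # never re-pick a chosen camera
        chosen.append(int(np.argmin(redundancy)))
    return chosen

# Example: from 100 candidate views, keep the 10 most mutually diverse.
rng = np.random.default_rng(0)
S = rng.random((100, 100)); S = (S + S.T) / 2   # symmetric toy similarity
print(select_cameras(S, 10))
```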
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.