Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- URL: http://arxiv.org/abs/2401.00847v2
- Date: Mon, 6 May 2024 08:14:01 GMT
- Title: Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- Authors: Jiye Lee, Hanbyul Joo,
- Abstract summary: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera.
Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments.
- Score: 10.055317239956423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera. In contrast to the existing approaches that use six or more expert-level IMU devices, our approach is much more cost-effective and convenient. Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments. As a key idea to overcome the extreme sparsity and ambiguities of sensor inputs with different modalities, we integrate 6D head poses obtained from the head-mounted cameras for motion estimation. To enable capture in expansive indoor and outdoor scenes, we propose an algorithm to track and update floor level changes to define head poses, coupled with a multi-stage Transformer-based regression module. We also introduce novel strategies leveraging visual cues of egocentric images to further enhance the motion capture quality while reducing ambiguities. We demonstrate the performance of our method on various challenging scenarios, including complex outdoor environments and everyday motions including object interactions and social interactions among multiple individuals.
Related papers
- Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes [83.55301458112672]
Sitcom-Crafter is a system for human motion generation in 3D space.
Central to the function generation modules is our novel 3D scene-aware human-human interaction module.
Augmentation modules encompass plot comprehension for command generation, motion synchronization for seamless integration of different motion types.
arXiv Detail & Related papers (2024-10-14T17:56:19Z) - Redundancy-Aware Camera Selection for Indoor Scene Neural Rendering [54.468355408388675]
We build a similarity matrix that incorporates both the spatial diversity of the cameras and the semantic variation of the images.
We apply a diversity-based sampling algorithm to optimize the camera selection.
We also develop a new dataset, IndoorTraj, which includes long and complex camera movements captured by humans in virtual indoor environments.
arXiv Detail & Related papers (2024-09-11T08:36:49Z) - Motion Capture from Inertial and Vision Sensors [60.5190090684795]
MINIONS is a large-scale Motion capture dataset collected from INertial and visION Sensors.
We conduct experiments on multi-modal motion capture using a monocular camera and very few IMUs.
arXiv Detail & Related papers (2024-07-23T09:41:10Z) - EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z) - PACE: Human and Camera Motion Estimation from in-the-wild Videos [113.76041632912577]
We present a method to estimate human motion in a global scene from moving cameras.
This is a highly challenging task due to the coupling of human and camera motions in the video.
We propose a joint optimization framework that disentangles human and camera motions using both foreground human motion priors and background scene features.
arXiv Detail & Related papers (2023-10-20T19:04:14Z) - Proactive Multi-Camera Collaboration For 3D Human Pose Estimation [16.628446718419344]
This paper presents a multi-agent reinforcement learning scheme for proactive Multi-Camera Collaboration in 3D Human Pose Estimation.
Active camera approaches proactively control camera poses to find optimal viewpoints for 3D reconstruction.
We jointly train our model with multiple world dynamics learning tasks to better capture environment dynamics.
arXiv Detail & Related papers (2023-03-07T10:01:00Z) - HybridCap: Inertia-aid Monocular Capture of Challenging Human Motions [41.56735523771541]
We present a light-weight, hybrid mocap technique called HybridCap.
It augments the camera with only 4 Inertial Measurement Units (IMUs) in a learning-and-optimization framework.
It can robustly handle challenging movements ranging from fitness actions to Latin dance.
arXiv Detail & Related papers (2022-03-17T12:30:17Z) - Event-based Motion Segmentation by Cascaded Two-Level Multi-Model
Fitting [44.97191206895915]
We present a cascaded two-level multi-model fitting method for identifying independently moving objects with a monocular event camera.
Experiments demonstrate the effectiveness and versatility of our method in real-world scenes with different motion patterns and an unknown number of moving objects.
arXiv Detail & Related papers (2021-11-05T12:59:41Z) - Attentive and Contrastive Learning for Joint Depth and Motion Field
Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z) - Lightweight Multi-person Total Motion Capture Using Sparse Multi-view
Cameras [35.67288909201899]
We propose a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras.
Our method is capable of efficient localization and accurate association of the hands and faces even on severe occluded occasions.
Overall, we propose the first light-weight total capture system and achieves fast, robust and accurate multi-person total motion capture performance.
arXiv Detail & Related papers (2021-08-23T19:23:35Z) - SportsCap: Monocular 3D Human Motion Capture and Fine-grained
Understanding in Challenging Sports Videos [40.19723456533343]
We propose SportsCap -- the first approach for simultaneously capturing 3D human motions and understanding fine-grained actions from monocular challenging sports video input.
Our approach utilizes the semantic and temporally structured sub-motion prior in the embedding space for motion capture and understanding.
Based on such hybrid motion information, we introduce a multi-stream spatial-temporal Graph Convolutional Network(ST-GCN) to predict the fine-grained semantic action attributes.
arXiv Detail & Related papers (2021-04-23T07:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.