Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- URL: http://arxiv.org/abs/2401.00847v2
- Date: Mon, 6 May 2024 08:14:01 GMT
- Title: Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera
- Authors: Jiye Lee, Hanbyul Joo
- Abstract summary: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera.
Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments.
- Score: 10.055317239956423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a lightweight and affordable motion capture method based on two smartwatches and a head-mounted camera. In contrast to existing approaches that use six or more expert-level IMU devices, our approach is much more cost-effective and convenient. Our method can make wearable motion capture accessible to everyone everywhere, enabling 3D full-body motion capture in diverse environments. As a key idea to overcome the extreme sparsity and ambiguities of sensor inputs with different modalities, we integrate 6D head poses obtained from the head-mounted camera for motion estimation. To enable capture in expansive indoor and outdoor scenes, we propose an algorithm that tracks and updates floor level changes to define head poses, coupled with a multi-stage Transformer-based regression module. We also introduce novel strategies leveraging visual cues from egocentric images to further enhance motion capture quality while reducing ambiguities. We demonstrate the performance of our method in various challenging scenarios, including complex outdoor environments and everyday motions such as object interactions and social interactions among multiple individuals.
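The sensor fusion the abstract describes (two wrist-worn IMU streams plus a 6D head pose from the head-mounted camera, with a floor-level tracker defining the head height) can be illustrated with a minimal sketch. Everything below is an illustrative assumption: the feature layout, the 27-dimensional frame vector, and the floor-update heuristic are not the authors' implementation, only a plausible shape for the inputs to a multi-stage regression module.

```python
import numpy as np

def assemble_frame_features(head_rot6d, head_pos, left_imu, right_imu):
    """Concatenate one frame of sparse sensor input into a single feature vector.

    head_rot6d : (6,) 6D rotation representation of the head pose (illustrative)
    head_pos   : (3,) head position relative to the current floor level
    left_imu   : (9,) accelerometer + gyroscope + orientation from the left watch
    right_imu  : (9,) same for the right watch
    Returns a (27,) vector: 6 + 3 + 9 + 9.
    """
    return np.concatenate([head_rot6d, head_pos, left_imu, right_imu])

def update_floor_level(floor, head_height, lo=1.2, hi=2.0, rate=0.05):
    """Toy floor-level tracker (hypothetical heuristic): when the head height
    relative to the current floor drifts outside a plausible standing range
    (e.g. after climbing stairs), nudge the floor estimate toward the new level."""
    rel = head_height - floor
    if rel > hi:        # head implausibly high -> floor probably rose
        floor += rate * (rel - hi)
    elif rel < lo:      # head implausibly low -> floor probably dropped
        floor -= rate * (lo - rel)
    return floor

# A sliding window of frames would then feed the regression module:
window = np.stack([
    assemble_frame_features(np.zeros(6), np.array([0.0, 1.6, 0.0]),
                            np.zeros(9), np.zeros(9))
    for _ in range(30)
])
print(window.shape)  # (30, 27)
```

In this sketch the floor estimate only moves when the relative head height leaves a plausible range, so ordinary crouching or jumping within that range leaves the floor level untouched.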
Related papers
- Motion Capture from Inertial and Vision Sensors [60.5190090684795]
MINIONS is a large-scale Motion capture dataset collected from INertial and visION Sensors.
We conduct experiments on multi-modal motion capture using a monocular camera and very few IMUs.
arXiv Detail & Related papers (2024-07-23T09:41:10Z)
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
We introduce EgoGaussian, the first method capable of simultaneously reconstructing 3D scenes and tracking 3D object motion from RGB egocentric input alone.
Our approach employs a clip-level online learning pipeline that leverages the dynamic nature of human activities.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- MotionMaster: Training-free Camera Motion Transfer For Video Generation [48.706578330771386]
We propose a novel training-free video motion transfer model, which disentangles camera motions and object motions in source videos.
Our model can effectively decouple camera-object motion and apply the decoupled camera motion to a wide range of controllable video generation tasks.
arXiv Detail & Related papers (2024-04-24T10:28:54Z)
- PACE: Human and Camera Motion Estimation from in-the-wild Videos [113.76041632912577]
We present a method to estimate human motion in a global scene from moving cameras.
This is a highly challenging task due to the coupling of human and camera motions in the video.
We propose a joint optimization framework that disentangles human and camera motions using both foreground human motion priors and background scene features.
arXiv Detail & Related papers (2023-10-20T19:04:14Z)
- Proactive Multi-Camera Collaboration For 3D Human Pose Estimation [16.628446718419344]
This paper presents a multi-agent reinforcement learning scheme for proactive Multi-Camera Collaboration in 3D Human Pose Estimation.
Active camera approaches proactively control camera poses to find optimal viewpoints for 3D reconstruction.
We jointly train our model with multiple world dynamics learning tasks to better capture environment dynamics.
arXiv Detail & Related papers (2023-03-07T10:01:00Z)
- HybridCap: Inertia-aid Monocular Capture of Challenging Human Motions [41.56735523771541]
We present a light-weight, hybrid mocap technique called HybridCap.
It augments the camera with only 4 Inertial Measurement Units (IMUs) in a learning-and-optimization framework.
It can robustly handle challenging movements ranging from fitness actions to Latin dance.
arXiv Detail & Related papers (2022-03-17T12:30:17Z)
- Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting [44.97191206895915]
We present a cascaded two-level multi-model fitting method for identifying independently moving objects with a monocular event camera.
Experiments demonstrate the effectiveness and versatility of our method in real-world scenes with different motion patterns and an unknown number of moving objects.
arXiv Detail & Related papers (2021-11-05T12:59:41Z)
- Estimating 3D Motion and Forces of Human-Object Interactions from Internet Videos [49.52070710518688]
We introduce a method to reconstruct the 3D motion of a person interacting with an object from a single RGB video.
Our method estimates the 3D poses of the person together with the object pose, the contact positions and the contact forces on the human body.
arXiv Detail & Related papers (2021-11-02T13:40:18Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- Lightweight Multi-person Total Motion Capture Using Sparse Multi-view Cameras [35.67288909201899]
We propose a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras.
Our method achieves efficient localization and accurate association of the hands and faces even under severe occlusion.
Overall, we propose the first lightweight total capture system, achieving fast, robust and accurate multi-person total motion capture performance.
arXiv Detail & Related papers (2021-08-23T19:23:35Z)
- SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos [40.19723456533343]
We propose SportsCap -- the first approach for simultaneously capturing 3D human motions and understanding fine-grained actions from monocular challenging sports video input.
Our approach utilizes the semantic and temporally structured sub-motion prior in the embedding space for motion capture and understanding.
Based on such hybrid motion information, we introduce a multi-stream spatial-temporal Graph Convolutional Network (ST-GCN) to predict the fine-grained semantic action attributes.
arXiv Detail & Related papers (2021-04-23T07:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.