MOVIN: Real-time Motion Capture using a Single LiDAR
- URL: http://arxiv.org/abs/2309.09314v1
- Date: Sun, 17 Sep 2023 16:04:15 GMT
- Title: MOVIN: Real-time Motion Capture using a Single LiDAR
- Authors: Deok-Kyeong Jang, Dongseok Yang, Deok-Yun Jang, Byeoli Choi, Taeil
Jin, and Sung-Hee Lee
- Abstract summary: We present MOVIN, a data-driven generative method for real-time motion capture with global tracking.
Our framework accurately predicts the performer's 3D global information and local joint details.
We implement a real-time application to showcase our method in real-world scenarios.
- Score: 7.3228874258537875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in technology have brought forth new forms of interactive
applications, such as the social metaverse, where end users interact with each
other through their virtual avatars. In such applications, precise full-body
tracking is essential for an immersive experience and a sense of embodiment
with the virtual avatar. However, current motion capture systems are not easily
accessible to end users due to their high cost, the requirement for special
skills to operate them, or the discomfort associated with wearable devices. In
this paper, we present MOVIN, a data-driven generative method for real-time
motion capture with global tracking, using a single LiDAR sensor. Our
autoregressive conditional variational autoencoder (CVAE) model learns the
distribution of pose variations conditioned on the given 3D point cloud from
LiDAR. As a central factor for high-accuracy motion capture, we propose a novel
feature encoder to learn the correlation between the historical 3D point cloud
data and global, local pose features, resulting in effective learning of the
pose prior. Global pose features include root translation, rotation, and foot
contacts, while local features comprise joint positions and rotations.
Subsequently, a pose generator takes into account the sampled latent variable
along with the features from the previous frame to generate a plausible current
pose. Our framework accurately predicts the performer's 3D global information
and local joint details while effectively considering temporally coherent
movements across frames. We demonstrate the effectiveness of our architecture
through quantitative and qualitative evaluations, comparing it against
state-of-the-art methods. Additionally, we implement a real-time application to
showcase our method in real-world scenarios. The MOVIN dataset is available at
https://movin3d.github.io/movin_pg2023/.
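The abstract describes a two-part design: a feature encoder that summarizes the historical LiDAR point cloud together with global and local pose features, and a CVAE pose generator that samples the current pose conditioned on those features and the previous frame. The following is a minimal PyTorch sketch of that general pattern; the module structure, layer sizes, and pose dimensionality are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Illustrative stand-in for the paper's feature encoder: maps a LiDAR
    point cloud (N points x 3) to a fixed-size, permutation-invariant feature."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                                 nn.Linear(128, feat_dim))

    def forward(self, points):                      # points: (B, N, 3)
        return self.mlp(points).max(dim=1).values   # max-pool over points

class AutoregressiveCVAE(nn.Module):
    """Sketch of a CVAE that samples the current pose conditioned on the
    point-cloud feature and the previous frame's pose."""
    def __init__(self, pose_dim=135, feat_dim=256, z_dim=32):
        super().__init__()
        self.z_dim = z_dim
        cond_dim = feat_dim + pose_dim              # condition: point cloud + previous pose
        self.enc = nn.Sequential(nn.Linear(pose_dim + cond_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
                                 nn.Linear(256, pose_dim))

    def forward(self, pose_t, pose_prev, pc_feat):  # training pass
        cond = torch.cat([pc_feat, pose_prev], dim=-1)
        h = self.enc(torch.cat([pose_t, cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(torch.cat([z, cond], dim=-1)), mu, logvar

    @torch.no_grad()
    def generate(self, pose_prev, pc_feat):         # real-time rollout step
        cond = torch.cat([pc_feat, pose_prev], dim=-1)
        z = torch.randn(cond.shape[0], self.z_dim, device=cond.device)
        return self.dec(torch.cat([z, cond], dim=-1))
```

At run time such a model is rolled out autoregressively: the pose generated at frame t is fed back as pose_prev when decoding frame t+1, which is what encourages temporally coherent movement across frames.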
Related papers
- Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories [28.701879490459675]
We aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain.
We exploit the intrinsic regularization provided by SIREN and modify the input layer to produce a temporally smooth motion field.
Our experiments assess the model's performance in predicting unseen point trajectories and its application in temporal mesh alignment with deformation.
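SIREN here refers to implicit networks with sinusoidal activations, whose smoothness provides the intrinsic regularization the summary mentions. A minimal sketch of a SIREN-style implicit motion field, assuming a 3D point plus time as input and a displacement as output; the dimensions and the w0 frequency are illustrative:

```python
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    """One SIREN layer: a linear map followed by sin(w0 * x).
    w0 = 30 is the frequency scale suggested by the SIREN authors."""
    def __init__(self, in_dim, out_dim, w0=30.0):
        super().__init__()
        self.linear, self.w0 = nn.Linear(in_dim, out_dim), w0

    def forward(self, x):
        return torch.sin(self.w0 * self.linear(x))

class MotionField(nn.Module):
    """Hypothetical implicit motion field f(point, time) -> displacement.
    Temporal smoothness falls out of the sinusoidal activations."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(SineLayer(4, hidden),   # input: (x, y, z, t)
                                 SineLayer(hidden, hidden),
                                 nn.Linear(hidden, 3))   # output: 3D displacement

    def forward(self, xyz, t):                           # xyz: (B, 3), t: (B, 1)
        return self.net(torch.cat([xyz, t], dim=-1))
```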
arXiv Detail & Related papers (2024-06-05T21:02:10Z)
- Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction [13.422686350235615]
We aim to measure the impact of motion signal artifacts on the reconstruction of the articulated self-avatar's full-body pose.
We analyze the motion reconstruction errors using ground truth and 3D Cartesian coordinates estimated from YOLOv8 pose estimation.
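The summary does not name the exact error metric; a standard choice for comparing reconstructed joint positions against ground truth is the mean per-joint position error (MPJPE), sketched here for illustration:

```python
import torch

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance between
    predicted and ground-truth joints. pred, gt: (frames, joints, 3),
    expressed in the same units (e.g., millimetres)."""
    return (pred - gt).norm(dim=-1).mean()
```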
arXiv Detail & Related papers (2024-04-29T12:02:06Z)
- WANDR: Intention-guided Human Motion Generation [67.07028110459787]
We introduce WANDR, a data-driven model that takes an avatar's initial pose and a goal's 3D position and generates natural human motions that place the end effector (wrist) on the goal location.
Intention guides the agent to the goal, and interactively adapts the generation to novel situations without needing to define sub-goals or the entire motion path.
We evaluate our method extensively and demonstrate its ability to generate natural, long-term motions that reach 3D goals and generalize to unseen goal locations.
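The "intention" conditioning can be read as goal-relative features recomputed at every step, so the generation adapts online without planned sub-goals. A hedged sketch of that idea; the feature set and the step network are assumptions, not WANDR's actual design:

```python
import torch
import torch.nn as nn

def intention_features(wrist_pos, goal_pos, root_pos):
    """Hypothetical goal-relative features, recomputed every frame: the
    vector from wrist to goal, its distance, and the goal in root-relative
    coordinates. WANDR's actual feature set may differ."""
    to_goal = goal_pos - wrist_pos                       # (B, 3)
    dist = to_goal.norm(dim=-1, keepdim=True)            # (B, 1)
    return torch.cat([to_goal, dist, goal_pos - root_pos], dim=-1)

class GoalConditionedStep(nn.Module):
    """One autoregressive step: previous pose + intention -> next pose."""
    def __init__(self, pose_dim=135, feat_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(pose_dim + feat_dim, 256), nn.ReLU(),
                                 nn.Linear(256, pose_dim))

    def forward(self, pose_prev, intent):
        return pose_prev + self.net(torch.cat([pose_prev, intent], dim=-1))
```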
arXiv Detail & Related papers (2024-04-23T10:20:17Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling [13.284947022380404]
We propose a two-stage framework that can obtain accurate and smooth full-body motions from only three tracking signals: the head and the two hands.
Our framework explicitly models the joint-level features in the first stage and utilizes them as spatio-temporal tokens for alternating spatial and temporal transformer blocks to capture joint-level correlations in the second stage.
With extensive experiments on the AMASS motion dataset and real-captured data, we show our proposed method can achieve more accurate and smooth motion compared to existing approaches.
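The alternating design can be pictured as attention over the joint axis within each frame (spatial), followed by attention over the time axis for each joint (temporal). Below is a generic PyTorch sketch of that pattern, not the paper's code; the tensor layout and layer settings are assumptions:

```python
import torch
import torch.nn as nn

class AlternatingSTBlock(nn.Module):
    """One alternating block: self-attention over joints within each frame,
    then self-attention over frames for each joint."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.spatial = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(dim, heads, batch_first=True)

    def forward(self, tokens):              # tokens: (B, T, J, dim)
        B, T, J, D = tokens.shape
        x = tokens.reshape(B * T, J, D)     # attend across joints per frame
        x = self.spatial(x).reshape(B, T, J, D)
        x = x.permute(0, 2, 1, 3).reshape(B * J, T, D)  # attend across time per joint
        x = self.temporal(x).reshape(B, J, T, D).permute(0, 2, 1, 3)
        return x                            # back to (B, T, J, dim)
```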
arXiv Detail & Related papers (2023-08-17T08:27:55Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- UmeTrack: Unified multi-view end-to-end hand tracking for VR [34.352638006495326]
Real-time tracking of 3D hand pose in world space is a challenging problem and plays an important role in VR interaction.
We present a unified end-to-end differentiable framework for multi-view, multi-frame hand tracking that directly predicts 3D hand pose in world space.
arXiv Detail & Related papers (2022-10-31T19:09:21Z)
- Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed Ret3D.
At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z)
- LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence.
Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence.
We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z)
- Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation [59.94032196768748]
We propose a relative information encoding method that yields positional and temporal enhanced representations.
Our method outperforms state-of-the-art methods on two public datasets.
arXiv Detail & Related papers (2021-07-29T14:12:19Z)
- Robust Motion In-betweening [17.473287573543065]
We present a novel, robust transition generation technique that can serve as a new tool for 3D animators.
The system synthesizes high-quality motions that use temporally sparse keyframes as animation constraints.
We present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios.
arXiv Detail & Related papers (2021-02-09T16:52:45Z)
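As a point of reference for the in-betweening task, the naive baseline is to interpolate pose vectors between the two keyframes that bracket the query time; learned models such as the one above replace this with a generator that produces natural, non-linear transitions. A sketch of that baseline, with hypothetical tensor shapes:

```python
import torch

def naive_inbetween(key_poses, key_times, t):
    """Naive in-betweening baseline: linearly interpolate the pose vectors
    of the two keyframes bracketing time t. key_poses: (K, pose_dim),
    key_times: (K,) sorted ascending, t: a float inside the keyframe range."""
    i = torch.searchsorted(key_times, t).clamp(1, len(key_times) - 1)
    t0, t1 = key_times[i - 1], key_times[i]
    w = float((t - t0) / (t1 - t0))          # fractional position between keys
    return torch.lerp(key_poses[i - 1], key_poses[i], w)
```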
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.