End-to-End Motion Capture from Rigid Body Markers with Geodesic Loss
- URL: http://arxiv.org/abs/2511.16418v1
- Date: Thu, 20 Nov 2025 14:43:05 GMT
- Title: End-to-End Motion Capture from Rigid Body Markers with Geodesic Loss
- Authors: Hai Lan, Zongyan Li, Jianmin Hu, Jialing Yang, Houde Dai,
- Abstract summary: Marker-based optical motion capture (MoCap) faces practical challenges, such as time-consuming preparation and marker identification ambiguity. We introduce a novel fundamental unit for MoCap, the Rigid Body Marker (RBM), which provides unambiguous 6-DoF data. We develop a deep-learning-based regression model that directly estimates SMPL parameters under a geodesic loss.
- Score: 3.8338194488710453
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Marker-based optical motion capture (MoCap), while long regarded as the gold standard for accuracy, faces practical challenges, such as time-consuming preparation and marker identification ambiguity, due to its reliance on dense marker configurations, which fundamentally limit its scalability. To address this, we introduce a novel fundamental unit for MoCap, the Rigid Body Marker (RBM), which provides unambiguous 6-DoF data and drastically simplifies setup. Leveraging this new data modality, we develop a deep-learning-based regression model that directly estimates SMPL parameters under a geodesic loss. This end-to-end approach matches the performance of optimization-based methods while requiring over an order of magnitude less computation. Trained on synthesized data from the AMASS dataset, our end-to-end model achieves state-of-the-art accuracy in body pose estimation. Real-world data captured using a Vicon optical tracking system further demonstrates the practical viability of our approach. Overall, the results show that combining sparse 6-DoF RBM with a manifold-aware geodesic loss yields a practical and high-fidelity solution for real-time MoCap in graphics, virtual reality, and biomechanics.
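As an illustration of the "geodesic loss" mentioned in the abstract, here is a minimal NumPy sketch of the standard geodesic distance on SO(3), i.e. the rotation angle between a predicted and a reference rotation matrix. The abstract does not give the paper's exact formulation, so the function names (`geodesic_distance`, `geodesic_loss`, `rot_z`) and the mean reduction over joints are assumptions made for illustration only.

```python
import numpy as np


def geodesic_distance(R_pred, R_true):
    """Rotation angle (radians) between 3x3 rotation matrices.

    Accepts a single matrix or a batch with shape (..., 3, 3).
    """
    R_pred = np.asarray(R_pred, dtype=float)
    R_true = np.asarray(R_true, dtype=float)
    # Relative rotation R_pred^T @ R_true; its angle is the geodesic distance.
    R_rel = np.swapaxes(R_pred, -1, -2) @ R_true
    trace = np.trace(R_rel, axis1=-2, axis2=-1)
    # trace(R) = 1 + 2*cos(theta); clip to guard against round-off.
    cos_theta = np.clip((trace - 1.0) / 2.0, -1.0, 1.0)
    return np.arccos(cos_theta)


def geodesic_loss(R_pred, R_true):
    """Mean geodesic distance over the batch (assumed reduction)."""
    return float(np.mean(geodesic_distance(R_pred, R_true)))


def rot_z(angle):
    """Hypothetical helper: rotation about the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


if __name__ == "__main__":
    pred = np.stack([rot_z(0.2), rot_z(1.0)])
    true = np.stack([rot_z(0.5), rot_z(0.0)])
    print(geodesic_loss(pred, true))
```

Unlike an element-wise L2 loss on rotation matrices or axis-angle vectors, this distance depends only on the relative rotation, which is what "manifold-aware" refers to. In an autodiff framework the clipping bounds are usually tightened slightly (for example to ±(1 − 1e-7)) because the gradient of arccos is unbounded at ±1.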
Related papers
- GEM+: Scalable State-of-the-Art Private Synthetic Data with Generator Networks [9.432150710329607]
We introduce GEM+, which integrates AIM's adaptive measurement framework with GEM's scalable generator network. Our experiments show that GEM+ outperforms AIM in both utility and scalability, delivering state-of-the-art results.
arXiv Detail & Related papers (2025-11-12T19:18:43Z)
- MEG-GPT: A transformer-based foundation model for magnetoencephalography data [6.336623115095147]
Recent advances in deep learning have enabled significant progress in other domains, such as language and vision, by using foundation models at scale. Here, we introduce MEG-GPT, a transformer-based foundation model that uses time-attention and next time-point prediction. We trained MEG-GPT on tokenised brain region time-courses extracted from a large-scale MEG dataset.
arXiv Detail & Related papers (2025-10-20T20:18:38Z)
- Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposes a systematic error analysis method, and emphasizes the importance of cross-dataset evaluation. An open-source toolkit has been released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z)
- DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion [28.146811420532455]
We introduce DVLO4D, a novel visual-LiDAR odometry framework that leverages sparse spatial-temporal fusion to enhance accuracy and robustness. Our method is highly efficient, with an inference time of 82 ms, giving it the potential for real-time deployment.
arXiv Detail & Related papers (2025-09-07T11:43:11Z)
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
- LSM-2: Learning from Incomplete Wearable Sensor Data [65.58595667477505]
This paper introduces the second generation of Large Sensor Model (LSM-2) with Adaptive and Inherited Masking (AIM). AIM learns robust representations directly from incomplete data without requiring explicit imputation. Our LSM-2 with AIM achieves the best performance across a diverse range of tasks, including classification, regression and generative modeling.
arXiv Detail & Related papers (2025-06-05T17:57:11Z)
- RefiDiff: Refinement-Aware Diffusion for Efficient Missing Data Imputation [13.401822039640297]
Missing values in high-dimensional, mixed-type datasets pose significant challenges for data imputation. We propose an innovative framework, RefiDiff, combining local machine learning predictions with a novel Mamba-based denoising network. RefiDiff outperforms state-of-the-art (SOTA) methods across missing-value settings with a 4x faster training time than DDPM-based approaches.
arXiv Detail & Related papers (2025-05-20T14:51:07Z)
- Sparse identification of nonlinear dynamics and Koopman operators with Shallow Recurrent Decoder Networks [3.1484174280822845]
We present a method that jointly solves the sensing and model identification problems with a simple implementation and efficient, robust performance. SINDy-SHRED uses Gated Recurrent Units to model sparse sensor measurements along with a shallow network decoder to reconstruct the full-temporal field from the latent state space. We conduct systematic experimental studies on PDE data such as turbulent flows, real-world sensor measurements for sea surface temperature, and direct video data.
arXiv Detail & Related papers (2025-01-23T02:18:13Z)
- AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets.
Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors.
Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- Learning representations with end-to-end models for improved remaining useful life prognostics [64.80885001058572]
The Remaining Useful Life (RUL) of equipment is defined as the duration between the current time and its failure.
We propose an end-to-end deep learning model based on multi-layer perceptron and long short-term memory layers (LSTM) to predict the RUL.
We will discuss how the proposed end-to-end model is able to achieve such good results and compare it to other deep learning and state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T16:45:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.