Related papers: End-to-End Motion Capture from Rigid Body Markers with Geodesic Loss

URL: http://arxiv.org/abs/2511.16418v1
Date: Thu, 20 Nov 2025 14:43:05 GMT
Title: End-to-End Motion Capture from Rigid Body Markers with Geodesic Loss
Authors: Hai Lan, Zongyan Li, Jianmin Hu, Jialing Yang, Houde Dai
Abstract summary: Marker-based optical motion capture (MoCap) faces practical challenges, such as time-consuming preparation and marker identification ambiguity. We introduce a novel fundamental unit for MoCap, the Rigid Body Marker (RBM), which provides unambiguous 6-DoF data. We develop a deep-learning-based regression model that directly estimates SMPL parameters under a geodesic loss.
Score: 3.8338194488710453
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Marker-based optical motion capture (MoCap), while long regarded as the gold standard for accuracy, faces practical challenges, such as time-consuming preparation and marker identification ambiguity, due to its reliance on dense marker configurations, which fundamentally limit its scalability. To address this, we introduce a novel fundamental unit for MoCap, the Rigid Body Marker (RBM), which provides unambiguous 6-DoF data and drastically simplifies setup. Leveraging this new data modality, we develop a deep-learning-based regression model that directly estimates SMPL parameters under a geodesic loss. This end-to-end approach matches the performance of optimization-based methods while requiring over an order of magnitude less computation. Trained on synthesized data from the AMASS dataset, our end-to-end model achieves state-of-the-art accuracy in body pose estimation. Real-world data captured using a Vicon optical tracking system further demonstrates the practical viability of our approach. Overall, the results show that combining sparse 6-DoF RBM with a manifold-aware geodesic loss yields a practical and high-fidelity solution for real-time MoCap in graphics, virtual reality, and biomechanics.
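The abstract does not spell out how the geodesic loss is computed, so the following is only a minimal sketch of the standard geodesic distance on SO(3): the angle of the relative rotation between a predicted and a target rotation matrix, averaged over joints. The function names (`rot_z`, `geodesic_distance`, `geodesic_loss`) are hypothetical and are not taken from the paper; a training setup would compute the same quantity in an autodiff framework rather than in plain Python.

```python
import math

def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def geodesic_distance(r_pred, r_true):
    """Angle (radians) of the relative rotation r_pred^T r_true on SO(3)."""
    # trace(A^T B) equals the sum of elementwise products of A and B.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

def geodesic_loss(pred_rotations, true_rotations):
    """Mean geodesic distance over paired lists of rotation matrices.

    Assumption: one 3x3 matrix per joint; the paper's exact reduction
    (mean, sum, weighting) is not stated in the abstract.
    """
    pairs = list(zip(pred_rotations, true_rotations))
    return sum(geodesic_distance(p, t) for p, t in pairs) / len(pairs)

if __name__ == "__main__":
    pred = [rot_z(0.3), rot_z(1.0)]
    true = [rot_z(0.5), rot_z(1.0)]
    print(geodesic_loss(pred, true))
```

Unlike an elementwise difference between rotation matrices, this distance measures the rotation angle directly, which is what makes the loss "manifold-aware" in the sense the abstract describes.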

Related papers

GEM+: Scalable State-of-the-Art Private Synthetic Data with Generator Networks [9.432150710329607]
We introduce GEM+, which integrates AIM's adaptive measurement framework with GEM's scalable generator network. Our experiments show that GEM+ outperforms AIM in both utility and scalability, delivering state-of-the-art results.
arXiv Detail & Related papers (2025-11-12T19:18:43Z)
MEG-GPT: A transformer-based foundation model for magnetoencephalography data [6.336623115095147]
Recent advances in deep learning have enabled significant progress in other domains, such as language and vision, by using foundation models at scale. Here, we introduce MEG-GPT, a transformer-based foundation model that uses time-attention and next time-point prediction. We trained MEG-GPT on tokenised brain region time-courses extracted from a large-scale MEG dataset.
arXiv Detail & Related papers (2025-10-20T20:18:38Z)
Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposes a systematic error analysis method, and emphasizes the importance of cross-dataset evaluation. An open-source toolkit has been released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z)
DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion [28.146811420532455]
We introduce DVLO4D, a novel visual-LiDAR odometry framework that leverages sparse spatial-temporal fusion to enhance accuracy and robustness. Our method is highly efficient, with an inference time of 82 ms, giving it the potential for real-time deployment.
arXiv Detail & Related papers (2025-09-07T11:43:11Z)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
LSM-2: Learning from Incomplete Wearable Sensor Data [65.58595667477505]
This paper introduces the second generation of Large Sensor Model (LSM-2) with Adaptive and Inherited Masking (AIM). AIM learns robust representations directly from incomplete data without requiring explicit imputation. Our LSM-2 with AIM achieves the best performance across a diverse range of tasks, including classification, regression and generative modeling.
arXiv Detail & Related papers (2025-06-05T17:57:11Z)
RefiDiff: Refinement-Aware Diffusion for Efficient Missing Data Imputation [13.401822039640297]
Missing values in high-dimensional, mixed-type datasets pose significant challenges for data imputation. We propose an innovative framework, RefiDiff, combining local machine learning predictions with a novel Mamba-based denoising network. RefiDiff outperforms state-of-the-art (SOTA) methods across missing-value settings with a 4x faster training time than DDPM-based approaches.
arXiv Detail & Related papers (2025-05-20T14:51:07Z)
Sparse identification of nonlinear dynamics and Koopman operators with Shallow Recurrent Decoder Networks [3.1484174280822845]
We present a method to jointly solve the sensing and model identification problems with a simple implementation and efficient, robust performance. SINDy-SHRED uses Gated Recurrent Units to model sparse sensor measurements along with a shallow network decoder to reconstruct the full-temporal field from the latent state space. We conduct systematic experimental studies on PDE data such as turbulent flows, real-world sensor measurements for sea surface temperature, and direct video data.
arXiv Detail & Related papers (2025-01-23T02:18:13Z)
AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets. Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors. Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z)
Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning. This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
Learning representations with end-to-end models for improved remaining useful life prognostics [64.80885001058572]
The Remaining Useful Life (RUL) of equipment is defined as the duration between the current time and its failure. We propose an end-to-end deep learning model based on multi-layer perceptron and long short-term memory (LSTM) layers to predict the RUL. We will discuss how the proposed end-to-end model is able to achieve such good results and compare it to other deep learning and state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T16:45:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.