Towards Unconstrained 2D Pose Estimation of the Human Spine
- URL: http://arxiv.org/abs/2504.08110v1
- Date: Thu, 10 Apr 2025 20:11:02 GMT
- Title: Towards Unconstrained 2D Pose Estimation of the Human Spine
- Authors: Muhammad Saif Ullah Khan, Stephan Krauß, Didier Stricker
- Abstract summary: SpineTrack is the first comprehensive dataset for 2D spine pose estimation in unconstrained settings. We introduce SpinePose, extending state-of-the-art body pose estimators using knowledge distillation and an anatomical regularization strategy to jointly predict body and spine keypoints.
- Score: 12.131745767490298
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present SpineTrack, the first comprehensive dataset for 2D spine pose estimation in unconstrained settings, addressing a crucial need in sports analytics, healthcare, and realistic animation. Existing pose datasets often simplify the spine to a single rigid segment, overlooking the nuanced articulation required for accurate motion analysis. In contrast, SpineTrack annotates nine detailed spinal keypoints across two complementary subsets: a synthetic set comprising 25k annotations created using Unreal Engine with biomechanical alignment through OpenSim, and a real-world set comprising over 33k annotations curated via an active learning pipeline that iteratively refines automated annotations with human feedback. This integrated approach ensures anatomically consistent labels at scale, even for challenging, in-the-wild images. We further introduce SpinePose, extending state-of-the-art body pose estimators using knowledge distillation and an anatomical regularization strategy to jointly predict body and spine keypoints. Our experiments in both general and sports-specific contexts validate the effectiveness of SpineTrack for precise spine pose estimation, establishing a robust foundation for future research in advanced biomechanical analysis and 3D spine reconstruction in the wild.
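The abstract names two training ingredients for SpinePose, knowledge distillation and an anatomical regularization strategy, but does not give their form. The following is only a rough sketch of how such a joint objective could be assembled, not the authors' loss: the function names, the choice of a bend-angle penalty as the anatomical prior, and all weights are hypothetical and chosen for illustration.

```python
import math

# Hypothetical sketch: the abstract does not specify the loss, so every term
# and weight below is an illustrative assumption, not the SpinePose method.


def mse(pred, target):
    """Mean squared distance between two equal-length lists of (x, y) points."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, target)) / len(pred)


def smoothness_penalty(chain):
    """Assumed anatomical prior: penalize sharp bends along the spine chain.

    Returns the mean of (1 - cos(angle)) between adjacent segments, which is
    0 for a straight chain and grows as neighbouring segments fold back.
    """
    total, count = 0.0, 0
    for a, b, c in zip(chain, chain[1:], chain[2:]):
        u = (b[0] - a[0], b[1] - a[1])
        v = (c[0] - b[0], c[1] - b[1])
        nu, nv = math.hypot(*u), math.hypot(*v)
        if nu == 0 or nv == 0:
            continue  # skip degenerate (coincident) keypoints
        cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
        total += 1.0 - cos
        count += 1
    return total / count if count else 0.0


def joint_loss(student_body, teacher_body, student_spine, label_spine,
               w_distill=1.0, w_spine=1.0, w_anat=0.1):
    """Weighted sum of three assumed terms.

    distill: student body keypoints should match a frozen teacher's output.
    spine:   student spine keypoints should match the annotated labels.
    anat:    predicted spine chain should not bend implausibly.
    """
    distill = mse(student_body, teacher_body)
    spine = mse(student_spine, label_spine)
    anat = smoothness_penalty(student_spine)
    return w_distill * distill + w_spine * spine + w_anat * anat
```

In this sketch the distillation term keeps the body keypoints close to a pretrained teacher while the spine keypoints are supervised by the new annotations; a real implementation would operate on heatmaps or batched tensors rather than Python lists.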
Related papers
- Preoperative-to-intraoperative Liver Registration for Laparoscopic Surgery via Latent-Grounded Correspondence Constraints [51.7011449975586]
Land-Reg is a deformable registration framework that learns latent-grounded 2D-3D landmark correspondences.
For rigid registration, Land-Reg embraces a Cross-modal Latent Alignment module.
An Uncertainty-enhanced Overlap Landmark Detector with similarity matching is proposed to robustly estimate explicit 2D-3D landmark correspondences.
arXiv Detail & Related papers (2026-03-02T10:44:03Z)
- SIMSPINE: A Biomechanics-Aware Simulation Framework for 3D Spine Motion Annotation and Benchmarking [19.28827026574636]
We present a biomechanics-aware keypoint simulation framework that augments human pose datasets with anatomically consistent 3D spinal keypoints.
We create the first open dataset, named SIMSPINE, which provides sparse vertebra-level 3D spinal annotations for natural full-body motions.
With 2.14 million frames, this enables data-driven learning of vertebral kinematics from subtle posture variations.
arXiv Detail & Related papers (2026-02-24T11:31:20Z)
- SLD: Segmentation-Based Landmark Detection for Spinal Ligaments [0.20999222360659606]
In biomechanical modeling, the representation of ligament attachments is crucial for a realistic simulation of the forces acting between the vertebrae.
This work presents a novel approach for detecting spinal ligament landmarks, which first performs shape-based segmentation of 3D vertebrae.
The proposed method outperforms existing approaches by achieving high accuracy and demonstrating strong generalization across all spinal regions.
arXiv Detail & Related papers (2026-01-23T14:29:44Z)
- Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening [15.000310203998012]
Scoliosis1K-Pose is a 2D human pose annotation set that extends the original Scoliosis1K dataset.
We introduce the Dual Representation Framework (DRF), which integrates a continuous skeleton map with a discrete Postural Asymmetry Vector (PAV).
A novel PAV-Guided Attention (PGA) module further uses the PAV as clinical prior to direct feature extraction.
arXiv Detail & Related papers (2025-08-31T14:34:11Z)
- Best Foot Forward: Robust Foot Reconstruction in-the-wild [2.059210052546126]
We present a novel end-to-end pipeline that refines Structure-from-Motion (SfM) reconstruction.
It first resolves scan alignment ambiguities using SE(3) canonicalization with a viewpoint prediction module, then completes missing geometry through an attention-based network trained on synthetically augmented point clouds.
Our approach achieves state-of-the-art performance on reconstruction metrics while preserving clinically validated anatomical fidelity.
arXiv Detail & Related papers (2025-02-27T20:40:20Z)
- Spinal ligaments detection on vertebrae meshes using registration and 3D edge detection [0.4194295877935868]
The proposed method is able to detect 66 spinal ligament attachment points by using a step-wise approach.
The landmark detection requires approximately 3.0 seconds per vertebra, providing a substantial improvement over existing methods.
arXiv Detail & Related papers (2024-12-06T14:39:06Z)
- Reconstruction of 3D lumbar spine models from incomplete segmentations using landmark detection [0.4194295877935868]
We present a novel method to reconstruct complete 3D lumbar spine models from incomplete 3D vertebral bodies.
Our method achieves the registration of the entire lumbar spine, spanning segments L1 to L5, in just 0.14 seconds.
arXiv Detail & Related papers (2024-12-06T14:23:42Z)
- 3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information [2.457872341625575]
A novel Semantic Graph Attention Network can benefit from the ability of self-attention to capture global context.
A Body Part Decoder assists in extracting and refining the information related to specific segments of the body.
A Geometry Loss makes a critical constraint on the structural skeleton of the body, ensuring that the model's predictions adhere to the natural limits of human posture.
arXiv Detail & Related papers (2024-06-03T10:59:00Z)
- SkelFormer: Markerless 3D Pose and Shape Estimation using Skeletal Transformers [57.46911575980854]
We introduce SkelFormer, a novel markerless motion capture pipeline for multi-view human pose and shape estimation.
Our method first uses off-the-shelf 2D keypoint estimators, pre-trained on large-scale in-the-wild data, to obtain 3D joint positions.
Next, we design a regression-based inverse-kinematic skeletal transformer that maps the joint positions to pose and shape representations from heavily noisy observations.
arXiv Detail & Related papers (2024-04-19T04:51:18Z)
- 3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data [4.130944152992895]
We propose a novel biomechanics-aware network that directly outputs 3D kinematics from two input views.
Our experiments demonstrate that the proposed approach, only trained on synthetic data, outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2024-02-20T17:33:40Z)
- Vogtareuth Rehab Depth Datasets: Benchmark for Marker-less Posture Estimation in Rehabilitation [55.41644538483948]
We propose two rehabilitation-specific pose datasets containing depth images and 2D pose information of patients performing rehab exercises.
We use a state-of-the-art marker-less posture estimation model which is trained on a non-rehab benchmark dataset.
We show that our dataset can be used to train pose models to detect rehab-specific complex postures.
arXiv Detail & Related papers (2021-08-23T16:18:26Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency [72.82534577726334]
We introduce MotioNet, a deep neural network that directly reconstructs the motion of a 3D human skeleton from monocular video.
Our method is the first data-driven approach that directly outputs a kinematic skeleton, which is a complete, commonly used, motion representation.
arXiv Detail & Related papers (2020-06-22T08:50:09Z)
- Appearance Learning for Image-based Motion Estimation in Tomography [60.980769164955454]
In tomographic imaging, anatomical structures are reconstructed by applying a pseudo-inverse forward model to acquired signals.
Patient motion corrupts the geometry alignment in the reconstruction process resulting in motion artifacts.
We propose an appearance learning approach recognizing the structures of rigid motion independently from the scanned object.
arXiv Detail & Related papers (2020-06-18T09:49:11Z)
- Anatomy-aware 3D Human Pose Estimation with Bone-based Pose Decomposition [92.99291528676021]
Instead of directly regressing the 3D joint locations, we decompose the task into bone direction prediction and bone length prediction.
Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time.
Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets.
arXiv Detail & Related papers (2020-02-24T15:49:37Z)