Related papers: Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening

URL: http://arxiv.org/abs/2509.00872v1
Date: Sun, 31 Aug 2025 14:34:11 GMT
Title: Pose as Clinical Prior: Learning Dual Representations for Scoliosis Screening
Authors: Zirui Zhou, Zizhao Peng, Dongyang Jin, Chao Fan, Fengwei An, Shiqi Yu
Abstract summary: Scoliosis1K-Pose is a 2D human pose annotation set that extends the original Scoliosis1K dataset. We introduce the Dual Representation Framework (DRF), which integrates a continuous skeleton map with a discrete Postural Asymmetry Vector (PAV). A novel PAV-Guided Attention (PGA) module further uses the PAV as a clinical prior to direct feature extraction.
Score: 15.000310203998012
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent AI-based scoliosis screening methods primarily rely on large-scale silhouette datasets, often neglecting clinically relevant postural asymmetries, which are key indicators in traditional screening. In contrast, pose data provide an intuitive skeletal representation, enhancing clinical interpretability across various medical applications. However, pose-based scoliosis screening remains underexplored due to two main challenges: (1) the scarcity of large-scale, annotated pose datasets; and (2) the discrete and noise-sensitive nature of raw pose coordinates, which hinders the modeling of subtle asymmetries. To address these limitations, we introduce Scoliosis1K-Pose, a 2D human pose annotation set that extends the original Scoliosis1K dataset, comprising 447,900 frames of 2D keypoints from 1,050 adolescents. Building on this dataset, we introduce the Dual Representation Framework (DRF), which integrates a continuous skeleton map to preserve spatial structure with a discrete Postural Asymmetry Vector (PAV) that encodes clinically relevant asymmetry descriptors. A novel PAV-Guided Attention (PGA) module further uses the PAV as a clinical prior to direct feature extraction from the skeleton map, focusing on clinically meaningful asymmetries. Extensive experiments demonstrate that DRF achieves state-of-the-art performance. Visualizations further confirm that the model leverages clinical asymmetry cues to guide feature extraction and promote synergy between its dual representations. The dataset and code are publicly available at https://zhouzi180.github.io/Scoliosis1K/.
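Illustrative sketch: the abstract does not define the components of the PAV, so the following is only a rough picture of what asymmetry descriptors computed from 2D keypoints might look like, not the authors' method. The COCO-style keypoint names, the choice of descriptors (left/right height offsets, line tilt, lateral trunk shift), and the torso-length normalization are all assumptions made for illustration.

```python
import math

# Hypothetical left/right keypoint pairs (COCO-style names). The actual
# keypoint set and PAV definition used in the paper are not given here.
PAIRS = [("left_shoulder", "right_shoulder"), ("left_hip", "right_hip")]


def asymmetry_vector(kp):
    """Return illustrative asymmetry descriptors for one frame.

    kp maps a keypoint name to an (x, y) pixel coordinate
    (image convention: y grows downward).
    """
    ls, rs = kp["left_shoulder"], kp["right_shoulder"]
    lh, rh = kp["left_hip"], kp["right_hip"]
    mid_shoulder = ((ls[0] + rs[0]) / 2, (ls[1] + rs[1]) / 2)
    mid_hip = ((lh[0] + rh[0]) / 2, (lh[1] + rh[1]) / 2)

    # Torso length makes the descriptors independent of image scale.
    torso = math.dist(mid_shoulder, mid_hip)
    if torso == 0:
        raise ValueError("degenerate pose: shoulder and hip midpoints coincide")

    features = []
    for left, right in PAIRS:
        (lx, ly), (rx, ry) = kp[left], kp[right]
        # Normalized vertical offset between the left and right joints.
        features.append((ly - ry) / torso)
        # Tilt of the left-right line relative to horizontal, in radians.
        features.append(math.atan2(ly - ry, abs(lx - rx)))

    # Lateral shift of the shoulder midpoint relative to the hip midpoint.
    features.append((mid_shoulder[0] - mid_hip[0]) / torso)
    return features
```

For a perfectly symmetric pose every entry is zero; raising one shoulder changes only the shoulder entries, which is the kind of localized cue an attention module could be conditioned on.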

Related papers

Preoperative-to-intraoperative Liver Registration for Laparoscopic Surgery via Latent-Grounded Correspondence Constraints [51.7011449975586]
Land-Reg is a deformable registration framework that learns latent-grounded 2D-3D landmark correspondences. For rigid registration, Land-Reg embraces a Cross-modal Latent Alignment module. An Uncertainty-enhanced Overlap Landmark Detector with similarity matching is proposed to robustly estimate explicit 2D-3D landmark correspondences.
arXiv Detail & Related papers (2026-03-02T10:44:03Z)
Context-Aware Asymmetric Ensembling for Interpretable Retinopathy of Prematurity Screening via Active Query and Vascular Attention [1.8420107091891775]
Retinopathy of Prematurity (ROP) is among the major causes of preventable childhood blindness. Current deep learning models depend heavily on large private datasets and passive multimodal fusion. We propose the Context-Aware Asymmetric Ensemble Model (CAA Ensemble) that simulates clinical reasoning through two specialized streams.
arXiv Detail & Related papers (2026-02-05T02:06:26Z)
A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z)
Silhouette-to-Contour Registration: Aligning Intraoral Scan Models with Cephalometric Radiographs [10.70146635420186]
We propose DentalSCR, a pose-stable, contour-guided framework for accurate and interpretable silhouette-to-contour registration. We evaluate DentalSCR on 34 expert-annotated clinical cases.
arXiv Detail & Related papers (2025-11-18T10:50:04Z)
Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation [61.350584471060756]
Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images. We propose Self-Supervised Anatomical Consistency Learning (SS-ACL) to align generated reports with corresponding anatomical regions. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy.
arXiv Detail & Related papers (2025-09-30T08:59:06Z)
Semantic Segmentation for Preoperative Planning in Transcatheter Aortic Valve Replacement [61.573750959726475]
We consider medical guidelines for preoperative planning of the transcatheter aortic valve replacement (TAVR) and identify tasks that may be supported via semantic segmentation models. We first derive fine-grained TAVR-relevant pseudo-labels from coarse-grained anatomical information, in order to train segmentation models and quantify how well they are able to find these structures in the scans.
arXiv Detail & Related papers (2025-07-22T13:24:45Z)
Towards Unconstrained 2D Pose Estimation of the Human Spine [12.131745767490298]
SpineTrack is the first comprehensive dataset for 2D spine pose estimation in unconstrained settings. We introduce SpinePose, extending state-of-the-art body pose estimators using knowledge distillation and an anatomical regularization strategy to jointly predict body and spine keypoints.
arXiv Detail & Related papers (2025-04-10T20:11:02Z)
Geo-UNet: A Geometrically Constrained Neural Framework for Clinical-Grade Lumen Segmentation in Intravascular Ultrasound [7.760705377465734]
Current segmentation networks like the UNet lack the precision needed for clinical adoption in IVUS. We propose the Geo-UNet framework to address these issues via a design informed by the geometry of the segmentation task. The efficacy of our framework on a venous IVUS dataset is shown against state-of-the-art models.
arXiv Detail & Related papers (2024-08-09T02:55:25Z)
Class-Aware Cartilage Segmentation for Autonomous US-CT Registration in Robotic Intercostal Ultrasound Imaging [39.597735935731386]
A class-aware cartilage bone segmentation network with geometry-constraint post-processing is presented to capture patient-specific rib skeletons. A dense skeleton graph-based non-rigid registration is presented to map the intercostal scanning path from a generic template to individual patients. Results demonstrate that the proposed graph-based registration method can robustly and precisely map the path from CT template to individual patients.
arXiv Detail & Related papers (2024-06-06T14:15:15Z)
Thoracic Cartilage Ultrasound-CT Registration using Dense Skeleton Graph [49.11220791279602]
It is challenging to accurately map planned paths from a generic atlas to individual patients, particularly for thoracic applications. A graph-based non-rigid registration is proposed to enable transferring planned paths from the atlas to the current setup.
arXiv Detail & Related papers (2023-07-07T18:57:21Z)
Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually. Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data. We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas. This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z)
Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling [66.75096111651062]
We created a large-scale dataset of 10,021 thoracic CTs with 157 labels. We applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. Our resulting segmentation models demonstrated remarkable performance on CXR.
arXiv Detail & Related papers (2023-06-06T18:01:08Z)
Multi-structure bone segmentation in pediatric MR images with combined regularization from shape priors and adversarial network [0.4588028371034407]
We propose a new pre-trained regularized convolutional encoder-decoder network for the challenging task of segmenting heterogeneous pediatric magnetic resonance (MR) images. In order to obtain globally consistent predictions, we incorporate a shape priors based regularization, derived from a non-linear shape representation learnt by an auto-encoder. The proposed method performed either better or at par with previously proposed approaches for Dice, sensitivity, specificity, maximum symmetric surface distance, average symmetric surface distance, and relative absolute volume difference metrics.
arXiv Detail & Related papers (2020-09-15T13:39:53Z)
Anatomy-Aware Siamese Network: Exploiting Semantic Asymmetry for Accurate Pelvic Fracture Detection in X-ray Images [36.35987775099686]
We propose a novel fracture detection framework that builds upon a Siamese network enhanced with a spatial transformer layer. Our proposed method has been extensively evaluated on 2,359 PXRs from unique patients. This is the highest among state-of-the-art fracture detection methods, with improved clinical indications.
arXiv Detail & Related papers (2020-07-03T02:33:24Z)
Appearance Learning for Image-based Motion Estimation in Tomography [60.980769164955454]
In tomographic imaging, anatomical structures are reconstructed by applying a pseudo-inverse forward model to acquired signals. Patient motion corrupts the geometry alignment in the reconstruction process resulting in motion artifacts. We propose an appearance learning approach recognizing the structures of rigid motion independently from the scanned object.
arXiv Detail & Related papers (2020-06-18T09:49:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.