Related papers: Enhanced Landmark Detection Model in Pelvic Fluoroscopy using 2D/3D Registration Loss

Enhanced Landmark Detection Model in Pelvic Fluoroscopy using 2D/3D Registration Loss

URL: http://arxiv.org/abs/2511.21575v1
Date: Wed, 26 Nov 2025 16:50:06 GMT
Title: Enhanced Landmark Detection Model in Pelvic Fluoroscopy using 2D/3D Registration Loss
Authors: Chou Mo, Yehyun Suh, J. Ryan Martin, Daniel Moyer,
Abstract summary: We propose a novel framework that incorporates 2D/3D landmark registration into the training of a U-Net landmark prediction model.<n>We analyze the performance difference by comparing landmark detection accuracy between the baseline U-Net, U-Net trained with Pose Estimation Loss, and U-Net fine-tuned with Pose Estimation Loss under realistic intra-operative conditions.
Score: 1.6420503030062876
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automated landmark detection offers an efficient approach for medical professionals to understand patient anatomic structure and positioning using intra-operative imaging. While current detection methods for pelvic fluoroscopy demonstrate promising accuracy, most assume a fixed Antero-Posterior view of the pelvis. However, orientation often deviates from this standard view, either due to repositioning of the imaging unit or of the target structure itself. To address this limitation, we propose a novel framework that incorporates 2D/3D landmark registration into the training of a U-Net landmark prediction model. We analyze the performance difference by comparing landmark detection accuracy between the baseline U-Net, U-Net trained with Pose Estimation Loss, and U-Net fine-tuned with Pose Estimation Loss under realistic intra-operative conditions where patient pose is variable.

Related papers

Preoperative-to-intraoperative Liver Registration for Laparoscopic Surgery via Latent-Grounded Correspondence Constraints [51.7011449975586]
Land-Reg is a deformable registration framework that learns latent-grounded 2D-3D landmark correspondences.<n>For rigid registration, Land-Reg embraces a Cross-modal Latent Alignment module.<n>An Uncertainty-enhanced Overlap Landmark Detector with similarity matching is proposed to robustly estimate explicit 2D-3D landmark correspondences.
arXiv Detail & Related papers (2026-03-02T10:44:03Z)
BronchOpt : Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation [6.915058920280426]
We propose a vision-based pose optimization framework for 2D-3D registration between intra-operative endoscopic views and pre-operative CT anatomy.<n>A fine-tuned modality- and domain-invariant encoder enables direct similarity between real endoscopic RGB frames and CT-rendered depth maps.<n>Our model achieves an average translational error of 2.65 mm and a rotational error of 0.19 rad, demonstrating accurate and stable localization.
arXiv Detail & Related papers (2025-11-12T15:58:05Z)
EndoSfM3D: Learning to 3D Reconstruct Any Endoscopic Surgery Scene using Self-supervised Foundation Model [2.8913847481700667]
3D reconstruction of endoscopic surgery scenes plays a vital role in enhancing scene perception, enabling AR visualization, and supporting context-aware decision-making in image-guided surgery.<n>In intrinsic calibration is hindered by sterility constraints and the use of specialized endoscopes with continuous zoom and telescope rotation.<n>In this paper, we integrate intrinsic parameter estimation into a self-supervised monocular depth estimation framework by adapting the Depth Anything V2 (DA2) model for joint depth, pose, and intrinsics prediction.<n>Our method is validated on the SCARED and C3VD public datasets, demonstrating superior performance compared to recent state-of-the
arXiv Detail & Related papers (2025-10-25T16:39:04Z)
Landmark-Free Preoperative-to-Intraoperative Registration in Laparoscopic Liver Resection [50.388465935739376]
Liver registration by overlaying preoperative 3D models onto intraoperative 2D frames can assist surgeons in perceiving the spatial anatomy of the liver clearly for a higher surgical success rate.<n>Existing registration methods rely heavily on anatomical landmark-based, which encounter two major limitations.<n>We propose a landmark-free preoperative-to-intraoperative registration framework utilizing effective self-supervised learning.
arXiv Detail & Related papers (2025-04-21T14:55:57Z)
Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering. Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively. We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering [5.617649111108429]
We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data. DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines.
arXiv Detail & Related papers (2023-12-11T13:05:54Z)
Robust Landmark-based Stent Tracking in X-ray Fluoroscopy [10.917460255497227]
We propose an end-to-end deep learning framework for single stent tracking. It consists of three hierarchical modules: U-Net based landmark detection, ResNet based stent proposal and feature extraction. Experiments show that our method performs significantly better in detection compared with the state-of-the-art point-based tracking models.
arXiv Detail & Related papers (2022-07-20T14:20:03Z)
Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations. We derive suitable measures to quantify prediction uncertainty at both pose and joint level. We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection [76.01333073259677]
We propose the use of Space-aware Memory Queues for In-painting and Detecting anomalies from radiography images (abbreviated as SQUID) We show that SQUID can taxonomize the ingrained anatomical structures into recurrent patterns; and in the inference, it can identify anomalies (unseen/modified patterns) in the image.
arXiv Detail & Related papers (2021-11-26T13:47:34Z)
Tattoo tomography: Freehand 3D photoacoustic image reconstruction with an optical pattern [49.240017254888336]
Photoacoustic tomography (PAT) is a novel imaging technique that can resolve both morphological and functional tissue properties. A current drawback is the limited field-of-view provided by the conventionally applied 2D probes. We present a novel approach to 3D reconstruction of PAT data that does not require an external tracking system.
arXiv Detail & Related papers (2020-11-10T09:27:56Z)
Appearance Learning for Image-based Motion Estimation in Tomography [60.980769164955454]
In tomographic imaging, anatomical structures are reconstructed by applying a pseudo-inverse forward model to acquired signals. Patient motion corrupts the geometry alignment in the reconstruction process resulting in motion artifacts. We propose an appearance learning approach recognizing the structures of rigid motion independently from the scanned object.
arXiv Detail & Related papers (2020-06-18T09:49:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.