Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries
- URL: http://arxiv.org/abs/2405.11677v1
- Date: Sun, 19 May 2024 21:35:12 GMT
- Title: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries
- Authors: Christiaan G. A. Viviers, Lena Filatova, Maurice Termeer, Peter H. N. de With, Fons van der Sommen,
- Abstract summary: We propose a general-purpose approach of data acquisition for 6-DoF pose estimation tasks in X-ray systems.
The proposed YOLOv5-6D pose model achieves competitive results on public benchmarks whilst being considerably faster at 42 FPS on GPU.
The model achieves a 92.41% by the 0.1 ADD-S metric, demonstrating a promising approach for enhancing surgical precision and patient outcomes.
- Score: 7.630289691590948
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Accurate 6-DoF pose estimation of surgical instruments during minimally invasive surgeries can substantially improve treatment strategies and eventual surgical outcome. Existing deep learning methods have achieved accurate results, but they require custom approaches for each object and laborious setup and training environments often stretching to extensive simulations, whilst lacking real-time computation. We propose a general-purpose approach of data acquisition for 6-DoF pose estimation tasks in X-ray systems, a novel and general purpose YOLOv5-6D pose architecture for accurate and fast object pose estimation and a complete method for surgical screw pose estimation under acquisition geometry consideration from a monocular cone-beam X-ray image. The proposed YOLOv5-6D pose model achieves competitive results on public benchmarks whilst being considerably faster at 42 FPS on GPU. In addition, the method generalizes across varying X-ray acquisition geometry and semantic image complexity to enable accurate pose estimation over different domains. Finally, the proposed approach is tested for bone-screw pose estimation for computer-aided guidance during spine surgeries. The model achieves a 92.41% by the 0.1 ADD-S metric, demonstrating a promising approach for enhancing surgical precision and patient outcomes. The code for YOLOv5-6D is publicly available at https://github.com/cviviers/YOLOv5-6D-Pose
Related papers
- Realistic Data Generation for 6D Pose Estimation of Surgical Instruments [4.226502078427161]
6D pose estimation of surgical instruments is critical to enable the automatic execution of surgical maneuvers.
In household and industrial settings, synthetic data, generated with 3D computer graphics software, has been shown as an alternative to minimize annotation costs.
We propose an improved simulation environment for surgical robotics that enables the automatic generation of large and diverse datasets.
arXiv Detail & Related papers (2024-06-11T14:59:29Z) - DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - Redefining the Laparoscopic Spatial Sense: AI-based Intra- and
Postoperative Measurement from Stereoimages [3.2039076408339353]
We develop a novel human-AI-based method for laparoscopic measurements utilizing stereo vision.
Based on a holistic qualitative requirements analysis, this work proposes a comprehensive measurement method.
Our results outline the potential of our method achieving high accuracies in distance measurements with errors below 1 mm.
arXiv Detail & Related papers (2023-11-16T10:19:04Z) - Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose
Estimation of Surgical Instruments [66.74633676595889]
We present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z) - Towards real-time 6D pose estimation of objects in single-view cone-beam
X-ray [6.971105483667455]
6D Object estimation based on deep learning models for X-ray images often use custom geometries that employ extensive CAD models and simulated data for training purposes.
Recent RGB-based methods opt to solve estimation problems using small datasets, making them more attractive for the X-ray domain where medical data is scarcely available.
We refine an existing RGB-based model (SingleShot) to estimate the 6D pose of a marked cube from X-ray images by creating a generic solution trained on only real X-ray data adjusted for X-ray geometry.
arXiv Detail & Related papers (2022-11-06T20:06:28Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose estimation of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - HMD-EgoPose: Head-Mounted Display-Based Egocentric Marker-Less Tool and
Hand Pose Estimation for Augmented Surgical Guidance [0.0]
We present HMD-EgoPose, a single-shot learning-based approach to hand and object pose estimation.
We demonstrate state-of-the-art performance on a benchmark dataset for marker-less hand and surgical instrument pose tracking.
arXiv Detail & Related papers (2022-02-24T04:07:34Z) - A Self-Supervised Deep Framework for Reference Bony Shape Estimation in
Orthognathic Surgical Planning [55.30223654196882]
A virtual orthognathic surgical planning involves simulating surgical corrections of jaw deformities on 3D facial bony shape models.
A reference facial bony shape model representing normal anatomies can provide an objective guidance to improve planning accuracy.
We propose a self-supervised deep framework to automatically estimate reference facial bony shape models.
arXiv Detail & Related papers (2021-09-11T05:24:40Z) - Appearance Learning for Image-based Motion Estimation in Tomography [60.980769164955454]
In tomographic imaging, anatomical structures are reconstructed by applying a pseudo-inverse forward model to acquired signals.
Patient motion corrupts the geometry alignment in the reconstruction process resulting in motion artifacts.
We propose an appearance learning approach recognizing the structures of rigid motion independently from the scanned object.
arXiv Detail & Related papers (2020-06-18T09:49:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.