Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration
- URL: http://arxiv.org/abs/2206.08903v3
- Date: Tue, 5 Sep 2023 17:51:32 GMT
- Title: Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration
- Authors: Taylor L. Bobrow, Mayank Golhar, Rohan Vijayan, Venkata S. Akshintala,
Juan R. Garcia, and Nicholas J. Durr
- Abstract summary: We present a Colonoscopy 3D Video dataset (C3VD) for benchmarking computer vision methods in colonoscopy.
We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model.
- Score: 1.1774995069145182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Screening colonoscopy is an important clinical application for several 3D
computer vision techniques, including depth estimation, surface reconstruction,
and missing region detection. However, the development, evaluation, and
comparison of these techniques in real colonoscopy videos remain largely
qualitative due to the difficulty of acquiring ground truth data. In this work,
we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high
definition clinical colonoscope and high-fidelity colon models for benchmarking
computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D
registration technique to register optical video sequences with ground truth
rendered views of a known 3D model. The different modalities are registered by
transforming optical images to depth maps with a Generative Adversarial Network
and aligning edge features with an evolutionary optimizer. This registration
method achieves an average translation error of 0.321 millimeters and an
average rotation error of 0.159 degrees in simulation experiments where
error-free ground truth is available. The method also leverages video
information, improving registration accuracy by 55.6% for translation and 60.4%
for rotation compared to single frame registration. 22 short video sequences
were registered to generate 10,015 total frames with paired ground truth depth,
surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage
maps, and 3D models. The dataset also includes screening videos acquired by a
gastroenterologist with paired ground truth pose and 3D surface models. The
dataset and registration source code are available at durr.jhu.edu/C3VD.
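The registration idea sketched in the abstract (optical frames translated to depth maps, then aligned to rendered ground truth by matching edge features with an evolutionary optimizer) can be illustrated with a toy example. The GAN depth-translation stage is omitted here; a synthetic edge map stands in for depth-derived edges, and SciPy's differential evolution stands in for the paper's optimizer. All function names and data below are illustrative assumptions, not code from the C3VD release.

```python
import numpy as np
from scipy.optimize import differential_evolution

def make_edge_map(shift=(0.0, 0.0), angle=0.0, size=64):
    """Render a soft, slightly elliptical edge ring, moved by (shift, angle).

    Stands in for an edge map extracted from a depth image; the slight
    ellipticity makes the rotation parameter observable.
    """
    yy, xx = np.mgrid[0:size, 0:size].astype(float)
    cy, cx = size / 2 + shift[0], size / 2 + shift[1]
    y, x = yy - cy, xx - cx
    ca, sa = np.cos(angle), np.sin(angle)
    yr, xr = ca * y - sa * x, sa * y + ca * x
    r = np.hypot(yr, xr + 0.3 * yr)          # skew term breaks circular symmetry
    return np.exp(-((r - size / 4) ** 2) / 4.0)  # Gaussian-blurred edge ring

# "Ground truth rendered view": an edge map at an unknown pose.
target = make_edge_map(shift=(3.0, -2.0), angle=0.2)

def cost(params):
    """Negative edge correlation between the target and a re-rendered candidate."""
    dy, dx, theta = params
    candidate = make_edge_map(shift=(dy, dx), angle=theta)
    return -float(np.sum(candidate * target))

# Evolutionary search over a 2D translation + in-plane rotation.
result = differential_evolution(
    cost,
    bounds=[(-8, 8), (-8, 8), (-0.5, 0.5)],
    seed=0,
)
dy, dx, theta = result.x
print(f"recovered shift=({dy:.2f}, {dx:.2f}), angle={theta:.3f}")
```

The paper's actual problem is 6-DoF alignment of real colonoscope frames against a known 3D colon model, and it additionally exploits temporal consistency across video frames; this sketch only shows the core pattern of an evolutionary optimizer minimizing an edge-mismatch cost.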
Related papers
- C3VDv2 -- Colonoscopy 3D video dataset with enhanced realism [1.1774995069145182]
This paper introduces C3VDv2, the second version (v2) of the high-definition Colonoscopy 3D Video dataset. 192 video sequences were captured by imaging 60 unique, high-fidelity silicone colon phantom segments. Eight simulated screening colonoscopy videos acquired by a gastroenterologist are provided with ground truth poses. The dataset includes 15 videos featuring colon deformations for qualitative assessment.
arXiv Detail & Related papers (2025-06-30T17:29:06Z)
- Acquiring Submillimeter-Accurate Multi-Task Vision Datasets for Computer-Assisted Orthopedic Surgery [0.9268994664916388]
We generate realistic and accurate ex vivo datasets tailored for 3D reconstruction and feature matching in open orthopedic surgery.
A mean 3D Euclidean error of 0.35 mm is achieved with respect to the 3D ground truth.
This opens the door to acquiring future surgical datasets for high-precision applications.
arXiv Detail & Related papers (2025-01-26T02:52:46Z)
- UVRM: A Scalable 3D Reconstruction Model from Unposed Videos [68.34221167200259]
Training 3D reconstruction models with 2D visual data traditionally requires prior knowledge of camera poses for the training samples.
We introduce UVRM, a novel 3D reconstruction model capable of being trained and evaluated on monocular videos without requiring any information about the pose.
arXiv Detail & Related papers (2025-01-16T08:00:17Z)
- MedTet: An Online Motion Model for 4D Heart Reconstruction [59.74234226055964]
We present a novel approach to reconstruction of 3D cardiac motion from sparse intraoperative data.
Existing methods can accurately reconstruct 3D organ geometries from full 3D volumetric imaging.
We propose a versatile framework for reconstructing 3D motion from such partial data.
arXiv Detail & Related papers (2024-12-03T17:18:33Z)
- Rigid Single-Slice-in-Volume registration via rotation-equivariant 2D/3D feature matching [3.041742847777409]
We propose a self-supervised 2D/3D registration approach to match a single 2D slice to the corresponding 3D volume.
Results demonstrate the robustness of the proposed slice-in-volume registration on the NSCLC-Radiomics CT and KIRBY21 MRI datasets.
arXiv Detail & Related papers (2024-10-24T12:24:27Z)
- SALVE: A 3D Reconstruction Benchmark of Wounds from Consumer-grade Videos [20.69257610322339]
This paper presents a study on 3D wound reconstruction from consumer-grade videos.
We introduce the SALVE dataset, comprising video recordings of realistic wound phantoms captured with different cameras.
We assess the accuracy and precision of state-of-the-art methods for 3D reconstruction, ranging from traditional photogrammetry pipelines to advanced neural rendering approaches.
arXiv Detail & Related papers (2024-07-29T02:34:51Z)
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, that can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z)
- Generative Enhancement for 3D Medical Images [74.17066529847546]
We propose GEM-3D, a novel generative approach to the synthesis of 3D medical images.
Our method begins with a 2D slice, noted as the informed slice to serve the patient prior, and propagates the generation process using a 3D segmentation mask.
By decomposing the 3D medical images into masks and patient prior information, GEM-3D offers a flexible yet effective solution for generating versatile 3D images.
arXiv Detail & Related papers (2024-03-19T15:57:04Z)
- Domain adaptation strategies for 3D reconstruction of the lumbar spine using real fluoroscopy data [9.21828361691977]
This study tackles key obstacles in adopting surgical navigation in orthopedic surgeries.
It shows an approach for generating 3D anatomical models of the spine from only a few fluoroscopic images.
It achieved an 84% F1 score, matching the accuracy of our previous synthetic data-based research.
arXiv Detail & Related papers (2024-01-29T10:22:45Z)
- On the Localization of Ultrasound Image Slices within Point Distribution Models [84.27083443424408]
Thyroid disorders are most commonly diagnosed using high-resolution Ultrasound (US).
Longitudinal tracking is a pivotal diagnostic protocol for monitoring changes in pathological thyroid morphology.
We present a framework for automated US image slice localization within a 3D shape representation.
arXiv Detail & Related papers (2023-09-01T10:10:46Z)
- Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging.
Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image.
This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses.
We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z)
- Oral-3Dv2: 3D Oral Reconstruction from Panoramic X-Ray Imaging with Implicit Neural Representation [3.8215162658168524]
Oral-3Dv2 is a non-adversarial-learning-based model in 3D radiology reconstruction from a single panoramic X-ray image.
Our model learns to represent the 3D oral structure in an implicit way by mapping 2D coordinates into density values of voxels in the 3D space.
To the best of our knowledge, this is the first work of a non-adversarial-learning-based model in 3D radiology reconstruction from a single panoramic X-ray image.
arXiv Detail & Related papers (2023-03-21T18:17:27Z)
- CNN-based real-time 2D-3D deformable registration from a single X-ray projection [2.1198879079315573]
This paper presents a method for real-time 2D-3D non-rigid registration using a single fluoroscopic image.
A dataset composed of displacement fields and 2D projections of the anatomy is generated from a preoperative scan.
A neural network is trained to recover the unknown 3D displacement field from a single projection image.
arXiv Detail & Related papers (2022-12-15T09:57:19Z)
- Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification.
We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.