Related papers: RealSynCol: a high-fidelity synthetic colon dataset for 3D reconstruction applications

RealSynCol: a high-fidelity synthetic colon dataset for 3D reconstruction applications

URL: http://arxiv.org/abs/2602.08397v1
Date: Mon, 09 Feb 2026 08:57:37 GMT
Title: RealSynCol: a high-fidelity synthetic colon dataset for 3D reconstruction applications
Authors: Chiara Lena, Davide Milesi, Alessandro Casella, Luca Carlini, Joseph C. Norton, James Martin, Bruno Scaglioni, Keith L. Obstein, Roberto De Sire, Marco Spadaccini, Cesare Hassan, Pietro Valdastri, Elena De Momi,
Abstract summary: We propose RealSynCol, a highly realistic synthetic dataset designed to replicate the endoscopic environment.<n>The resulting dataset comprises 28,130 frames, paired with ground truth depth maps, optical flow, 3D meshes, and camera trajectories.<n>Results demonstrate that the high realism and variability of RealSynCol significantly enhance generalization performance on clinical images.
Score: 33.26682919703966
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Deep learning has the potential to improve colonoscopy by enabling 3D reconstruction of the colon, providing a comprehensive view of mucosal surfaces and lesions, and facilitating the identification of unexplored areas. However, the development of robust methods is limited by the scarcity of large-scale ground truth data. We propose RealSynCol, a highly realistic synthetic dataset designed to replicate the endoscopic environment. Colon geometries extracted from 10 CT scans were imported into a virtual environment that closely mimics intraoperative conditions and rendered with realistic vascular textures. The resulting dataset comprises 28\,130 frames, paired with ground truth depth maps, optical flow, 3D meshes, and camera trajectories. A benchmark study was conducted to evaluate the available synthetic colon datasets for the tasks of depth and pose estimation. Results demonstrate that the high realism and variability of RealSynCol significantly enhance generalization performance on clinical images, proving it to be a powerful tool for developing deep learning algorithms to support endoscopic diagnosis.

Related papers

Zero-shot Monocular Metric Depth for Endoscopic Images [9.205799953828896]
We present a benchmark of state-of-the-art (metric and relative) depth estimation models evaluated on real, unseen endoscopic images.<n>We publish a novel synthetic dataset (Endo Synth) of endoscopic surgical instruments paired with ground truth metric depth and segmentation masks.<n>We demonstrate that fine-tuning depth foundation models using our synthetic dataset boosts accuracy on most unseen real data by a significant margin.
arXiv Detail & Related papers (2025-09-23T04:56:25Z)
Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey [171.72616707259306]
3D reconstruction and view synthesis are foundational problems in computer vision, graphics, and immersive technologies such as augmented reality (AR), virtual reality (VR), and digital twins.<n>Recent advances in feed-forward approaches, driven by deep learning, have revolutionized this field by enabling fast and generalizable 3D reconstruction and view synthesis.
arXiv Detail & Related papers (2025-07-19T06:13:25Z)
R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation [78.26308457952636]
This paper introduces R3D2, a lightweight, one-step diffusion model designed to overcome limitations in autonomous driving simulation.<n>It enables realistic insertion of complete 3D assets into existing scenes by generating plausible rendering effects-such as shadows and consistent lighting-in real time.<n>We show that R3D2 significantly enhances the realism of inserted assets, enabling use-cases like text-to-3D asset insertion and cross-scene/dataset object transfer.
arXiv Detail & Related papers (2025-06-09T14:50:19Z)
Structure-preserving Image Translation for Depth Estimation in Colonoscopy Video [1.0485739694839669]
We propose a pipeline of structure-preserving synthetic-to-real (sim2real) image translation. This allows us to generate large quantities of realistic-looking synthetic images for supervised depth estimation. We also propose a dataset of hand-picked sequences from clinical colonoscopies to improve the image translation process.
arXiv Detail & Related papers (2024-08-19T17:02:16Z)
EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting [39.60431471170721]
3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available. We propose a framework leveraging the prior knowledge from multiple foundation models during the reconstruction process, dubbed as textitEndoSparse.
arXiv Detail & Related papers (2024-07-01T07:24:09Z)
Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns. A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
Domain adaptation strategies for 3D reconstruction of the lumbar spine using real fluoroscopy data [9.21828361691977]
This study tackles key obstacles in adopting surgical navigation in orthopedic surgeries. It shows an approach for generating 3D anatomical models of the spine from only a few fluoroscopic images. It achieved an 84% F1 score, matching the accuracy of our previous synthetic data-based research.
arXiv Detail & Related papers (2024-01-29T10:22:45Z)
A geometry-aware deep network for depth estimation in monocular endoscopy [17.425158094539462]
The proposed method is extensively validated across different datasets and clinical images. The generalizability of the proposed method achieves mean RMSE values of 12.604 (T1-L1), 9.930 (T2-L2), and 13.893 (colon) on the ColonDepth dataset.
arXiv Detail & Related papers (2023-04-20T11:59:32Z)
Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics. Recent neural implicit modeling methods show promising results on synthetic or dense datasets. But, they perform poorly on real-world data that is sparse and noisy. This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties. Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
Transferable Active Grasping and Real Embodied Dataset [48.887567134129306]
We show how to search for feasible viewpoints for grasping by the use of hand-mounted RGB-D cameras. A practical 3-stage transferable active grasping pipeline is developed, that is adaptive to unseen clutter scenes. In our pipeline, we propose a novel mask-guided reward to overcome the sparse reward issue in grasping and ensure category-irrelevant behavior.
arXiv Detail & Related papers (2020-04-28T08:15:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.