VR-Caps: A Virtual Environment for Capsule Endoscopy
- URL: http://arxiv.org/abs/2008.12949v2
- Date: Thu, 14 Jan 2021 12:55:11 GMT
- Title: VR-Caps: A Virtual Environment for Capsule Endoscopy
- Authors: Kagan Incetan, Ibrahim Omer Celik, Abdulhamid Obeid, Guliz Irem
Gokceler, Kutsev Bengisu Ozyoruk, Yasin Almalioglu, Richard J. Chen, Faisal
Mahmood, Hunter Gilbert, Nicholas J. Durr, Mehmet Turan
- Abstract summary: Current capsule endoscopes and next-generation robotic capsules for diagnosis and treatment of gastrointestinal diseases are complex cyber-physical platforms.
Data-driven algorithms promise to enable many advanced functionalities for capsule endoscopes, but real-world data is challenging to obtain.
Physically realistic simulations providing synthetic data have emerged as a solution for developing data-driven algorithms.
- Score: 8.499489366784374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current capsule endoscopes and next-generation robotic capsules for diagnosis
and treatment of gastrointestinal diseases are complex cyber-physical platforms
that must orchestrate a wide range of software and hardware functions. The desired
tasks for these systems include visual localization, depth estimation, 3D
mapping, disease detection and segmentation, automated navigation, active
control, path realization, and optional therapeutic modules such as targeted
drug delivery and biopsy sampling. Data-driven algorithms promise to enable
many advanced functionalities for capsule endoscopes, but real-world data is
challenging to obtain. Physically realistic simulations providing synthetic
data have emerged as a solution for developing such data-driven algorithms.
In this work, we present a comprehensive simulation platform for capsule
endoscopy operations and introduce VR-Caps, a virtual active capsule
environment that simulates a range of normal and abnormal tissue conditions
(e.g., inflated, dry, wet) and varied organ types, capsule endoscope
designs (e.g., mono, stereo, dual, and 360° camera), and the type, number,
strength, and placement of internal and external magnetic sources that enable
active locomotion. VR-Caps makes it possible to develop, optimize, and test
medical imaging and analysis software for current and next-generation
endoscopic capsule systems, either independently or jointly. To validate this
approach, we train state-of-the-art deep neural networks to accomplish various
medical image analysis tasks using simulated data from VR-Caps and evaluate the
performance of these models on real medical data. Results demonstrate the
usefulness and effectiveness of the proposed virtual platform in developing
algorithms that quantify fractional coverage, camera trajectory, 3D map
reconstruction, and disease classification.
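As an illustration of this sim-to-real workflow, the following is a minimal sketch of training a disease classifier purely on synthetic VR-Caps frames and then evaluating it on real capsule images. The directory layout, backbone choice, and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import torch
from torch import nn
from torchvision import datasets, models, transforms

# Hypothetical sim-to-real split: synthetic VR-Caps frames for training,
# real capsule endoscopy frames for evaluation. Paths are placeholders.
tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
sim_train = datasets.ImageFolder("vrcaps_synthetic/train", transform=tf)
real_test = datasets.ImageFolder("real_capsule/test", transform=tf)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, len(sim_train.classes))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

train_loader = torch.utils.data.DataLoader(sim_train, batch_size=32, shuffle=True)
for epoch in range(5):                      # train on synthetic data only
    for x, y in train_loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

model.eval()                                # evaluate on real data only
correct = total = 0
with torch.no_grad():
    for x, y in torch.utils.data.DataLoader(real_test, batch_size=32):
        correct += (model(x).argmax(1) == y).sum().item()
        total += y.numel()
print(f"real-data accuracy: {correct / total:.3f}")
```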
Related papers
- A Skull-Adaptive Framework for AI-Based 3D Transcranial Focused Ultrasound Simulation [1.662610796043078]
Transcranial focused ultrasound (tFUS) is an emerging modality for non-invasive brain stimulation and therapeutic intervention.
TFUScapes is the first large-scale, high-resolution dataset of tFUS simulations through anatomically realistic human skulls.
DeepTFUS is a deep learning model that estimates normalized pressure fields directly from input 3D CT volumes and transducer position.
arXiv Detail & Related papers (2025-05-19T11:37:51Z)
- EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance [79.66329903007869]
We present EchoWorld, a motion-aware world modeling framework for probe guidance.
It encodes anatomical knowledge and motion-induced visual dynamics.
It is trained on more than one million ultrasound images from over 200 routine scans.
arXiv Detail & Related papers (2025-04-17T16:19:05Z)
- V$^2$-SfMLearner: Learning Monocular Depth and Ego-motion for Multimodal Wireless Capsule Endoscopy [37.63512910531616]
Deep learning can predict depth maps and capsule ego-motion from capsule endoscopy videos, aiding in 3D scene reconstruction and lesion localization.
Existing solutions focus solely on vision-based processing, neglecting other auxiliary signals like vibrations.
We propose V$^2$-SfMLearner, a multimodal approach that integrates vibration signals into vision-based depth and capsule motion estimation.
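The summary does not spell out the architecture; as a rough sketch of the general multimodal idea only, one can fuse a 1-D vibration encoding with image features before a pose head. Everything below (layer sizes, encoders, heads) is an assumption, not the paper's actual network.

```python
import torch
from torch import nn

class VibrationVisionFusion(nn.Module):
    """Toy fusion of image features and a vibration signal for ego-motion
    regression; a generic sketch, not the V^2-SfMLearner architecture."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.img_enc = nn.Sequential(          # tiny CNN image encoder
            nn.Conv2d(3, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim))
        self.vib_enc = nn.Sequential(          # 1-D encoder for vibration
            nn.Conv1d(1, 16, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(16, feat_dim))
        self.pose_head = nn.Linear(2 * feat_dim, 6)   # 6-DoF ego-motion

    def forward(self, frame, vibration):
        fused = torch.cat([self.img_enc(frame),
                           self.vib_enc(vibration)], dim=1)
        return self.pose_head(fused)

pose = VibrationVisionFusion()(torch.randn(2, 3, 128, 128),
                               torch.randn(2, 1, 128))
```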
arXiv Detail & Related papers (2024-12-23T14:11:30Z)
- SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models [1.28795255913358]
We introduce a fully-fledged surgical simulator that automatically produces all necessary annotations for modern CAS systems.
It offers a more complex and realistic simulation of surgical interactions, including the dynamics between surgical instruments and deformable anatomical environments.
We propose a lightweight and flexible image-to-image translation method based on Stable Diffusion and Low-Rank Adaptation.
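A hedged sketch of such a translation pass using the Hugging Face `diffusers` library follows; the base checkpoint ID, LoRA adapter path, prompt, and strength are placeholders, not SimuScope's released weights or settings.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Translate a simulator frame toward realistic appearance with a
# LoRA-adapted Stable Diffusion img2img pass.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("path/to/surgical-style-lora")  # hypothetical adapter

sim_frame = Image.open("sim_frame.png").convert("RGB").resize((512, 512))
real_like = pipe(prompt="endoscopic surgery frame, realistic tissue texture",
                 image=sim_frame,
                 strength=0.35,        # low strength preserves sim geometry,
                 guidance_scale=6.0,   # so simulator annotations stay valid
                 ).images[0]
real_like.save("sim_frame_translated.png")
```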
arXiv Detail & Related papers (2024-12-03T09:49:43Z)
- fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction [50.534007259536715]
We present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4768 3D objects.
We propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals.
arXiv Detail & Related papers (2024-09-17T16:13:59Z)
- Brain3D: Generating 3D Objects from fMRI [76.41771117405973]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject.
We show that our model captures the distinct functionalities of each region of human vision system.
Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
arXiv Detail & Related papers (2024-05-24T06:06:11Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
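The summary does not say how the optical flow supplies supervision. One plausible signal, sketched here with OpenCV's Farnebäck flow, is to threshold flow magnitude between consecutive frames into a motion pseudo-mask for the moving catheter; the threshold and flow parameters are illustrative guesses, not CathFlow's recipe.

```python
import cv2
import numpy as np

def motion_pseudo_mask(prev_gray, curr_gray, thresh=1.5):
    """Pseudo-label for a moving catheter: pixels whose optical-flow
    magnitude between consecutive frames exceeds a threshold."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5,
                                        poly_sigma=1.2, flags=0)
    magnitude = np.linalg.norm(flow, axis=2)   # (H, W) per-pixel speed
    return (magnitude > thresh).astype(np.uint8)
```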
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Domain adaptation strategies for 3D reconstruction of the lumbar spine using real fluoroscopy data [9.21828361691977]
This study tackles key obstacles in adopting surgical navigation in orthopedic surgeries.
It shows an approach for generating 3D anatomical models of the spine from only a few fluoroscopic images.
It achieved an 84% F1 score, matching the accuracy of our previous synthetic data-based research.
arXiv Detail & Related papers (2024-01-29T10:22:45Z)
- DeepMediX: A Deep Learning-Driven Resource-Efficient Medical Diagnosis Across the Spectrum [15.382184404673389]
This work presents DeepMediX, a groundbreaking, resource-efficient model that significantly addresses this challenge.
Built on top of the MobileNetV2 architecture, DeepMediX excels in classifying brain MRI scans and skin cancer images.
DeepMediX's design also includes the concept of Federated Learning, enabling a collaborative learning approach without compromising data privacy.
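Federated learning of this kind typically follows the FedAvg pattern; below is a minimal sketch of the server-side weighted aggregation step, a generic stand-in rather than DeepMediX's actual code.

```python
import copy
import torch

def fedavg(client_states, client_sizes):
    """Server-side FedAvg: average client model weights, weighted by each
    client's local dataset size. Generic sketch, not DeepMediX's code."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes))
    return global_state

# Each round, clients train locally on private scans and send only weights;
# the server aggregates them, so raw medical images never leave the clients:
#   global_model.load_state_dict(fedavg(states, sizes))
```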
arXiv Detail & Related papers (2023-07-01T12:30:58Z)
- Domain Adaptive Sim-to-Real Segmentation of Oropharyngeal Organs Towards Robot-assisted Intubation [15.795665057836636]
This work introduces a virtual dataset generated by the Open Framework Architecture framework to overcome the limited availability of actual endoscopic images.
We also propose a domain adaptive Sim-to-Real method for oropharyngeal organ image segmentation, which employs an image blending strategy.
Experimental results demonstrate the superior performance of the proposed approach with domain adaptive models.
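The summary names an image blending strategy without details; one common, simple variant is alpha-blending labeled synthetic renderings with real frames so the simulator's segmentation labels still apply. The blend ratio below is an assumption, and this is a simplified stand-in for the paper's unspecified strategy.

```python
import numpy as np

def blend_sim_to_real(sim_rgb, real_rgb, alpha=0.7):
    """Alpha-blend a labeled synthetic frame with an unlabeled real frame.
    Pixel geometry is unchanged, so the synthetic masks remain valid;
    only appearance shifts toward the real domain."""
    sim = sim_rgb.astype(np.float32)
    real = real_rgb.astype(np.float32)
    return (alpha * sim + (1.0 - alpha) * real).clip(0, 255).astype(np.uint8)
```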
arXiv Detail & Related papers (2023-05-19T14:08:15Z)
- Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method combines instrument pose estimation, online registration between the robotic and iOCT systems, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
- Surgical Visual Domain Adaptation: Results from the MICCAI 2020 SurgVisDom Challenge [9.986124942784969]
This work seeks to explore the potential for visual domain adaptation in surgery to overcome data privacy concerns.
In particular, we propose to use video from virtual reality (VR) simulations of surgical exercises to develop algorithms to recognize tasks in a clinical-like setting.
We present the performance of the different approaches to solve visual domain adaptation developed by challenge participants.
arXiv Detail & Related papers (2021-02-26T18:45:28Z)
- Synthesizing Skeletal Motion and Physiological Signals as a Function of a Virtual Human's Actions and Emotions [10.59409233835301]
We develop, for the first time, a system of computational models for synchronously synthesizing skeletal motion, electrocardiogram, blood pressure, respiration, and skin conductance signals.
The proposed framework is modular and allows the flexibility to experiment with different models.
In addition to facilitating ML research for round-the-clock monitoring at a reduced cost, the proposed framework will allow reusability of code and data.
arXiv Detail & Related papers (2021-02-08T21:56:15Z)
- Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z)
- Transferable Active Grasping and Real Embodied Dataset [48.887567134129306]
We show how to search for feasible grasping viewpoints using hand-mounted RGB-D cameras.
A practical 3-stage transferable active grasping pipeline is developed that adapts to unseen clutter scenes.
In our pipeline, we propose a novel mask-guided reward to overcome the sparse reward issue in grasping and ensure category-irrelevant behavior.
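A mask-guided reward of this kind can be read as densifying the sparse grasp-success signal with a per-step mask-overlap term; here is a minimal sketch under that reading, with the IoU shaping and success bonus as illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def mask_guided_reward(gripper_mask, target_mask, success, eps=1e-8):
    """Dense shaping term from gripper/target mask overlap (IoU), plus the
    sparse grasp-success bonus. Illustrative only."""
    inter = np.logical_and(gripper_mask, target_mask).sum()
    union = np.logical_or(gripper_mask, target_mask).sum()
    iou = inter / (union + eps)         # dense, category-agnostic signal
    return iou + (10.0 if success else 0.0)
```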
arXiv Detail & Related papers (2020-04-28T08:15:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.