MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy
- URL: http://arxiv.org/abs/2404.02999v1
- Date: Wed, 3 Apr 2024 18:40:48 GMT
- Title: MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy
- Authors: John J. Han, Ayberk Acar, Nicholas Kavoussi, Jie Ying Wu
- Abstract summary: Style transfer is a promising approach to close the sim-to-real gap in medical endoscopy.
Rendering realistic endoscopic videos by traversing pre-operative scans can generate realistic simulations as well as ground truth camera poses and depth maps.
We propose MeshBrush, a neural mesh stylization method to synthesize temporally consistent videos with differentiable rendering.
- Score: 0.8437187555622164
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Style transfer is a promising approach to close the sim-to-real gap in medical endoscopy. Rendering realistic endoscopic videos by traversing pre-operative scans (such as MRI or CT) can generate realistic simulations as well as ground truth camera poses and depth maps. Although image-to-image (I2I) translation models such as CycleGAN perform well, they are unsuitable for video-to-video synthesis due to the lack of temporal consistency, resulting in artifacts between frames. We propose MeshBrush, a neural mesh stylization method to synthesize temporally consistent videos with differentiable rendering. MeshBrush uses the underlying geometry of patient imaging data while leveraging existing I2I methods. With learned per-vertex textures, the stylized mesh guarantees consistency while producing high-fidelity outputs. We demonstrate that mesh stylization is a promising approach for creating realistic simulations for downstream tasks such as training and preoperative planning. Although our method is tested and designed for ureteroscopy, its components are transferable to general endoscopic and laparoscopic procedures.
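The core idea of learning per-vertex textures with a differentiable renderer can be illustrated with a toy, pure-Python sketch (hypothetical, not the authors' implementation): a stand-in "renderer" maps per-vertex colors to pixel colors through fixed barycentric weights, and gradient descent fits the vertex colors so the rendered pixels match target (stylized) pixel values. In MeshBrush the renderer is a differentiable rasterizer and the targets come from an I2I network; both are simplified here.

```python
# Toy per-vertex texture optimization (illustrative sketch only).
# Each "pixel" records the triangle vertices it covers and their
# barycentric weights; rendering is a weighted blend of vertex colors.

def render(vertex_colors, pixels):
    """Each pixel = barycentric blend of its triangle's three vertex colors."""
    out = []
    for (i, j, k), (wi, wj, wk) in pixels:
        out.append(wi * vertex_colors[i] + wj * vertex_colors[j] + wk * vertex_colors[k])
    return out

def fit_vertex_colors(pixels, targets, n_vertices, lr=0.5, steps=200):
    """Gradient descent on sum-of-squares pixel error w.r.t. vertex colors."""
    colors = [0.0] * n_vertices
    for _ in range(steps):
        preds = render(colors, pixels)
        grads = [0.0] * n_vertices
        for ((i, j, k), (wi, wj, wk)), p, t in zip(pixels, preds, targets):
            err = 2.0 * (p - t)  # d(loss)/d(pixel) for loss = sum (p - t)^2
            grads[i] += err * wi
            grads[j] += err * wj
            grads[k] += err * wk
        colors = [c - lr * g for c, g in zip(colors, grads)]
    return colors

# Two pixels covering one triangle (vertices 0, 1, 2):
pixels = [((0, 1, 2), (0.6, 0.3, 0.1)),
          ((0, 1, 2), (0.2, 0.2, 0.6))]
targets = [0.5, 0.8]
colors = fit_vertex_colors(pixels, targets, n_vertices=3)
print([round(p, 3) for p in render(colors, pixels)])  # -> [0.5, 0.8]
```

Because the texture lives on the mesh rather than in image space, any camera trajectory re-renders the same vertex colors, which is what makes the stylization temporally consistent.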
Related papers
- EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting [39.60431471170721]
3D reconstruction of biological tissues from a collection of endoscopic images is key to unlocking important downstream surgical applications that require 3D capabilities.
Existing methods employ various advanced neural rendering techniques for view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available.
We propose a framework leveraging the prior knowledge from multiple foundation models during the reconstruction process, dubbed EndoSparse.
arXiv Detail & Related papers (2024-07-01T07:24:09Z)
- Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians [19.590481146949685]
We introduce 3D Gaussians as a learnable representation of the surgical scene, learned from stereo endoscopic video.
We apply the Material Point Method, which is integrated with physical properties, to the 3D Gaussians to achieve realistic scene deformations.
Results show that it can reconstruct and simulate surgical scenes from endoscopic videos efficiently, taking only a few minutes to reconstruct the surgical scene.
arXiv Detail & Related papers (2024-05-02T02:34:19Z)
- Interactive Generation of Laparoscopic Videos with Diffusion Models [1.5488613349551188]
We show how to generate realistic laparoscopic images and videos by specifying a surgical action through text.
We demonstrate the performance of our approach using the publicly available Cholec dataset family.
We achieve an FID of 38.097 and an F1-score of 0.71.
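The FID metric reported above compares feature distributions of real and generated images via the Fréchet distance between fitted Gaussians. For 1-D features the distance has a simple closed form, sketched below in pure Python (a toy illustration; real FID uses multivariate Inception-network features and full covariance matrices):

```python
# Fréchet distance between two 1-D Gaussians fit to sample features:
# (m1 - m2)^2 + v1 + v2 - 2 * sqrt(v1 * v2)
import math
import statistics

def fid_1d(feats_a, feats_b):
    """Toy 1-D Fréchet distance between Gaussians fit to two sample sets."""
    m1, m2 = statistics.fmean(feats_a), statistics.fmean(feats_b)
    v1, v2 = statistics.pvariance(feats_a), statistics.pvariance(feats_b)
    return (m1 - m2) ** 2 + v1 + v2 - 2.0 * math.sqrt(v1 * v2)

print(fid_1d([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))  # identical samples -> 0.0
print(fid_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))  # shifted by 1 -> 1.0
```

Lower FID means the generated distribution sits closer to the real one; identical distributions give 0.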
arXiv Detail & Related papers (2024-04-23T12:36:07Z)
- Endora: Video Generation Models as Endoscopy Simulators [53.72175969751398]
This paper introduces Endora, an innovative approach to generate medical videos that simulate clinical endoscopy scenes.
We also pioneer the first public benchmark for endoscopy simulation with video generation models.
Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research.
arXiv Detail & Related papers (2024-03-17T00:51:59Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art tracking performance on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Realistic Endoscopic Image Generation Method Using Virtual-to-real Image-domain Translation [1.1580916951856253]
We propose a realistic image generation method for endoscopic simulation systems.
Virtual endoscopic images are generated by using a volume rendering method from a CT volume of a patient.
We improve the reality of the virtual endoscopic images using a virtual-to-real image-domain translation technique.
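The first stage of such a pipeline, rendering a virtual view from a CT volume, can be sketched with a toy maximum-intensity projection that casts axis-aligned rays through a voxel grid (an illustrative stand-in; real systems use perspective ray casting with transfer functions):

```python
# Toy maximum-intensity projection (MIP): each output pixel is the
# brightest voxel along one axis-aligned ray through the volume.

def mip_along_z(volume):
    """volume[z][y][x] -> 2-D image of per-ray maxima (one ray per (y, x))."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth))
             for x in range(width)]
            for y in range(height)]

# 2x2x2 toy CT volume with one bright voxel at (z=1, y=0, x=1):
vol = [[[0, 0], [0, 0]],
       [[0, 9], [1, 0]]]
image = mip_along_z(vol)
print(image)  # -> [[0, 9], [1, 0]]
```

The second stage, virtual-to-real translation, then maps such rendered images into the realistic endoscopic domain.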
arXiv Detail & Related papers (2022-01-13T12:18:51Z)
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation of the propagation distance, position errors, and partial coherence frequently threatens experiment viability.
A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
- Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data [0.059110875077162096]
We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes.
By introducing global learnable textures and a lighting-invariant view-consistency loss, our method produces consistent translations of arbitrary views.
By extending existing image-based methods to view-consistent videos, we aim to broaden the applicability of simulated training and evaluation environments for surgical applications.
arXiv Detail & Related papers (2021-03-31T16:31:26Z) - Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image
Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.