Depth-Aware Endoscopic Video Inpainting
- URL: http://arxiv.org/abs/2407.02675v1
- Date: Tue, 2 Jul 2024 21:28:36 GMT
- Title: Depth-Aware Endoscopic Video Inpainting
- Authors: Francis Xiatian Zhang, Shuang Chen, Xianghua Xie, Hubert P. H. Shum
- Abstract summary: Video inpainting fills in corrupted video content with plausible replacements.
Recent advances in endoscopic video inpainting have shown potential for enhancing the quality of endoscopic videos.
They mainly repair 2D visual information without preserving crucial 3D spatial details for clinical reference.
We introduce a novel Depth-aware Endoscopic Video Inpainting framework.
- Score: 11.885452717243744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video inpainting fills in corrupted video content with plausible replacements. While recent advances in endoscopic video inpainting have shown potential for enhancing the quality of endoscopic videos, they mainly repair 2D visual information without effectively preserving crucial 3D spatial details for clinical reference. Depth-aware inpainting methods attempt to preserve these details by incorporating depth information, but in endoscopic contexts they face challenges including reliance on pre-acquired depth maps, less effective fusion designs, and neglect of the fidelity of 3D spatial details. To address these issues, we introduce a novel Depth-aware Endoscopic Video Inpainting (DAEVI) framework. It features a Spatial-Temporal Guided Depth Estimation module for direct depth estimation from visual features, a Bi-Modal Paired Channel Fusion module for effective channel-by-channel fusion of visual and depth information, and a Depth Enhanced Discriminator that assesses the fidelity of the RGB-D sequence composed of the inpainted frames and estimated depth images. Experimental evaluations on established benchmarks demonstrate our framework's superiority, achieving a 2% improvement in PSNR and a 6% reduction in MSE compared to state-of-the-art methods. Qualitative analyses further validate its enhanced ability to inpaint fine details, highlighting the benefits of integrating depth information into endoscopic inpainting.
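The abstract names a Bi-Modal Paired Channel Fusion module that fuses visual and depth information channel by channel. A minimal sketch of such a pairing, assuming a grouped 1x1 convolution so that each output channel sees only its paired visual and depth channels, might look as follows (the class name and design are illustrative assumptions, not the authors' code):

```python
# Hedged sketch of channel-by-channel fusion of visual and depth features,
# in the spirit of DAEVI's Bi-Modal Paired Channel Fusion module. The exact
# architecture is not specified in the abstract; this design is an assumption.
import torch
import torch.nn as nn

class PairedChannelFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # One 1x1 conv per modality pair: each fused channel sees only its
        # paired visual channel and depth channel (grouped convolution).
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1, groups=channels)

    def forward(self, visual: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # Interleave channels so channel i of each modality forms one group.
        b, c, h, w = visual.shape
        paired = torch.stack([visual, depth], dim=2).reshape(b, 2 * c, h, w)
        return self.fuse(paired)

feats = torch.randn(1, 64, 32, 32)   # visual features
dfeat = torch.randn(1, 64, 32, 32)   # depth features (same shape)
fused = PairedChannelFusion(64)(feats, dfeat)
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```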
Related papers
- BrightVAE: Luminosity Enhancement in Underexposed Endoscopic Images [6.687072439993227]
Underexposed endoscopic images often suffer from reduced contrast and uneven brightness.
We introduce BrightVAE, an architecture based on the hierarchical Vector Quantized Variational Autoencoder (hierarchical VQ-VAE).
Our architecture is meticulously designed to tackle the unique challenges inherent in endoscopic imaging.
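BrightVAE's core building block is the vector-quantization step of a (hierarchical) VQ-VAE. A generic sketch of that step follows, with codebook size and dimensions chosen arbitrarily since the summary gives no specifics:

```python
# Illustrative vector-quantization step of a VQ-VAE; BrightVAE's actual
# codebook sizes and hierarchy are not given in this summary.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, H*W, dim) encoder output; snap each vector to nearest code.
        codes = self.codebook.weight.unsqueeze(0).expand(z.size(0), -1, -1)
        dists = torch.cdist(z, codes)          # (B, HW, num_codes)
        idx = dists.argmin(dim=-1)
        z_q = self.codebook(idx)
        # Straight-through estimator: copy gradients from z_q back to z.
        return z + (z_q - z).detach()

z = torch.randn(2, 256, 64)
print(VectorQuantizer()(z).shape)  # torch.Size([2, 256, 64])
```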
arXiv Detail & Related papers (2024-11-22T01:41:27Z)
- Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the challenge of exploiting depth information for semantic segmentation.
Our method extracts low-level features from edges and textures to create a texture image, which is then used to enrich the depth map.
By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
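A toy sketch of this pipeline, with Sobel edges standing in for the low-level texture extraction and a simple multiplicative term standing in for the paper's diffusion-based enrichment:

```python
# Toy version of "texture image -> enriched depth -> joint embedding input".
# The operators below are stand-ins; the paper's diffusion is more involved.
import torch
import torch.nn.functional as F

def sobel_edges(gray: torch.Tensor) -> torch.Tensor:
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

rgb = torch.rand(1, 3, 64, 64)
depth = torch.rand(1, 1, 64, 64)
texture = sobel_edges(rgb.mean(dim=1, keepdim=True))  # texture image from edges
enriched_depth = depth * (1.0 + texture)              # toy "enrichment" step
joint = torch.cat([rgb, enriched_depth], dim=1)       # joint feature embedding input
print(joint.shape)  # torch.Size([1, 4, 64, 64])
```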
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
- A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion [10.519644854849098]
We propose a two-stage Transformer-based network for indoor depth completion.
Our proposed network achieves the state-of-the-art performance on the Matterport3D dataset.
In addition, to validate the importance of the depth completion task, we apply our methods to indoor 3D reconstruction.
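A hedged sketch of the two-stage "complete, then refine" control flow; the real network is Transformer/masked-autoencoder based, so the tiny convolutions here illustrate only the structure:

```python
# Two-stage depth completion skeleton: coarse fill from RGB + sparse depth,
# then refinement. Modules are toy stand-ins, not the paper's architecture.
import torch
import torch.nn as nn

class TwoStageDepthCompletion(nn.Module):
    def __init__(self):
        super().__init__()
        # Stage 1: coarse completion from RGB + sparse depth + validity mask.
        self.stage1 = nn.Conv2d(3 + 1 + 1, 1, kernel_size=3, padding=1)
        # Stage 2: refinement of the coarse prediction.
        self.stage2 = nn.Conv2d(3 + 1, 1, kernel_size=3, padding=1)

    def forward(self, rgb, sparse_depth):
        valid = (sparse_depth > 0).float()          # mask of observed pixels
        coarse = self.stage1(torch.cat([rgb, sparse_depth, valid], dim=1))
        refined = self.stage2(torch.cat([rgb, coarse], dim=1))
        # Keep observed measurements, fill only the holes.
        return valid * sparse_depth + (1 - valid) * refined

rgb = torch.rand(1, 3, 64, 64)
sparse = torch.rand(1, 1, 64, 64) * (torch.rand(1, 1, 64, 64) > 0.8)
print(TwoStageDepthCompletion()(rgb, sparse).shape)
```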
arXiv Detail & Related papers (2024-06-14T07:42:27Z)
- Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos [12.497782583094281]
Monocular depth estimation in endoscopy videos can enable assistive and robotic surgery systems to obtain better coverage of the organ and improved detection of various health issues.
Despite promising progress on mainstream natural-image depth estimation, such techniques perform poorly on endoscopy images.
In this paper, we utilize the photometric cues, i.e., the light emitted from an endoscope and reflected by the surface, to improve monocular depth estimation.
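The photometric cue rests on near-field light falloff: with a point light co-located with the camera, observed intensity decays roughly as 1/d^2. A toy illustration of reading relative depth off brightness under a Lambertian, fronto-parallel assumption (the intuition only, not the paper's method):

```python
# Near-field photometric cue: I = light_power / d^2  =>  d = sqrt(light_power / I).
# Assumes a Lambertian, fronto-parallel surface and a co-located point light.
import torch

def depth_from_intensity(intensity: torch.Tensor, light_power: float = 1.0) -> torch.Tensor:
    return torch.sqrt(light_power / intensity.clamp(min=1e-6))

intensity = torch.tensor([1.0, 0.25, 0.04])  # brighter pixels are closer
print(depth_from_intensity(intensity))       # tensor([1., 2., 5.])
```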
arXiv Detail & Related papers (2024-03-26T17:52:23Z)
- MonoDVPS: A Self-Supervised Monocular Depth Estimation Approach to Depth-aware Video Panoptic Segmentation [3.2489082010225494]
We propose a novel solution with a multi-task network that performs monocular depth estimation and video panoptic segmentation.
We introduce panoptic-guided depth losses and a novel panoptic masking scheme for moving objects to avoid corrupting the training signal.
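The masking scheme can be illustrated with a photometric loss that simply zeroes out pixels on potentially moving instances; the loss form below is a common self-supervised baseline, not necessarily the paper's exact formulation:

```python
# Exclude pixels on (potentially) moving instances from the self-supervised
# photometric loss so they do not corrupt the training signal.
import torch

def masked_photometric_loss(target, reprojected, moving_mask):
    # moving_mask: 1 where a pixel belongs to a moving object, else 0.
    per_pixel = (target - reprojected).abs().mean(dim=1, keepdim=True)
    static = 1.0 - moving_mask
    return (per_pixel * static).sum() / static.sum().clamp(min=1.0)

target = torch.rand(1, 3, 32, 32)
reproj = torch.rand(1, 3, 32, 32)
mask = (torch.rand(1, 1, 32, 32) > 0.9).float()  # from panoptic segmentation
print(masked_photometric_loss(target, reproj, mask))
```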
arXiv Detail & Related papers (2022-10-14T07:00:42Z)
- 3D endoscopic depth estimation using 3D surface-aware constraints [16.161276518580262]
We show that depth estimation can be reformulated from a 3D surface perspective.
We propose a loss function for depth estimation that integrates the surface-aware constraints.
Camera parameters are incorporated into the training pipeline to increase the control and transparency of the depth estimation.
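One way to make a loss "surface-aware" is to derive per-pixel normals from depth gradients and penalize their local variation; the toy term below omits the camera intrinsics the paper incorporates:

```python
# Illustrative surface-aware term: normals from depth gradients, then a
# penalty on normal variation to encourage locally smooth 3D surfaces.
import torch
import torch.nn.functional as F

def normals_from_depth(depth: torch.Tensor) -> torch.Tensor:
    dzdx = depth[:, :, :, 1:] - depth[:, :, :, :-1]   # horizontal gradient
    dzdy = depth[:, :, 1:, :] - depth[:, :, :-1, :]   # vertical gradient
    dzdx = F.pad(dzdx, (0, 1, 0, 0))
    dzdy = F.pad(dzdy, (0, 0, 0, 1))
    n = torch.cat([-dzdx, -dzdy, torch.ones_like(depth)], dim=1)
    return F.normalize(n, dim=1)

depth = torch.rand(1, 1, 32, 32)
normals = normals_from_depth(depth)                   # (1, 3, 32, 32)
surface_loss = (normals[:, :, :, 1:] - normals[:, :, :, :-1]).abs().mean()
print(normals.shape, surface_loss.item())
```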
arXiv Detail & Related papers (2022-03-04T04:47:20Z)
- Adversarial Domain Feature Adaptation for Bronchoscopic Depth Estimation [111.89519571205778]
In this work, we propose an alternative domain-adaptive approach to depth estimation.
Our novel two-step structure first trains a depth estimation network with labeled synthetic images in a supervised manner.
The results of our experiments show that the proposed method improves the network's performance on real images by a considerable margin.
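A compact sketch of the two-step recipe: supervised training on synthetic labels, followed by adversarial feature alignment with a domain discriminator. All modules are toy stand-ins sized for illustration:

```python
# Step 1: supervise depth on labeled synthetic images.
# Step 2: adversarially align real-image features with a domain discriminator.
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten())   # feature extractor
head = nn.Linear(16, 1)                                       # toy depth head
disc = nn.Linear(16, 1)                                       # domain discriminator
bce = nn.BCEWithLogitsLoss()

syn, real = torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32)

# Step 1: supervised loss on synthetic data (labels available).
sup_loss = (head(feat(syn)) - torch.rand(4, 1)).abs().mean()

# Step 2: discriminator learns synthetic vs. real; the feature extractor is
# then trained to fool it (alternating updates in a full implementation).
d_loss = bce(disc(feat(syn).detach()), torch.ones(4, 1)) + \
         bce(disc(feat(real).detach()), torch.zeros(4, 1))
adv_loss = bce(disc(feat(real)), torch.ones(4, 1))  # fool the discriminator
print(sup_loss.item(), d_loss.item(), adv_loss.item())
```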
arXiv Detail & Related papers (2021-09-24T08:11:34Z)
- Deep Direct Volume Rendering: Learning Visual Feature Mappings From Exemplary Images [57.253447453301796]
We introduce Deep Direct Volume Rendering (DeepDVR), a generalization of Direct Volume Rendering (DVR) that allows for the integration of deep neural networks into the DVR algorithm.
We conceptualize the rendering in a latent color space, thus enabling the use of deep architectures to learn implicit mappings for feature extraction and classification.
Our generalization serves to derive novel volume rendering architectures that can be trained end-to-end directly from examples in image space.
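The DVR core that DeepDVR generalizes is emission-absorption compositing along each viewing ray. A minimal front-to-back version with a fixed toy transfer function (DeepDVR replaces this per-sample mapping with learned networks in a latent color space):

```python
# Front-to-back emission-absorption compositing along one ray.
import torch

def composite(colors: torch.Tensor, alphas: torch.Tensor) -> torch.Tensor:
    # colors: (S, 3), alphas: (S,), samples ordered front to back.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1 - alphas[:-1]]), dim=0)
    weights = trans * alphas                   # contribution of each sample
    return (weights.unsqueeze(-1) * colors).sum(dim=0)

density = torch.rand(16)                       # samples along a ray
colors = torch.rand(16, 3)
alphas = 1 - torch.exp(-density * 0.1)         # toy opacity from density
print(composite(colors, alphas))               # final pixel color (3,)
```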
arXiv Detail & Related papers (2021-06-09T23:03:00Z)
- Progressive Depth Learning for Single Image Dehazing [56.71963910162241]
Existing dehazing methods often ignore depth cues and fail in distant areas where heavier haze disturbs visibility.
We propose a deep end-to-end model that iteratively estimates image depths and transmission maps.
Our approach benefits from explicitly modeling the inner relationship of image depth and transmission map, which is especially effective for distant hazy areas.
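The inner relationship being modeled is the standard atmospheric scattering equation, where transmission t = exp(-beta * d) ties depth to haze and the radiance J is recovered from I = J*t + A*(1 - t). A toy version with assumed airlight and scattering values:

```python
# Atmospheric scattering model: given depth, invert I = J*t + A*(1-t).
# Airlight and beta below are illustrative assumptions.
import torch

def dehaze(hazy: torch.Tensor, depth: torch.Tensor, airlight: float = 0.9,
           beta: float = 1.0) -> torch.Tensor:
    t = torch.exp(-beta * depth).clamp(min=0.05)   # transmission map
    return (hazy - airlight * (1 - t)) / t         # recovered radiance J

hazy = torch.rand(1, 3, 32, 32)
depth = torch.rand(1, 1, 32, 32) * 3               # distant pixels: larger d
print(dehaze(hazy, depth).shape)                   # torch.Size([1, 3, 32, 32])
```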
arXiv Detail & Related papers (2021-02-21T05:24:18Z)
- Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy, incomplete target depth maps, we reconstruct a restored depth map by using the CNN structure itself as a prior.
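A minimal deep-image-prior loop for depth: a randomly initialized CNN is fit to the noisy depth only at observed pixels, so the network structure itself regularizes the holes. The tiny architecture and hyperparameters are placeholders; the paper additionally applies view constraints:

```python
# Deep image prior applied to depth completion: optimize a random-init CNN
# against observed depth pixels only; unobserved regions are filled by the
# prior that the network structure induces.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1))
z = torch.randn(1, 8, 64, 64)                    # fixed random input
target = torch.rand(1, 1, 64, 64)                # noisy, incomplete depth
valid = (torch.rand(1, 1, 64, 64) > 0.5).float() # observed-pixel mask
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    opt.zero_grad()
    loss = ((net(z) - target) ** 2 * valid).sum() / valid.sum()
    loss.backward()
    opt.step()

completed = net(z).detach()                      # depth filled in everywhere
```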
arXiv Detail & Related papers (2020-01-21T21:56:01Z)
- Video Depth Estimation by Fusing Flow-to-Depth Proposals [65.24533384679657]
We present an approach with a differentiable flow-to-depth layer for video depth estimation.
The model consists of a flow-to-depth layer, a camera pose refinement module, and a depth fusion network.
Our approach outperforms state-of-the-art depth estimation methods, and has reasonable cross dataset generalization capability.
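For intuition, in the special case of purely translational (stereo-like) camera motion the flow-to-depth conversion reduces to depth = focal * baseline / disparity; the paper's differentiable layer handles general camera poses by triangulation:

```python
# Stereo-like special case of flow-to-depth: horizontal flow is disparity.
import torch

def flow_to_depth(flow_x: torch.Tensor, focal: float, baseline: float) -> torch.Tensor:
    disparity = flow_x.abs().clamp(min=1e-3)
    return focal * baseline / disparity

flow_x = torch.tensor([[10.0, 5.0], [2.0, 1.0]])  # pixels of horizontal flow
print(flow_to_depth(flow_x, focal=500.0, baseline=0.1))
# larger flow -> smaller depth (closer surface)
```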
arXiv Detail & Related papers (2019-12-30T10:45:57Z)