Related papers: NanoNet: Real-Time Polyp Segmentation in Video Capsule Endoscopy and Colonoscopy

URL: http://arxiv.org/abs/2104.11138v1
Date: Thu, 22 Apr 2021 15:40:28 GMT
Title: NanoNet: Real-Time Polyp Segmentation in Video Capsule Endoscopy and Colonoscopy
Authors: Debesh Jha, Nikhil Kumar Tomar, Sharib Ali, Michael A. Riegler, Håvard D. Johansen, Dag Johansen, Thomas de Lange, Pål Halvorsen
Abstract summary: We propose NanoNet, a novel architecture for the segmentation of video capsule endoscopy and colonoscopy images. Our proposed architecture allows real-time performance and has higher segmentation accuracy compared to other more complex ones.
Score: 0.6125117548653111
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning in gastrointestinal endoscopy can help improve clinical performance and assess lesions more accurately. To this end, semantic segmentation methods that can perform automated real-time delineation of a region-of-interest, e.g., boundary identification of cancer or precancerous lesions, can benefit both diagnosis and interventions. However, accurate and real-time segmentation of endoscopic images is extremely challenging due to its high operator dependence and high-definition image quality. To utilize automated methods in clinical settings, it is crucial to design lightweight models with low latency such that they can be integrated with low-end endoscope hardware devices. In this work, we propose NanoNet, a novel architecture for the segmentation of video capsule endoscopy and colonoscopy images. Our proposed architecture allows real-time performance and has higher segmentation accuracy compared to other more complex ones. We use video capsule endoscopy and standard colonoscopy datasets with polyps, and a dataset consisting of endoscopy biopsies and surgical instruments, to evaluate the effectiveness of our approach. Our experiments demonstrate the increased performance of our architecture in terms of a trade-off between model complexity, speed, model parameters, and metric performances. Moreover, the resulting model size is relatively tiny, with only about 36,000 parameters, compared to traditional deep learning approaches having millions of parameters.

Related papers

Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance [50.486523249499115]
Real-time video understanding is critical to guide procedures in minimally invasive surgery (MIS). We propose Compress-to-Explore (C2E), a novel self-supervised framework to learn compact, informative representations from surgical videos. C2E uses entropy-maximizing decoders to compress images while preserving clinically relevant details, improving encoder performance without labeled data.
arXiv Detail & Related papers (2025-05-16T14:02:24Z)
A Temporal Convolutional Network-Based Approach and a Benchmark Dataset for Colonoscopy Video Temporal Segmentation [3.146247125118741]
ColonTCN is a learning-based architecture that employs custom temporal convolutional blocks to efficiently capture temporal dependencies for the temporal segmentation of colonoscopy videos. ColonTCN achieves state-of-the-art performance in classification accuracy while maintaining a low parameter count when evaluated. We believe that the proposed open-access benchmark and the ColonTCN approach represent a significant advancement in the temporal segmentation of colonoscopy procedures.
arXiv Detail & Related papers (2025-02-05T18:21:56Z)
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions. Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600$\times$ faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting [53.38166294158047]
EndoGSLAM is an efficient approach for endoscopic surgeries, which integrates streamlined representation and differentiable Gaussianization. Experiments show that EndoGSLAM achieves a better trade-off between intraoperative availability and reconstruction quality than traditional or neural SLAM approaches.
arXiv Detail & Related papers (2024-03-22T11:27:43Z)
CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images. The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism. We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training. We address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue. We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch. This improves the ease of use and allows reconstruction capabilities to scale in time to process surgical videos of 5,000 frames and more, an improvement of more than ten times compared to the state of the art, while being agnostic to external tracking information.
arXiv Detail & Related papers (2024-03-18T19:13:02Z)
An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network [2.752817022620644]
The potential of using machine learning for image segmentation in total knee lies in its ability to improve segmentation accuracy, automate the process, and provide real-time assistance to surgeons. This paper proposes a methodology to use deep learning for robust real-time total knee image segmentation. The deep learning model, trained on a large dataset, demonstrates outstanding performance in accurately segmenting both the implanted femur and tibia.
arXiv Detail & Related papers (2024-01-23T05:00:02Z)
Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer [14.568834378003707]
Phacoemulsification cataract surgery (PCS) is a routine procedure using a surgical microscope. PCS guidance systems extract valuable information from surgical microscopic videos to enhance proficiency. Existing PCS guidance systems suffer from non-phase-specific guidance, leading to redundant visual information. We propose a novel phase-specific augmented reality (AR) guidance system, which offers tailored AR information corresponding to the recognized surgical phase.
arXiv Detail & Related papers (2023-09-11T02:56:56Z)
3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation. Our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, colon cancer segmentation, and achieve similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z)
Intra-operative Brain Tumor Detection with Deep Learning-Optimized Hyperspectral Imaging [37.21885467891782]
Surgery for gliomas (intrinsic brain tumors) is challenging due to the infiltrative nature of the lesion. No real-time, intra-operative, label-free and wide-field tool is available to assist and guide the surgeon to find the relevant demarcations for these tumors. We build a deep-learning-based diagnostic tool for cancer resection with potential for intra-operative guidance.
arXiv Detail & Related papers (2023-02-06T15:52:03Z)
A Temporal Learning Approach to Inpainting Endoscopic Specularities and Its effect on Image Correspondence [13.25903945009516]
We propose using a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities. This is achieved using in-vivo data of gastric endoscopy (Hyper-Kvasir) in a fully unsupervised manner. We also assess the effect of our method in computer vision tasks that underpin 3D reconstruction and camera motion estimation.
arXiv Detail & Related papers (2022-03-31T13:14:00Z)
Towards Robotic Knee Arthroscopy: Multi-Scale Network for Tissue-Tool Segmentation [0.0]
We present a densely connected, shape-aware, multi-scale segmentation model which captures multi-scale features and integrates shape features to achieve tissue-tool segmentations. With the publicly available polyp dataset, our proposed model achieved a 5.09% accuracy improvement.
arXiv Detail & Related papers (2021-10-06T11:20:01Z)
A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors and partial coherence frequently threatens the viability of the experiment. A modern Deep Learning framework is used to autonomously correct the setup incoherences, thus improving the quality of a ptychography reconstruction. We tested our system on both synthetic datasets and real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images. We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.