eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes
- URL: http://arxiv.org/abs/2504.21562v1
- Date: Wed, 30 Apr 2025 12:06:56 GMT
- Title: eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes
- Authors: Henry John Krumb, Anirban Mukhopadhyay
- Abstract summary: Wireless Capsule Endoscopy is a pain-free alternative to traditional endoscopy. Techniques like bleeding detection and depth estimation can help with localization of pathologies, but deep learning models are typically too large to run directly on the capsule. We distill a large foundation model into the lean NCA architecture by treating the outputs of the foundation model as pseudo ground truth. We then port the trained NCA to the ESP32 microcontroller, enabling efficient image processing on hardware as small as a camera capsule.
- Score: 1.3270838622986498
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Wireless Capsule Endoscopy is a non-invasive imaging method for the entire gastrointestinal tract, and is a pain-free alternative to traditional endoscopy. It generates extensive video data that requires significant review time, and localizing the capsule after ingestion is a challenge. Techniques like bleeding detection and depth estimation can help with localization of pathologies, but deep learning models are typically too large to run directly on the capsule. Neural Cellular Automata (NCA) for bleeding segmentation and depth estimation are trained on capsule endoscopic images. For monocular depth estimation, we distill a large foundation model into the lean NCA architecture by treating the outputs of the foundation model as pseudo ground truth. We then port the trained NCA to the ESP32 microcontroller, enabling efficient image processing on hardware as small as a camera capsule. NCA are more accurate (Dice) than other portable segmentation models, while requiring more than 100x fewer parameters stored in memory than other small-scale models. The visual results of NCA depth estimation look convincing, and in some cases beat the realism and detail of the pseudo ground truth. Runtime optimizations on the ESP32-S3 accelerate the average inference speed by more than a factor of 3. With several algorithmic adjustments and distillation, it is possible to eNCApsulate NCA models into microcontrollers that fit into wireless capsule endoscopes. This is the first work that enables reliable bleeding segmentation and depth estimation on a miniaturized device, paving the way for precise diagnosis combined with visual odometry as a means of precise localization of the capsule -- on the capsule.
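The abstract reports segmentation accuracy as a Dice score. As a rough illustration of that metric only (not the authors' evaluation code), the following sketch computes Dice for two flat binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks of 0/1 values.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(round(dice_coefficient(predicted, reference), 4))
```

A score of 1.0 means the predicted and reference masks overlap completely, and 0.0 means they share no foreground pixels.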
Related papers
- Enhanced Anomaly Detection for Capsule Endoscopy Using Ensemble Learning Strategies [0.8824955686704116]
This work introduces an ensemble strategy to address the challenge in anomaly detection tasks in video capsule endoscopies. We propose using various loss functions, drawn from the anomaly detection field, to train each network. We achieve an AUC score of 76.86% on the Kvasir-Capsule and an AUC score of 76.98% on the Galar dataset.
arXiv Detail & Related papers (2025-04-08T13:39:39Z)
- KaLDeX: Kalman Filter based Linear Deformable Cross Attention for Retina Vessel Segmentation [46.57880203321858]
We propose a novel network (KaLDeX) for vascular segmentation leveraging a Kalman filter based linear deformable cross attention (LDCA) module.
Our approach is based on two key components: Kalman filter (KF) based linear deformable convolution (LD) and cross-attention (CA) modules.
The proposed method is evaluated on retinal fundus image datasets (DRIVE, CHASE_BD1, and STARE) as well as the 3 mm and 6 mm subsets of the OCTA-500 dataset.
arXiv Detail & Related papers (2024-10-28T16:00:42Z)
- Enhancing Angular Resolution via Directionality Encoding and Geometric Constraints in Brain Diffusion Tensor Imaging [70.66500060987312]
Diffusion-weighted imaging (DWI) is a type of Magnetic Resonance Imaging (MRI) technique sensitised to the diffusivity of water molecules.
This work proposes DirGeo-DTI, a deep learning-based method to estimate reliable DTI metrics even from a set of DWIs acquired with the minimum theoretical number (6) of gradient directions.
arXiv Detail & Related papers (2024-09-11T11:12:26Z)
- Detection of Intracranial Hemorrhage for Trauma Patients [1.0074894923170512]
We propose a novel Voxel-Complete IoU (VC-IoU) loss that encourages the network to learn the 3D aspect ratios of bounding boxes.
We extensively experiment on brain bleeding detection using a publicly available dataset, and validate it on a private cohort.
arXiv Detail & Related papers (2024-08-20T12:03:59Z)
- Weakly-supervised positional contrastive learning: application to cirrhosis classification [45.63061034568991]
Large medical imaging datasets can be cheaply annotated with low-confidence, weak labels.
Access to high-confidence labels, such as histology-based diagnoses, is rare and costly.
We propose an efficient weakly-supervised positional (WSP) contrastive learning strategy.
arXiv Detail & Related papers (2023-07-10T15:02:13Z)
- AttResDU-Net: Medical Image Segmentation Using Attention-based Residual Double U-Net [0.0]
This paper proposes an attention-based residual Double U-Net architecture (AttResDU-Net) that improves on the existing medical image segmentation networks.
We conducted experiments on three datasets (CVC Clinic-DB, ISIC 2018, and the 2018 Data Science Bowl) and achieved Dice coefficient scores of 94.35%, 91.68%, and 92.45%, respectively.
arXiv Detail & Related papers (2023-06-25T14:28:08Z)
- Hepatic vessel segmentation based on 3Dswin-transformer with inductive biased multi-head self-attention [46.46365941681487]
We propose a robust end-to-end vessel segmentation network called Inductive BIased Multi-Head Attention Vessel Net.
We introduce the voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels.
On the other hand, we propose inductive biased multi-head self-attention which learns inductive biased relative positional embedding from absolute position embedding.
arXiv Detail & Related papers (2021-11-05T10:17:08Z)
- NanoNet: Real-Time Polyp Segmentation in Video Capsule Endoscopy and Colonoscopy [0.6125117548653111]
We propose NanoNet, a novel architecture for the segmentation of video capsule endoscopy and colonoscopy images.
Our proposed architecture allows real-time performance and has higher segmentation accuracy compared to other more complex ones.
arXiv Detail & Related papers (2021-04-22T15:40:28Z)
- Lesion2Vec: Deep Metric Learning for Few-Shot Multiple Lesions Recognition in Wireless Capsule Endoscopy Video [0.0]
Wireless Capsule Endoscopy (WCE) has revolutionized the traditional endoscopy procedure by allowing gastroenterologists to visualize the entire GI tract non-invasively.
A single video can last up to 8 hours, producing between 30,000 and 100,000 images.
We propose a metric-based learning framework followed by a few-shot lesion recognition in WCE data.
arXiv Detail & Related papers (2021-01-11T23:58:56Z)
- Weakly-supervised Learning For Catheter Segmentation in 3D Frustum Ultrasound [74.22397862400177]
We propose a novel Frustum ultrasound based catheter segmentation method.
The proposed method achieved state-of-the-art performance with an efficiency of 0.25 seconds per volume.
arXiv Detail & Related papers (2020-10-19T13:56:22Z)
- Deep Q-Network-Driven Catheter Segmentation in 3D US by Hybrid Constrained Semi-Supervised Learning and Dual-UNet [74.22397862400177]
We propose a novel catheter segmentation approach, which requests fewer annotations than the supervised learning method.
Our scheme uses deep Q-learning as the pre-localization step, which avoids voxel-level annotation.
With the detected catheter, patch-based Dual-UNet is applied to segment the catheter in 3D volumetric data.
arXiv Detail & Related papers (2020-06-25T21:10:04Z)
- Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE) [50.65891535040752]
We propose a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices encoded with different diffusion gradients.
We also present a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data.
arXiv Detail & Related papers (2020-02-25T14:48:17Z)