Deep Learning-Based Semantic Segmentation for Real-Time Kidney Imaging and Measurements with Augmented Reality-Assisted Ultrasound
- URL: http://arxiv.org/abs/2506.23721v1
- Date: Mon, 30 Jun 2025 10:49:54 GMT
- Title: Deep Learning-Based Semantic Segmentation for Real-Time Kidney Imaging and Measurements with Augmented Reality-Assisted Ultrasound
- Authors: Gijs Luijten, Roberto Maria Scardigno, Lisle Faray de Paiva, Peter Hoyer, Jens Kleesiek, Domenico Buongiorno, Vitoantonio Bevilacqua, Jan Egger
- Abstract summary: We integrate deep learning (DL)-based semantic segmentation for real-time (RT) automated kidney volumetric measurements. Augmented reality (AR) enhances the usability of ultrasound (US) by projecting the display directly into the clinician's field of view. Our open-source GitHub pipeline includes model implementations, measurement algorithms, and a Wi-Fi-based streaming solution.
- Score: 1.7713240943169457
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Ultrasound (US) is widely accessible and radiation-free but has a steep learning curve due to its dynamic nature and non-standard imaging planes. Additionally, the constant need to shift focus between the US screen and the patient poses a challenge. To address these issues, we integrate deep learning (DL)-based semantic segmentation for real-time (RT) automated kidney volumetric measurements, which are essential for clinical assessment but are traditionally time-consuming and prone to fatigue. This automation allows clinicians to concentrate on image interpretation rather than manual measurements. Complementing DL, augmented reality (AR) enhances the usability of US by projecting the display directly into the clinician's field of view, improving ergonomics and reducing the cognitive load associated with screen-to-patient transitions. Two AR-DL-assisted US pipelines on HoloLens-2 are proposed: one streams directly via the application programming interface for a wireless setup, while the other supports any US device with video output for broader accessibility. We evaluate RT feasibility and accuracy using the Open Kidney Dataset and open-source segmentation models (nnU-Net, Segmenter, YOLO with MedSAM and LiteMedSAM). Our open-source GitHub pipeline includes model implementations, measurement algorithms, and a Wi-Fi-based streaming solution, enhancing US training and diagnostics, especially in point-of-care settings.
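As a concrete illustration of the volumetric-measurement step, the sketch below derives a kidney volume from binary segmentation masks of two orthogonal planes using the standard clinical prolate-ellipsoid formula V = (π/6)·L·W·D. It is a minimal, hypothetical example: the function names, the PCA-based axis extraction, and the two-plane input are assumptions, not the authors' released measurement algorithm.

```python
import numpy as np

def kidney_volume_ml(mask_longitudinal, mask_transverse, mm_per_px):
    """Hypothetical sketch: estimate kidney volume (mL) from two orthogonal
    binary segmentation masks using the prolate-ellipsoid formula
    V = (pi/6) * L * W * D. The paper's actual measurement code may differ."""
    def principal_extents_mm(mask):
        # Lengths of the mask along its principal axes, in millimetres.
        ys, xs = np.nonzero(mask)
        pts = np.stack([xs, ys], axis=1).astype(float)
        pts -= pts.mean(axis=0)
        _, vecs = np.linalg.eigh(np.cov(pts.T))        # eigenvectors of the covariance
        proj = pts @ vecs                              # rotate into the principal-axis frame
        extents = proj.max(axis=0) - proj.min(axis=0)  # [minor, major] in pixels
        return extents[::-1] * mm_per_px               # [major, minor] in mm

    length_mm, _ = principal_extents_mm(mask_longitudinal)      # L from the long-axis plane
    width_mm, depth_mm = principal_extents_mm(mask_transverse)  # W and D from the transverse plane
    volume_mm3 = np.pi / 6.0 * length_mm * width_mm * depth_mm
    return volume_mm3 / 1000.0                         # 1 mL = 1000 mm^3
```

In an RT setting, such masks would come from the segmentation model (e.g., nnU-Net) running on each streamed frame, with the pixel spacing taken from the US device's calibration.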
Related papers
- EasyREG: Easy Depth-Based Markerless Registration and Tracking using Augmented Reality Device for Surgical Guidance [10.156036566483888]
We present a markerless framework that relies only on the depth sensor of AR devices. The registration module integrates depth sensor error correction, a human-in-the-loop region filtering technique, and a robust global alignment. The tracking module employs a fast and robust registration algorithm that uses the initial pose from the registration module to estimate the target pose in real time.
arXiv Detail & Related papers (2025-04-13T09:48:33Z)
- MEDDAP: Medical Dataset Enhancement via Diversified Augmentation Pipeline [1.4910709350090976]
We introduce a novel pipeline called MEDDAP to augment existing small datasets by automatically generating new informative labeled samples.
USLoRA allows for selective fine-tuning of weights within SD, requiring fewer than 0.1% of parameters compared to fully fine-tuning only the UNet portion of SD.
This approach is inspired by clinicians' decision-making processes regarding breast tumors, where tumor shape often plays a more crucial role than intensity.
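To make the parameter-efficiency claim concrete, here is a minimal LoRA-style adapter in PyTorch: the pretrained weight is frozen and only a low-rank update B·A is trained. This is a generic illustration, not USLoRA's actual code; the rank, scaling, and choice of which SD layers to wrap are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Generic LoRA adapter around a frozen nn.Linear: the low-rank update
    adds only r * (in_features + out_features) trainable parameters."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                       # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

# For a 320x320 projection (102,400 frozen weights), a rank-4 adapter trains only
# 4 * (320 + 320) = 2,560 parameters; summed over a large UNet, the trainable
# fraction becomes very small.
```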
arXiv Detail & Related papers (2024-03-25T00:17:43Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Training-free image style alignment for self-adapting domain shift on handheld ultrasound devices [54.476120039032594]
We propose the Training-free Image Style Alignment (TISA) framework to align the style of handheld device data to that of standard devices.
TISA can directly infer handheld device images without extra training and is suited for clinical applications.
arXiv Detail & Related papers (2024-02-17T07:15:23Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method consists of an instrument pose estimation method, an online registration between the robotic and the iOCT system, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
- Transformer Lesion Tracker [12.066026343488453]
We propose a transformer-based approach, termed Transformer Lesion Tracker (TLT).
We design a Cross Attention-based Transformer (CAT) to capture and combine both global and local information to enhance feature extraction.
We conduct experiments on a public dataset to show the superiority of our method and find that our model improves the average Euclidean center error by at least 14.3%.
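The global/local fusion idea can be illustrated with a generic cross-attention block in which local query tokens attend to global key/value tokens; this is a simplified stand-in for TLT's CAT module, and the dimensions and layer layout below are assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Generic cross-attention fusion: local features (queries) attend to
    global features (keys/values), followed by a residual feed-forward block."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, local_tokens, global_tokens):
        # local_tokens: (B, N_local, dim); global_tokens: (B, N_global, dim)
        fused, _ = self.attn(query=local_tokens, key=global_tokens, value=global_tokens)
        x = self.norm1(local_tokens + fused)   # inject global context into local features
        return self.norm2(x + self.ffn(x))
```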
arXiv Detail & Related papers (2022-06-13T15:35:24Z)
- Unsupervised multi-latent space reinforcement learning framework for video summarization in ultrasound imaging [0.0]
The COVID-19 pandemic has highlighted the need for a tool to speed up triage in ultrasound scans.
The proposed video-summarization technique is a step in this direction.
We propose a new unsupervised reinforcement learning framework with novel rewards.
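The abstract does not spell out the rewards; as background, unsupervised RL video summarizers commonly combine a diversity reward with a representativeness reward over the selected frames, as in the sketch below (illustrative only, not this paper's specific novel reward design).

```python
import torch

def diversity_reward(features, selected):
    """Mean pairwise dissimilarity of the selected frames' features;
    a standard diversity reward, not necessarily this paper's novel one."""
    sel = features[selected]                      # (k, d) features of the picked frames
    if sel.shape[0] < 2:
        return torch.tensor(0.0)
    sel = sel / sel.norm(dim=1, keepdim=True)
    sim = sel @ sel.T                             # cosine similarity matrix
    k = sel.shape[0]
    off_diag = sim.sum() - sim.diagonal().sum()
    return 1.0 - off_diag / (k * (k - 1))         # higher when selected frames differ more

def representativeness_reward(features, selected):
    """Rewards selections whose nearest chosen frame is close to every frame."""
    dists = torch.cdist(features, features[selected])   # (n_frames, k)
    return torch.exp(-dists.min(dim=1).values.mean())
```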
arXiv Detail & Related papers (2021-09-03T04:50:35Z)
- SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based Augmented Reality for Surgical Guidance [18.060445966264727]
SurgeonAssist-Net is a framework making action-and-workflow-driven virtual assistance accessible to commercially available optical see-through head-mounted displays (OST-HMDs).
Our implementation competes with state-of-the-art approaches in prediction accuracy for automated task recognition.
It is capable of near real-time performance on the Microsoft HoloLens 2 OST-HMD.
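As a rough sketch of how such near-real-time task recognition can be structured (the actual SurgeonAssist-Net backbone, sequence length, and class count are assumptions here, as are all names below), a lightweight per-frame CNN feeding a GRU keeps compute small enough for an OST-HMD-oriented pipeline:

```python
import torch
import torch.nn as nn
from torchvision import models

class ActionRecognizer(nn.Module):
    """Lightweight frame-sequence recognizer: per-frame MobileNetV3 features
    pooled over time by a GRU. Illustrative only, not SurgeonAssist-Net itself."""
    def __init__(self, num_actions: int = 7, hidden: int = 128):
        super().__init__()
        backbone = models.mobilenet_v3_small(weights=None)
        backbone.classifier = nn.Identity()        # expose the 576-d pooled features
        self.backbone = backbone
        self.gru = nn.GRU(576, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, clips):                      # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.gru(feats)
        return self.head(out[:, -1])               # logits for the current frame's task
```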
arXiv Detail & Related papers (2021-07-13T21:12:34Z)
- One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video [71.43912903508765]
MDAL is a dynamic online adaptive learning scheme for instrument segmentation in robot-assisted surgery.
It learns general knowledge of instruments and fast adaptation ability through a video-specific meta-learning paradigm.
It outperforms other state-of-the-art methods on two datasets.
arXiv Detail & Related papers (2021-03-24T05:02:18Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal graph network approach (MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.