Related papers: UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation

UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation

URL: http://arxiv.org/abs/2406.01154v2
Date: Fri, 21 Jun 2024 02:22:56 GMT
Title: UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation
Authors: Zehui Lin, Zhuoneng Zhang, Xindi Hu, Zhifan Gao, Xin Yang, Yue Sun, Dong Ni, Tao Tan,
Abstract summary: We propose a novel universal framework for ultrasound, namely UniUSNet. UniUSNet is a promptable framework for ultrasound image classification and segmentation. We train and validate our proposed model, and surpass both a model trained on a single dataset and an ablated version of the network lacking prompt guidance.
Score: 19.85119434049726
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Ultrasound is a widely used imaging modality in clinical practice due to its low cost, portability, and safety. Current research in general AI for healthcare focuses on large language models and general segmentation models, with insufficient attention to solutions addressing both disease prediction and tissue segmentation. In this study, we propose a novel universal framework for ultrasound, namely UniUSNet, which is a promptable framework for ultrasound image classification and segmentation. The universality of this model is derived from its versatility across various aspects. It proficiently manages any ultrasound nature, any anatomical position, any input type and excelling not only in segmentation tasks but also in classification tasks. We introduce a novel module that incorporates this information as a prompt and seamlessly embedding it within the model's learning process. To train and validate our proposed model, we curated a comprehensive ultrasound dataset from publicly accessible sources, encompassing up to 7 distinct anatomical positions with over 9.7K annotations. Experimental results demonstrate that our model achieves performance comparable to state-of-the-art models, and surpasses both a model trained on a single dataset and an ablated version of the network lacking prompt guidance. Additionally, we conducted zero-shot and fine-tuning experiments on new datasets, which proved that our model possesses strong generalization capabilities and can be effectively adapted to new data at low cost through its adapter module. We will continuously expand the dataset and optimize the task specific prompting mechanism towards the universality in medical ultrasound. Model weights, data processing workflows, and code will be open source to the public (https://github.com/Zehui-Lin/UniUSNet).

Related papers

GroundingDINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models [2.089191490381739]
We propose a prompt-driven vision-language model (VLM) that integrates Grounding DINO with SAM2 to enable object segmentation across multiple ultrasound organs.<n>A total of 18 public ultrasound datasets, encompassing the breast, thyroid, liver, prostate, kidney, and paraspinal muscle, were utilized.<n>Our approach outperforms state-of-the-art segmentation methods, including UniverSeg, MedSAM, MedCLIP-SAM, BiomedParse, and SAMUS.
arXiv Detail & Related papers (2025-06-30T14:33:44Z)
Interpreting Biomedical VLMs on High-Imbalance Out-of-Distributions: An Insight into BiomedCLIP on Radiology [0.0]
We analyse the limitations of BiomedCLIP when applied to a highly imbalanced, out-of-distribution medical dataset.<n>We show that the model under zero-shot settings over-predicts all labels, leading to poor precision and inter-class separability.<n>We highlight the need for careful adaptations of the models to foster reliability and applicability in a real-world setting.
arXiv Detail & Related papers (2025-06-17T02:59:42Z)
CAVE-Net: Classifying Abnormalities in Video Capsule Endoscopy [0.1937002985471497]
We propose an ensemble-based approach to improve diagnostic accuracy in analyzing complex image datasets. We leverage the unique feature extraction capabilities of each model to enhance the overall accuracy. By using these methods, the proposed framework, CAVE-Net, provides robust feature discrimination and improved classification results.
arXiv Detail & Related papers (2024-10-26T17:25:08Z)
UNICORN: A Deep Learning Model for Integrating Multi-Stain Data in Histopathology [2.9389205138207277]
UNICORN is a multi-modal transformer capable of processing multi-stain histopathology for atherosclerosis severity class prediction. The architecture comprises a two-stage, end-to-end trainable model with specialized modules utilizing transformer self-attention blocks. UNICORN achieved a classification accuracy of 0.67, outperforming other state-of-the-art models.
arXiv Detail & Related papers (2024-09-26T12:13:52Z)
A novel open-source ultrasound dataset with deep learning benchmarks for spinal cord injury localization and anatomical segmentation [1.02101998415327]
We present an ultrasound dataset of 10,223-mode (B-mode) images consisting of sagittal slices of porcine spinal cords. We benchmark the performance metrics of several state-of-the-art object detection algorithms to localize the site of injury. We evaluate the zero-shot generalization capabilities of the segmentation models on human ultrasound spinal cord images.
arXiv Detail & Related papers (2024-09-24T20:22:59Z)
Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development [59.74920439478643]
In this paper, we collect and annotated the first benchmark dataset that covers diverse ERUS scenarios. Our ERUS-10K dataset comprises 77 videos and 10,000 high-resolution annotated frames. We introduce a benchmark model for colorectal cancer segmentation, named the Adaptive Sparse-context TRansformer (ASTR)
arXiv Detail & Related papers (2024-08-19T15:04:42Z)
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography [50.08496922659307]
We propose a universal framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes. Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models. Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors.
arXiv Detail & Related papers (2024-05-28T16:55:15Z)
WATUNet: A Deep Neural Network for Segmentation of Volumetric Sweep Imaging Ultrasound [1.2903292694072621]
Volume sweep imaging (VSI) is an innovative approach that enables untrained operators to capture quality ultrasound images. We present a novel segmentation model known as Wavelet_Attention_UNet (WATUNet) In this model, we incorporate wavelet gates (WGs) and attention gates (AGs) between the encoder and decoder instead of a simple connection to overcome the limitations mentioned.
arXiv Detail & Related papers (2023-11-17T20:32:37Z)
Detecting Speech Abnormalities with a Perceiver-based Sequence Classifier that Leverages a Universal Speech Model [4.503292461488901]
We propose a Perceiver-based sequence to detect abnormalities in speech reflective of several neurological disorders. We combine this sequence with a Universal Speech Model (USM) that is trained (unsupervised) on 12 million hours of diverse audio recordings. Our model outperforms standard transformer (80.9%) and perceiver (81.8%) models and achieves an average accuracy of 83.1%.
arXiv Detail & Related papers (2023-10-16T21:07:12Z)
Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection [36.08551407926805]
We propose the CLIP-Driven Universal Model, which incorporates text embedding learned from Contrastive Language-Image Pre-training to segmentation models. The proposed model is developed from an assembly of 14 datasets, using a total of 3,410 CT scans for training and then evaluated on 6,162 external CT scans from 3 additional datasets.
arXiv Detail & Related papers (2023-01-02T18:07:44Z)
Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation. The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristic across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z)
G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers. We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
Weakly supervised multiple instance learning histopathological tumor segmentation [51.085268272912415]
We propose a weakly supervised framework for whole slide imaging segmentation. We exploit a multiple instance learning scheme for training models. The proposed framework has been evaluated on multi-locations and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset.
arXiv Detail & Related papers (2020-04-10T13:12:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.