UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation
- URL: http://arxiv.org/abs/2406.01154v2
- Date: Fri, 21 Jun 2024 02:22:56 GMT
- Title: UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation
- Authors: Zehui Lin, Zhuoneng Zhang, Xindi Hu, Zhifan Gao, Xin Yang, Yue Sun, Dong Ni, Tao Tan
- Abstract summary: We propose a novel universal framework for ultrasound, namely UniUSNet.
UniUSNet is a promptable framework for ultrasound image classification and segmentation.
We train and validate the proposed model, which surpasses both a model trained on a single dataset and an ablated version of the network lacking prompt guidance.
- Score: 19.85119434049726
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ultrasound is a widely used imaging modality in clinical practice due to its low cost, portability, and safety. Current research in general AI for healthcare focuses on large language models and general segmentation models, with insufficient attention to solutions addressing both disease prediction and tissue segmentation. In this study, we propose a novel universal framework for ultrasound, namely UniUSNet, which is a promptable framework for ultrasound image classification and segmentation. The universality of this model is derived from its versatility across various aspects. It handles any ultrasound nature, any anatomical position, and any input type, and it excels not only in segmentation tasks but also in classification tasks. We introduce a novel module that incorporates this information as a prompt and seamlessly embeds it within the model's learning process. To train and validate our proposed model, we curated a comprehensive ultrasound dataset from publicly accessible sources, encompassing up to 7 distinct anatomical positions with over 9.7K annotations. Experimental results demonstrate that our model achieves performance comparable to state-of-the-art models, and surpasses both a model trained on a single dataset and an ablated version of the network lacking prompt guidance. Additionally, we conducted zero-shot and fine-tuning experiments on new datasets, which showed that our model possesses strong generalization capabilities and can be effectively adapted to new data at low cost through its adapter module. We will continuously expand the dataset and optimize the task-specific prompting mechanism toward universality in medical ultrasound. Model weights, data processing workflows, and code will be released as open source (https://github.com/Zehui-Lin/UniUSNet).
Related papers
- Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography [50.08496922659307]
We propose a universal framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes.
Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models.
Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors.
arXiv Detail & Related papers (2024-05-28T16:55:15Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- VISION-MAE: A Foundation Model for Medical Image Segmentation and Classification [36.8105960525233]
We present a novel foundation model, VISION-MAE, specifically designed for medical imaging.
VISION-MAE is trained on a dataset of 2.5 million unlabeled images from various modalities.
It is then adapted to classification and segmentation tasks using explicit labels.
arXiv Detail & Related papers (2024-02-01T21:45:12Z)
- Generalizing Medical Image Representations via Quaternion Wavelet Networks [10.745453748351219]
We introduce a novel, generalizable, data- and task-agnostic framework able to extract salient features from medical images.
The proposed quaternion wavelet network (QUAVE) can be easily integrated with any pre-existing medical image analysis or synthesis task.
arXiv Detail & Related papers (2023-10-16T09:34:06Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Medical Image Segmentation Review: The success of U-Net [12.599426601722316]
U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success in all medical image modalities.
Several extensions of this network have been proposed to address the scale and complexity created by medical tasks.
We discuss the practical aspects of the U-Net model and suggest a taxonomy to categorize each network variant.
arXiv Detail & Related papers (2022-11-27T13:52:33Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.