EAFP-Med: An Efficient Adaptive Feature Processing Module Based on
Prompts for Medical Image Detection
- URL: http://arxiv.org/abs/2311.15540v1
- Date: Mon, 27 Nov 2023 05:10:15 GMT
- Title: EAFP-Med: An Efficient Adaptive Feature Processing Module Based on
Prompts for Medical Image Detection
- Authors: Xiang Li, Long Lan, Husam Lahza, Shaowu Yang, Shuihua Wang, Wenjing
Yang, Hengzhu Liu, Yudong Zhang
- Abstract summary: Cross-domain adaptive medical image detection is challenging due to the differences in lesion representations across various medical imaging technologies.
We propose EAFP-Med, an efficient adaptive feature processing module based on prompts for medical image detection.
EAFP-Med can efficiently extract lesion features from various medical images based on prompts, enhancing the model's performance.
- Score: 27.783012550610387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the face of rapid advances in medical imaging, cross-domain adaptive
medical image detection is challenging due to the differences in lesion
representations across various medical imaging technologies. To address this
issue, we draw inspiration from large language models to propose EAFP-Med, an
efficient adaptive feature processing module based on prompts for medical image
detection. EAFP-Med can efficiently extract lesion features of different scales
from a diverse range of medical images based on prompts while being flexible
and not limited by specific imaging techniques. Furthermore, it serves as a
feature preprocessing module that can be connected to any model front-end to
enhance the lesion features in input images. Moreover, we propose a novel
adaptive disease detection model named EAFP-Med ST, which utilizes the Swin
Transformer V2 - Tiny (SwinV2-T) as its backbone and connects it to EAFP-Med.
We have compared our method to nine state-of-the-art methods. Experimental
results demonstrate that EAFP-Med ST achieves the best performance on all three
datasets (chest X-ray images, cranial magnetic resonance imaging images, and
skin images). EAFP-Med can efficiently extract lesion features from various
medical images based on prompts, enhancing the model's performance. This holds
significant potential for improving medical image analysis and diagnosis.
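The abstract describes EAFP-Med as a prompt-conditioned feature preprocessing module that extracts lesion features at multiple scales and can be prepended to any model front-end. The following is a minimal PyTorch sketch of that idea under stated assumptions: the class name `PromptAdaptiveFeatureModule`, the embedding-based prompt conditioning, the two-branch multi-scale convolutions, and the residual projection are all illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn


class PromptAdaptiveFeatureModule(nn.Module):
    """Illustrative prompt-conditioned feature preprocessor.

    A learned prompt embedding (one per imaging modality, e.g. chest X-ray,
    cranial MRI, skin) modulates multi-scale convolutional features of the
    input image before it is passed to a detection backbone.
    """

    def __init__(self, num_prompts=3, channels=16):
        super().__init__()
        self.prompts = nn.Embedding(num_prompts, channels)
        # Two branches with different receptive fields approximate
        # "lesion features of different scales".
        self.conv3 = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(3, channels, kernel_size=5, padding=2)
        # Project back to 3 channels so the module can be prepended to an
        # unmodified backbone (e.g. SwinV2-T) expecting RGB input.
        self.project = nn.Conv2d(2 * channels, 3, kernel_size=1)

    def forward(self, x, prompt_id):
        # Per-modality scale vector, broadcast over spatial dimensions.
        scale = self.prompts(prompt_id).view(-1, self.prompts.embedding_dim, 1, 1)
        f3 = torch.relu(self.conv3(x)) * scale
        f5 = torch.relu(self.conv5(x)) * scale
        # Residual connection: enhance lesion features without discarding
        # the original image content.
        return x + self.project(torch.cat([f3, f5], dim=1))


# Usage: the module preserves input shape, so any backbone can follow it.
module = PromptAdaptiveFeatureModule()
x = torch.randn(2, 3, 64, 64)
prompt_id = torch.tensor([0, 1])  # e.g. 0 = chest X-ray, 1 = cranial MRI
enhanced = module(x, prompt_id)   # same shape as x: (2, 3, 64, 64)
```

Because the output matches the input shape, the enhanced images can be fed directly into an off-the-shelf backbone such as SwinV2-T, which is the connection pattern the abstract describes for EAFP-Med ST.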
Related papers
- Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model [2.507050016527729]
Tri-modal medical image fusion can provide a more comprehensive view of the disease's shape, location, and biological activity.
Due to the limitations of imaging equipment and considerations for patient safety, the quality of medical images is usually limited.
There is an urgent need for a technology that can both enhance image resolution and integrate multi-modal information.
arXiv Detail & Related papers (2024-04-26T12:13:41Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- DDPM based X-ray Image Synthesizer [0.0]
We propose a Denoising Diffusion Probabilistic Model (DDPM) combined with a UNet architecture for X-ray image synthesis.
Our methodology employs over 3000 pneumonia X-ray images obtained from Kaggle for training.
Results demonstrate the effectiveness of our approach, as the model successfully generated realistic images with low Mean Squared Error (MSE).
arXiv Detail & Related papers (2024-01-03T04:35:58Z)
- FeaInfNet: Diagnosis in Medical Image with Feature-Driven Inference and Visual Explanations [4.022446255159328]
Interpretable deep learning models have received widespread attention in the field of image recognition.
Many proposed interpretability models still suffer from insufficient accuracy and interpretability in medical image disease diagnosis.
We propose feature-driven inference network (FeaInfNet) to solve these problems.
arXiv Detail & Related papers (2023-12-04T13:09:00Z)
- Unified Medical Image Pre-training in Language-Guided Common Semantic Space [39.61770813855078]
We propose an Unified Medical Image Pre-training framework, namely UniMedI.
UniMedI uses diagnostic reports as common semantic space to create unified representations for diverse modalities of medical images.
We evaluate its performance on both 2D and 3D images across 10 different datasets.
arXiv Detail & Related papers (2023-11-24T22:01:12Z)
- On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these artifacts shift the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
- Customizing General-Purpose Foundation Models for Medical Report Generation [64.31265734687182]
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs) in computer vision and natural language processing.
arXiv Detail & Related papers (2023-06-09T03:02:36Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed in specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.