Visual-Attribute Prompt Learning for Progressive Mild Cognitive
Impairment Prediction
- URL: http://arxiv.org/abs/2310.14158v1
- Date: Sun, 22 Oct 2023 02:49:53 GMT
- Title: Visual-Attribute Prompt Learning for Progressive Mild Cognitive
Impairment Prediction
- Authors: Luoyao Kang and Haifan Gong and Xiang Wan and Haofeng Li
- Abstract summary: We propose a transformer-based network that efficiently extracts and fuses the multi-modal features with prompt fine-tuning.
In detail, we first pre-train the VAP-Former without prompts on the AD diagnosis task and then fine-tune the model on the pMCI detection task with PT.
Next, we propose a novel global prompt token for the visual prompts to provide global guidance to the multi-modal representations.
- Score: 27.261602207491244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) has been used in the automatic diagnosis of Mild Cognitive
Impairment (MCI) and Alzheimer's Disease (AD) with brain imaging data. However,
previous methods have not fully exploited the relation between brain image and
clinical information that is widely adopted by experts in practice. To exploit
the heterogeneous features from imaging and tabular data simultaneously, we
propose the Visual-Attribute Prompt Learning-based Transformer (VAP-Former), a
transformer-based network that efficiently extracts and fuses the multi-modal
features with prompt fine-tuning. Furthermore, we propose a Prompt fine-Tuning
(PT) scheme to transfer knowledge from the AD prediction task to progressive
MCI (pMCI) diagnosis. In detail, we first pre-train the VAP-Former without
prompts on the AD diagnosis task and then fine-tune the model on the pMCI
detection task with PT, which only needs to optimize a small number of
parameters while keeping the backbone frozen. Next, we propose a novel global
prompt token for the visual prompts to provide global guidance to the
multi-modal representations. Extensive experiments not only show the
superiority of our method compared with the state-of-the-art methods in pMCI
prediction but also demonstrate that the global prompt can make the prompt
learning process more effective and stable. Interestingly, the proposed prompt
learning model even outperforms the full fine-tuning baseline at transferring
knowledge from AD to pMCI.
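The prompt fine-tuning (PT) scheme above can be illustrated with a minimal sketch. This is not the authors' implementation; it is a plain-NumPy stand-in under stated assumptions: the pre-trained backbone is reduced to a fixed projection matrix, and the only "trainable" parameters are the prompt tokens (including one global prompt token) prepended to the fused visual and tabular token sequence. All names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                 # embedding dimension (hypothetical)
n_img, n_tab = 8, 4    # visual / tabular token counts (hypothetical)
n_prompt = 2           # ordinary learnable prompt tokens

# Frozen backbone weights: pre-trained on the AD diagnosis task,
# then kept fixed during pMCI fine-tuning.
W_frozen = rng.standard_normal((d, d))

# Trainable parameters under PT: prompt tokens only.
global_prompt = rng.standard_normal((1, d))        # one global prompt token
layer_prompts = rng.standard_normal((n_prompt, d))

def forward(img_tokens, tab_tokens):
    # Prepend the prompts to the fused multi-modal sequence; the global
    # prompt sits alongside all tokens to provide global guidance.
    seq = np.concatenate([global_prompt, layer_prompts, img_tokens, tab_tokens])
    return seq @ W_frozen  # frozen transformer stand-in

img = rng.standard_normal((n_img, d))
tab = rng.standard_normal((n_tab, d))
out = forward(img, tab)

trainable = global_prompt.size + layer_prompts.size
frozen = W_frozen.size
print(out.shape)          # (1 + 2 + 8 + 4, 16) = (15, 16)
print(trainable, frozen)  # 48 trainable vs. 256 frozen
```

The point of the sketch is the parameter budget: fine-tuning touches only the prompt tokens (48 values here) while the backbone (256 values) stays frozen, which is why PT optimizes a small number of parameters relative to full fine-tuning.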
Related papers
- PGAD: Prototype-Guided Adaptive Distillation for Multi-Modal Learning in AD Diagnosis [4.455792848101014]
Missing modalities pose a major issue in Alzheimer's Disease (AD) diagnosis.
Most existing methods train only on complete data, ignoring the large proportion of incomplete samples in real-world datasets like ADNI.
We propose a Prototype-Guided Adaptive Distillation (PGAD) framework that directly incorporates incomplete multi-modal data into training.
arXiv Detail & Related papers (2025-03-05T14:39:31Z)
- Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models [30.044545011553172]
This paper proposes Brain-Adapter, a novel approach that incorporates an extra bottleneck layer to learn new knowledge and instill it into the original pre-trained knowledge.
Experiments demonstrated the effectiveness of our approach in integrating multimodal data to significantly improve the diagnosis accuracy without high computational costs.
arXiv Detail & Related papers (2025-01-27T18:20:49Z)
- Adversarial Prompt Distillation for Vision-Language Models [25.07001647341082]
Large pre-trained Vision-Language Models (VLMs) have been shown to be susceptible to adversarial attacks.
One promising approach for improving the robustness of pre-trained VLMs is Adversarial Prompt Tuning (APT).
We propose a novel method called Adversarial Prompt Distillation (APD) that combines APT with knowledge distillation to boost the adversarial robustness of CLIP.
arXiv Detail & Related papers (2024-11-22T03:02:13Z)
- MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder [26.830574964308962]
We introduce MedFLIP, a Fast Language-Image Pre-training method for Medical analysis.
We explore MAEs for zero-shot learning with crossed domains, which enhances the model's ability to learn from limited data.
Lastly, we validate that using language improves zero-shot performance for medical image analysis.
arXiv Detail & Related papers (2024-03-07T16:11:43Z)
- Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning [13.964106147449051]
Existing solutions concentrate on fine-tuning the pre-trained models on conventional image datasets.
We propose a novel and effective framework based on learning Visual Prompts (VPT) in pre-trained Vision Transformers (ViT).
We demonstrate that our new approximations with semantic information offer superior representative capabilities.
arXiv Detail & Related papers (2024-02-04T04:42:05Z)
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Diffusion Model as Representation Learner [86.09969334071478]
Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive results on various generative tasks.
We propose a novel knowledge transfer method that leverages the knowledge acquired by DPMs for recognition tasks.
arXiv Detail & Related papers (2023-08-21T00:38:39Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines federated learning (FL) and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
- About Explicit Variance Minimization: Training Neural Networks for Medical Imaging With Limited Data Annotations [2.3204178451683264]
The Variance Aware Training (VAT) method introduces a variance error term into the model loss function.
We validate VAT on three medical imaging datasets from diverse domains and various learning objectives.
arXiv Detail & Related papers (2021-05-28T21:34:04Z)
- A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
- Automatic Data Augmentation via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation [57.78765460295249]
We develop a novel automatic learning-based data augmentation method for medical image segmentation.
In our method, we innovatively combine the data augmentation module and the subsequent segmentation module in an end-to-end training manner with a consistent loss.
We extensively evaluated our method on CT kidney tumor segmentation, which validated its promising results.
arXiv Detail & Related papers (2020-02-22T14:10:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.