Less Could Be Better: Parameter-efficient Fine-tuning Advances Medical Vision Foundation Models
- URL: http://arxiv.org/abs/2401.12215v1
- Date: Mon, 22 Jan 2024 18:59:07 GMT
- Title: Less Could Be Better: Parameter-efficient Fine-tuning Advances Medical Vision Foundation Models
- Authors: Chenyu Lian, Hong-Yu Zhou, Yizhou Yu, Liansheng Wang
- Abstract summary: The effectiveness of PEFT on medical vision foundation models is still unclear.
We set a new state of the art on a range of data-efficient learning tasks, such as an AUROC of 80.6% using 1% labeled data on NIH ChestX-ray14.
We hope this study draws more attention from the community to the use of PEFT for transfer learning on medical imaging tasks.
- Score: 71.18275399694689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-efficient fine-tuning (PEFT), initially developed for adapting pre-trained large language models, has recently emerged as an effective approach to transfer learning on computer vision tasks. However, the effectiveness of PEFT on medical vision foundation models remains unclear and has yet to be explored. As a proof of concept, we conducted a detailed empirical study on applying PEFT to chest radiography foundation models. Specifically, we delved into LoRA, a representative PEFT method, and compared it against full-parameter fine-tuning (FFT) on two self-supervised radiography foundation models across three well-established chest radiograph datasets. Our results showed that LoRA outperformed FFT in 13 out of 18 transfer learning tasks, by up to 2.9%, while tuning fewer than 1% of the parameters. Combining LoRA with foundation models, we set a new state of the art on a range of data-efficient learning tasks, such as an AUROC of 80.6% using 1% labeled data on NIH ChestX-ray14. We hope this study draws more attention from the community to the use of PEFT for transfer learning on medical imaging tasks. Code and models are available at https://github.com/RL4M/MED-PEFT.
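To make the mechanism in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of a LoRA layer. It illustrates the technique the paper studies, not the authors' MED-PEFT code; the class name, rank, and scaling defaults are assumptions. The pretrained weight is frozen and only the low-rank factors A and B are trained, which is why the tunable-parameter count stays so small.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a pretrained nn.Linear with a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights
        # Low-rank factors: A maps down to rank r, B maps back up.
        # B starts at zero, so the wrapped layer initially equals the base layer.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled, trainable low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example: wrap one 768-dim projection (as in a ViT attention block) and count
# the trainable fraction; adapting only a few projections per block keeps the
# model-wide trainable fraction below 1%.
layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable}/{total} ({100 * trainable / total:.2f}%)")
```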
Related papers
- Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas.
This study aims to utilize a special MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
- Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging [1.533133219129073]
Data scarcity is a major limiting factor for applying modern machine learning techniques to clinical tasks.
We conducted a benchmark study of few-shot learning and zero-shot learning using 16 pretrained foundation models on 19 diverse medical imaging datasets.
Our results indicate that BiomedCLIP, a model pretrained exclusively on medical data, performs best on average for very small training set sizes.
arXiv Detail & Related papers (2024-08-15T09:55:51Z)
- Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification [16.070261684997362]
Fine-tuning pre-trained models for various downstream tasks is a critical problem in the medical imaging domain.
The large size of these models necessitates the use of parameter-efficient fine-tuning (PEFT) to reduce the communication burden in federated learning.
In this work, we investigate various federated PEFT strategies for adapting a Vision Transformer (ViT) model for medical image classification.
arXiv Detail & Related papers (2024-07-16T10:28:50Z)
- Robust and Explainable Framework to Address Data Scarcity in Diagnostic Imaging [6.744847405966574]
We introduce a novel ensemble framework called the 'Efficient Transfer and Self-supervised Learning based Ensemble Framework' (ETSEF).
ETSEF leverages features from multiple pre-trained deep learning models to efficiently learn powerful representations from a limited number of data samples.
Five independent medical imaging tasks, including endoscopy, breast cancer, monkeypox, brain tumour, and glaucoma detection, were tested to demonstrate ETSEF's effectiveness and robustness.
arXiv Detail & Related papers (2024-07-09T05:48:45Z)
- ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts [52.1635661239108]
We introduce ExPLoRA, a highly effective technique to improve transfer learning of pre-trained vision transformers (ViTs) under domain shifts.
Our experiments demonstrate state-of-the-art results on satellite imagery, even outperforming fully pre-trained and fine-tuned ViTs.
arXiv Detail & Related papers (2024-06-16T15:14:56Z)
- ReFT: Representation Finetuning for Language Models [74.51093640257892]
We develop a family of Representation Finetuning (ReFT) methods.
ReFTs operate on a frozen base model and learn task-specific interventions on hidden representations (see the sketch after this list).
We showcase LoReFT on eight commonsense reasoning tasks, four arithmetic reasoning tasks, instruction-tuning, and GLUE.
arXiv Detail & Related papers (2024-04-04T17:00:37Z)
- BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis [42.917164607812886]
Vision-and-language (V&L) models take image and text as input and learn to capture the associations between them.
BERTHop is a transformer-based model built on PixelHop++ and VisualBERT that better captures the associations between the two modalities.
arXiv Detail & Related papers (2021-08-10T21:51:25Z)
- A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages that train accurate diagnosis models by learning knowledge from multiple source tasks and data from different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
- Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best-performing model achieved an accuracy of 0.893 and a recall of 0.897, outperforming the baseline recall by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
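As referenced in the ReFT entry above, here is a minimal, hypothetical PyTorch sketch of a LoReFT-style intervention on a frozen model's hidden representation, h' = h + R^T(Wh + b - Rh). The class name, rank, and hook-based attachment are assumptions for illustration; this is not the paper's released code.

```python
import torch
import torch.nn as nn

class LoReFTIntervention(nn.Module):
    """Low-rank edit of a hidden state h: h' = h + R^T (W h + b - R h)."""
    def __init__(self, hidden_dim: int, rank: int = 4):
        super().__init__()
        # R spans the low-rank subspace in which h is edited (the paper keeps
        # its rows orthonormal; here they are only initialized that way).
        self.R = nn.Parameter(torch.empty(rank, hidden_dim))
        nn.init.orthogonal_(self.R)
        self.W = nn.Linear(hidden_dim, rank)  # learned projection W h + b

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        delta = self.W(h) - h @ self.R.T  # difference inside the subspace
        return h + delta @ self.R         # project the edit back to h's space

# The base model stays frozen; only the intervention's parameters are trained,
# typically attached via forward hooks at selected layers and token positions.
```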
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.