Rethinking Transfer Learning for Medical Image Classification
- URL: http://arxiv.org/abs/2106.05152v8
- Date: Sun, 26 May 2024 19:45:01 GMT
- Title: Rethinking Transfer Learning for Medical Image Classification
- Authors: Le Peng, Hengyue Liang, Gaoxiang Luo, Taihui Li, Ju Sun
- Abstract summary: Transfer learning (TL) from pretrained deep models is a standard practice in modern medical image classification (MIC).
In this paper, we add one more strategy into this family, called TruncatedTL, which reuses and finetunes appropriate bottom layers and directly discards the remaining layers.
This yields not only superior MIC performance but also compact models for efficient inference, compared to other differential TL methods.
- Score: 2.9161778726049525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning (TL) from pretrained deep models is a standard practice in modern medical image classification (MIC). However, what levels of features to be reused are problem-dependent, and uniformly finetuning all layers of pretrained models may be suboptimal. This insight has partly motivated the recent differential TL strategies, such as TransFusion (TF) and layer-wise finetuning (LWFT), which treat the layers in the pretrained models differentially. In this paper, we add one more strategy into this family, called TruncatedTL, which reuses and finetunes appropriate bottom layers and directly discards the remaining layers. This yields not only superior MIC performance but also compact models for efficient inference, compared to other differential TL methods. Our code is available at: https://github.com/sun-umn/TTL
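As a rough illustration of the truncation idea, the sketch below builds a truncated classifier from a torchvision ResNet-50; the cut point `truncate_after` and the two-class head are illustrative assumptions, not the paper's tuned choices, which are selected per problem.

```python
# Minimal sketch of the TruncatedTL idea, assuming a torchvision ResNet-50
# backbone: keep and finetune the bottom blocks, discard everything deeper,
# and attach a fresh classification head. The cut point `truncate_after`
# and the two-class head are illustrative, not the paper's tuned choices.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

def truncated_model(truncate_after: int = 6, num_classes: int = 2) -> nn.Module:
    backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
    # children() of a ResNet: conv1, bn1, relu, maxpool, layer1..layer4, avgpool, fc
    bottom = nn.Sequential(*list(backbone.children())[:truncate_after])
    with torch.no_grad():  # dry run to infer the truncated feature width
        feat = bottom(torch.zeros(1, 3, 224, 224))
    head = nn.Sequential(
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(feat.shape[1], num_classes),
    )
    return nn.Sequential(bottom, head)  # finetune end-to-end on the MIC task
```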
Related papers
- Classifier-guided Gradient Modulation for Enhanced Multimodal Learning [50.7008456698935]
Classifier-Guided Gradient Modulation (CGGM) is a novel method that balances multimodal learning by modulating gradients under classifier guidance.
We conduct extensive experiments on four multimodal datasets: UPMC Food-101, CMU-MOSI, IEMOCAP and BraTS.
CGGM consistently outperforms all baselines and other state-of-the-art methods.
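The exact CGGM procedure is not spelled out in this summary; the sketch below only illustrates the general idea of modulating per-modality gradients, with hypothetical balance coefficients derived from unimodal losses, and should not be read as the authors' algorithm.

```python
# NOT the authors' CGGM algorithm: a generic illustration of per-modality
# gradient modulation. Balance coefficients here are hypothetical, derived
# from unimodal losses so the weaker modality keeps a larger gradient share.
def modulated_step(img_enc, txt_enc, fused_loss, img_loss, txt_loss, opt):
    opt.zero_grad()
    fused_loss.backward()  # populate gradients for both encoders
    total = img_loss.item() + txt_loss.item() + 1e-8
    w_img, w_txt = img_loss.item() / total, txt_loss.item() / total
    for p in img_enc.parameters():      # rescale the image encoder's gradients
        if p.grad is not None:
            p.grad.mul_(w_img)
    for p in txt_enc.parameters():      # rescale the text encoder's gradients
        if p.grad is not None:
            p.grad.mul_(w_txt)
    opt.step()
```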
arXiv Detail & Related papers (2024-11-03T02:38:43Z) - SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery [54.866490321241905]
Model merging offers a promising approach to multi-task learning (MTL) by combining multiple expert models into a single model.
In this paper, we examine the merged model's representation distribution and uncover a critical issue of "representation bias".
This bias arises from a significant distribution gap between the representations of the merged and expert models, leading to the suboptimal performance of the merged MTL model.
arXiv Detail & Related papers (2024-10-18T11:49:40Z) - Chip-Tuning: Classify Before Language Models Say [25.546473157624945]
Chip-tuning is a simple and effective structured pruning framework for classification problems.
We show that chip-tuning significantly outperforms previous state-of-the-art baselines in both accuracy and pruning ratio.
We also find that chip-tuning can be applied to multimodal models and combined with model finetuning, demonstrating its broad compatibility.
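A minimal sketch of the "chip" idea, assuming a transformer-style backbone exposed as an `nn.ModuleList` of blocks; the attachment point and probe shape are illustrative, not the paper's settings.

```python
# Sketch of the "chip" idea: attach a small classifier to an intermediate
# layer of a frozen backbone and prune every deeper layer. The attachment
# point and probe shape are illustrative, not the paper's settings.
import torch.nn as nn

class ChippedModel(nn.Module):
    def __init__(self, blocks: nn.ModuleList, chip_at: int, hidden: int, num_classes: int):
        super().__init__()
        self.blocks = blocks[:chip_at]            # layers past the chip are discarded
        for p in self.blocks.parameters():
            p.requires_grad_(False)               # backbone stays frozen
        self.chip = nn.Linear(hidden, num_classes)  # only the chip is trained

    def forward(self, x):                         # x: (batch, seq_len, hidden)
        for block in self.blocks:
            x = block(x)
        return self.chip(x.mean(dim=1))           # mean-pool tokens, then classify
```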
arXiv Detail & Related papers (2024-10-09T04:35:22Z) - Inheritune: Training Smaller Yet More Attentive Language Models [61.363259848264725]
Inheritune is a simple yet effective training recipe for developing smaller, high-performing language models.
We demonstrate that Inheritune enables the training of various sizes of GPT-2 models on datasets like OpenWebText-9B and FineWeb_edu.
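A hedged sketch of the inheritance step, assuming a generic GPT-2-like module layout with `wte`/`wpe` embeddings and a `blocks` list of transformer layers; these attribute names are illustrative assumptions, not a fixed API.

```python
# Hedged sketch of the inheritance step, assuming a generic GPT-2-like layout
# with `wte` / `wpe` embeddings and a `blocks` list of transformer layers;
# these attribute names are illustrative assumptions, not a fixed API.
import copy

def inherit_submodel(large_model, small_model, k: int):
    small_model.wte = copy.deepcopy(large_model.wte)  # token embeddings
    small_model.wpe = copy.deepcopy(large_model.wpe)  # positional embeddings
    for i in range(k):                                # inherit the first k blocks
        small_model.blocks[i] = copy.deepcopy(large_model.blocks[i])
    return small_model  # then continue training on the target corpus
```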
arXiv Detail & Related papers (2024-04-12T17:53:34Z) - Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive way to fuse the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
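A minimal sketch of the layer-wise averaging experiment: interpolate only the parameter tensors belonging to chosen layers between two trained models, keeping the rest from the first model.

```python
# Minimal sketch of layer-wise averaging: interpolate only the parameter
# tensors whose names match the chosen layer prefixes, keeping the rest
# from model A. Which layers to average is the experimental variable.
import torch

@torch.no_grad()
def average_layers(model_a, model_b, layer_prefixes, alpha: float = 0.5):
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    for name in sd_a:
        if any(name.startswith(pfx) for pfx in layer_prefixes):
            sd_a[name] = (1 - alpha) * sd_a[name] + alpha * sd_b[name]
    model_a.load_state_dict(sd_a)
    return model_a
```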
arXiv Detail & Related papers (2023-07-13T09:39:10Z) - Surgical Fine-Tuning Improves Adaptation to Distribution Shifts [114.17184775397067]
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model.
This paper shows that in such settings, selectively fine-tuning a subset of layers matches or outperforms commonly used fine-tuning approaches.
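A minimal sketch of surgical fine-tuning in PyTorch: freeze every parameter, then unfreeze a single named block. The prefix below is illustrative; which block to unfreeze depends on where in the network the distribution shift manifests.

```python
# Minimal sketch of surgical fine-tuning: freeze every parameter except one
# named block. The prefix below is illustrative; which block to unfreeze
# depends on where in the network the distribution shift manifests.
def surgical_finetune(model, block_prefix: str = "layer1"):
    for name, p in model.named_parameters():
        p.requires_grad_(name.startswith(block_prefix))
    # hand only the unfrozen parameters to the optimizer
    return [p for p in model.parameters() if p.requires_grad]
```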
arXiv Detail & Related papers (2022-10-20T17:59:15Z) - Slimmable Networks for Contrastive Self-supervised Learning [69.9454691873866]
Self-supervised learning has made significant progress in pre-training large models but still struggles with small ones.
We introduce a one-stage solution for obtaining pre-trained small models without the need for extra teachers.
A slimmable network consists of a full network and several weight-sharing sub-networks, which can be pre-trained once to obtain various networks.
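A toy sketch of the weight-sharing mechanism, reduced to a single linear layer: every width ratio slices into the same shared weight tensor, so one pre-training run yields several deployable widths. Real slimmable networks apply this to convolutions, typically with separate normalization statistics per width.

```python
# Toy sketch of weight sharing in a slimmable layer: every width ratio
# slices into the same weight tensor, so one pre-training run yields
# several deployable widths. Real slimmable nets do this for convolutions,
# typically with separate batch-norm statistics per width.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor, width_ratio: float = 1.0) -> torch.Tensor:
        out = max(1, int(self.weight.shape[0] * width_ratio))
        # x may itself be slimmed, so slice the input dimension to match it
        return F.linear(x, self.weight[:out, : x.shape[-1]], self.bias[:out])
```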
arXiv Detail & Related papers (2022-09-30T15:15:05Z) - Multi-layer Clustering-based Residual Sparsifying Transform for Low-dose CT Image Reconstruction [11.011268090482575]
We propose a network-structured sparsifying transform learning approach for X-ray computed tomography (CT) reconstruction.
We apply the learned MCST model to low-dose CT reconstruction by using it as the regularizer in penalized weighted least squares (PWLS) reconstruction, as formalized below.
Our simulation results demonstrate that PWLS-MCST achieves better image reconstruction quality than the conventional FBP method and PWLS with an edge-preserving (EP) regularizer.
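For reference, PWLS reconstruction solves a problem of the following standard form, where y is the measured sinogram, A the CT system matrix, W a diagonal statistical weighting, and the learned MCST model supplies the regularizer R(x):

```latex
% Standard PWLS objective; the learned MCST model supplies R(x).
\hat{x} \;=\; \arg\min_{x \ge 0} \; \frac{1}{2}\,\| y - A x \|_{W}^{2} \;+\; \beta \, \mathsf{R}(x)
```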
arXiv Detail & Related papers (2022-03-22T09:38:41Z) - Multi-layer Residual Sparsifying Transform (MARS) Model for Low-dose CT Image Reconstruction [12.37556184089774]
We develop a new image reconstruction approach based on a novel multi-layer model learned in an unsupervised manner.
The proposed framework extends the classical sparsifying transform model for images to a Multi-lAyer Residual Sparsifying transform (MARS) model.
We derive an efficient block coordinate descent algorithm to learn the transforms across layers, in an unsupervised manner from limited regular-dose images.
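For context, the classical single-layer building block that MARS extends is commonly written as follows, where X collects vectorized image patches, W is the learned transform, Z the sparse codes, and the unitary constraint is one common choice; MARS stacks this model so that each layer sparsifies the transform-domain residual left by the previous one.

```latex
% Classical single-layer sparsifying transform learning (unitary variant);
% MARS stacks this so each layer sparsifies the previous transform residual.
\min_{W,\, Z} \; \| W X - Z \|_{F}^{2} \;+\; \gamma^{2} \, \| Z \|_{0}
\quad \text{s.t.} \quad W^{\top} W = I
```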
arXiv Detail & Related papers (2020-10-10T09:04:43Z) - Learned Multi-layer Residual Sparsifying Transform Model for Low-dose CT Reconstruction [11.470070927586017]
Sparsifying transform learning involves highly efficient sparse coding and operator update steps.
We propose a Multi-layer Residual Sparsifying Transform (MRST) learning model wherein the transform domain residuals are jointly sparsified over layers.
arXiv Detail & Related papers (2020-05-08T02:36:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.