Related papers: Mitigating Overfitting in Medical Imaging: Self-Supervised Pretraining vs. ImageNet Transfer Learning for Dermatological Diagnosis

Mitigating Overfitting in Medical Imaging: Self-Supervised Pretraining vs. ImageNet Transfer Learning for Dermatological Diagnosis

URL: http://arxiv.org/abs/2505.16773v1
Date: Thu, 22 May 2025 15:15:17 GMT
Title: Mitigating Overfitting in Medical Imaging: Self-Supervised Pretraining vs. ImageNet Transfer Learning for Dermatological Diagnosis
Authors: Iván Matas, Carmen Serrano, Miguel Nogales, David Moreno, Lara Ferrándiz, Teresa Ojeda, Begoña Acha,
Abstract summary: This study introduces an unsupervised learning framework that extracts high-value dermatological features.<n>We employ a Variational Autoencoder trained from scratch on a proprietary dermatological dataset.<n>Self-supervised learning achieves steady improvements, stronger generalization, and superior adaptability.
Score: 0.5025737475817937
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning has transformed computer vision but relies heavily on large labeled datasets and computational resources. Transfer learning, particularly fine-tuning pretrained models, offers a practical alternative; however, models pretrained on natural image datasets such as ImageNet may fail to capture domain-specific characteristics in medical imaging. This study introduces an unsupervised learning framework that extracts high-value dermatological features instead of relying solely on ImageNet-based pretraining. We employ a Variational Autoencoder (VAE) trained from scratch on a proprietary dermatological dataset, allowing the model to learn a structured and clinically relevant latent space. This self-supervised feature extractor is then compared to an ImageNet-pretrained backbone under identical classification conditions, highlighting the trade-offs between general-purpose and domain-specific pretraining. Our results reveal distinct learning patterns. The self-supervised model achieves a final validation loss of 0.110 (-33.33%), while the ImageNet-pretrained model stagnates at 0.100 (-16.67%), indicating overfitting. Accuracy trends confirm this: the self-supervised model improves from 45% to 65% (+44.44%) with a near-zero overfitting gap, whereas the ImageNet-pretrained model reaches 87% (+50.00%) but plateaus at 75% (+19.05%), with its overfitting gap increasing to +0.060. These findings suggest that while ImageNet pretraining accelerates convergence, it also amplifies overfitting on non-clinically relevant features. In contrast, self-supervised learning achieves steady improvements, stronger generalization, and superior adaptability, underscoring the importance of domain-specific feature extraction in medical imaging.

Related papers

Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification [0.0]
We first pre-train the model with self-supervision to enable it to learn common feature expressions on a large amount of unlabeled data. Then fine-tune it on the few-shot dataset Mini-ImageNet to improve the model's accuracy and generalization ability under limited data.
arXiv Detail & Related papers (2024-11-19T01:01:56Z)
Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas.<n>This study aims to utilize a specially MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks [6.8948885302235325]
We introduce visual prompting (VP) to fill in the gap between input medical images and natural pretrained vision model. We design a joint learning loss function containing categorisation loss and discrepancy loss, which describe the variance of prompted and plain images. On out of distribution medical dataset, our method(90.33%) can outperform FF (85.15%) with absolute 5.18 % lead.
arXiv Detail & Related papers (2024-05-13T01:18:25Z)
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets. In particular, we employ a agnostic to select (semi) optimal learning rate and weight decay couples via the norm of model parameters. We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
arXiv Detail & Related papers (2023-09-04T16:13:59Z)
Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data. Data augmentation is a typical methodology used in machine learning when confronted with a limited data set. In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
Revisiting Hidden Representations in Transfer Learning for Medical Imaging [2.4545492329339815]
We compare ImageNet and RadImageNet on seven medical classification tasks. Our results indicate that, contrary to intuition, ImageNet and RadImageNet may converge to distinct intermediate representations. Our findings show that the similarity between networks before and after fine-tuning does not correlate with performance gains.
arXiv Detail & Related papers (2023-02-16T13:04:59Z)
Significantly improving zero-shot X-ray pathology classification via fine-tuning pre-trained image-text encoders [50.689585476660554]
We propose a new fine-tuning strategy that includes positive-pair loss relaxation and random sentence sampling. Our approach consistently improves overall zero-shot pathology classification across four chest X-ray datasets and three pre-trained models.
arXiv Detail & Related papers (2022-12-14T06:04:18Z)
Self-Supervised Learning as a Means To Reduce the Need for Labeled Data in Medical Image Analysis [64.4093648042484]
We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies. We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data.
arXiv Detail & Related papers (2022-06-01T09:20:30Z)
Application of Transfer Learning and Ensemble Learning in Image-level Classification for Breast Histopathology [9.037868656840736]
In Computer-Aided Diagnosis (CAD), traditional classification models mostly use a single network to extract features. This paper proposes a deep ensemble model based on image-level labels for the binary classification of benign and malignant lesions. Result: In the ensemble network model with accuracy as the weight, the image-level binary classification achieves an accuracy of $98.90%$.
arXiv Detail & Related papers (2022-04-18T13:31:53Z)
Core Risk Minimization using Salient ImageNet [53.616101711801484]
We introduce the Salient Imagenet dataset with more than 1 million soft masks localizing core and spurious features for all 1000 Imagenet classes. Using this dataset, we first evaluate the reliance of several Imagenet pretrained models (42 total) on spurious features. Next, we introduce a new learning paradigm called Core Risk Minimization (CoRM) whose objective ensures that the model predicts a class using its core features.
arXiv Detail & Related papers (2022-03-28T01:53:34Z)
Intelligent Masking: Deep Q-Learning for Context Encoding in Medical Image Analysis [48.02011627390706]
We develop a novel self-supervised approach that occludes targeted regions to improve the pre-training procedure. We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
arXiv Detail & Related papers (2022-03-25T19:05:06Z)
On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading. We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions. Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.