Efficient Fine-Tuning of DINOv3 Pretrained on Natural Images for Atypical Mitotic Figure Classification in MIDOG 2025
- URL: http://arxiv.org/abs/2508.21041v1
- Date: Thu, 28 Aug 2025 17:45:22 GMT
- Title: Efficient Fine-Tuning of DINOv3 Pretrained on Natural Images for Atypical Mitotic Figure Classification in MIDOG 2025
- Authors: Guillaume Balezo, Raphaël Bourgade, Thomas Walter
- Abstract summary: Atypical mitotic figures (AMFs) are markers of abnormal cell division associated with poor prognosis. The MIDOG 2025 challenge introduces a benchmark for AMF classification across multiple domains. We evaluate the recently published DINOv3-H+ vision transformer, pretrained on natural images.
- Score: 1.7259725776748482
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Atypical mitotic figures (AMFs) are markers of abnormal cell division associated with poor prognosis, yet their detection remains difficult due to low prevalence, subtle morphology, and inter-observer variability. The MIDOG 2025 challenge introduces a benchmark for AMF classification across multiple domains. In this work, we evaluate the recently published DINOv3-H+ vision transformer, pretrained on natural images, which we fine-tuned using low-rank adaptation (LoRA, 650k trainable parameters) and extensive augmentation. Despite the domain gap, DINOv3 transfers effectively to histopathology, achieving a balanced accuracy of 0.8871 on the preliminary test set. These results highlight the robustness of DINOv3 pretraining and show that, when combined with parameter-efficient fine-tuning, it provides a strong baseline for atypical mitosis classification in MIDOG 2025.
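The abstract reports roughly 650k trainable parameters for the LoRA adapters but does not state the rank or which layers are adapted. The sketch below only illustrates how such a count arises: a LoRA adapter on a `d_in -> d_out` linear layer freezes the weight and learns two low-rank factors, adding `rank * (d_in + d_out)` parameters. The hidden size, block count, number of adapted projections, and rank used here are illustrative assumptions, not values taken from the paper.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter on a d_in -> d_out linear layer.

    LoRA keeps the weight W (d_out x d_in) frozen and learns the update B @ A,
    with A of shape (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


def total_lora_params(hidden: int, blocks: int, adapted_per_block: int, rank: int) -> int:
    """Total LoRA parameters when square hidden x hidden projections are adapted."""
    return blocks * adapted_per_block * lora_param_count(hidden, hidden, rank)


# Hypothetical configuration (NOT stated in the abstract): hidden size 1280,
# 32 transformer blocks, two adapted projections per block, rank 4.
example_total = total_lora_params(hidden=1280, blocks=32, adapted_per_block=2, rank=4)

# Share of one frozen hidden x hidden weight matrix that its adapter represents.
adapter_fraction = lora_param_count(1280, 1280, 4) / (1280 * 1280)

print(example_total)      # 655360
print(adapter_fraction)   # 0.00625
```

Under these assumed values the total lands in the same range as the reported 650k, which shows how small the adapters are relative to the frozen backbone; the actual configuration may differ.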
Related papers
- Challenging DINOv3 Foundation Model under Low Inter-Class Variability: A Case Study on Fetal Brain Ultrasound [4.07447364754644]
This study provides the first comprehensive evaluation of foundation models in fetal ultrasound (US) imaging under low interclass variability conditions.
We focus on fetal brain standard planes--transthalamic (TT), transventricular (TV), and transcerebellar (TC)--which exhibit highly overlapping anatomical features.
Models pretrained on fetal ultrasound data consistently outperformed those on natural images, with weighted F1-score improvements of up to 20 percent.
arXiv Detail & Related papers (2025-11-01T13:37:22Z)
- Parameter-efficient fine-tuning (PEFT) of Vision Foundation Models for Atypical Mitotic Figure Classification [0.0]
Atypical mitotic figures (AMFs) are rare abnormal cell divisions associated with tumor aggressiveness and poor prognosis.
The MIDOG 2025 challenge introduced a dedicated track for atypical mitosis classification.
We investigated the use of large vision foundation models, including Virchow, Virchow2, and UNI, with Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning.
arXiv Detail & Related papers (2025-09-21T05:46:54Z)
- Adaptive Learning Strategies for Mitotic Figure Classification in MIDOG2025 Challenge [7.3323821474776]
We investigated three variants of adapting the pathology foundation model UNI2 for the MIDOG2025 Track 2 challenge.
We observed that the integration of Visual Prompt Tuning (VPT) with stain normalization techniques contributed to improved generalization.
Our final submission achieved a balanced accuracy of 0.8837 and an ROC-AUC of 0.9513 on the preliminary leaderboard, ranking within the top 10 teams.
arXiv Detail & Related papers (2025-09-01T22:42:53Z)
- ConvNeXt with Histopathology-Specific Augmentations for Mitotic Figure Classification [1.398256265458105]
We propose a solution based on the lightweight ConvNeXt architecture to maximize domain coverage.
On the preliminary leaderboard, our model achieved a balanced accuracy of 0.8961, ranking among the top entries.
arXiv Detail & Related papers (2025-08-29T13:18:32Z)
- Ensemble of Pathology Foundation Models for MIDOG 2025 Track 2: Atypical Mitosis Classification [0.0]
We leveraged Pathology Foundation Models (PFMs) pre-trained on large histopathology datasets.
We incorporated ConvNeXt V2, a state-of-the-art convolutional neural network architecture, to complement PFMs.
We ensembled multiple PFMs to integrate complementary morphological insights, achieving balanced accuracy on the Preliminary Evaluation Phase dataset.
arXiv Detail & Related papers (2025-08-29T03:24:57Z)
- Mix, Align, Distil: Reliable Cross-Domain Atypical Mitosis Classification [5.484561603970499]
We present a simple training-time recipe for domain-robust AMF classification in MIDOG 2025 Task 2.
Our submission attains balanced accuracy of 0.8762, sensitivity of 0.8873, specificity of 0.8651, and ROC AUC of 0.9499.
arXiv Detail & Related papers (2025-08-28T13:04:55Z)
- A bag of tricks for real-time Mitotic Figure detection [0.0]
We build on the efficient RTMDet single stage object detector to achieve high inference speed suitable for clinical deployment.
We employ targeted, hard negative mining on necrotic and debris tissue to reduce false positives.
On the preliminary test set of the MItosis DOmain Generalization (MIDOG) 2025 challenge, our single-stage RTMDet-S based approach reaches an F1 of 0.81.
arXiv Detail & Related papers (2025-08-27T11:45:44Z)
- Multi-Attention Stacked Ensemble for Lung Cancer Detection in CT Scans [3.8121150313479655]
Three pretrained backbones are adapted with a custom classification head tailored to 96 x 96 pixel inputs.
A two-stage attention mechanism learns both model-wise and class-wise importance scores from logits.
Experiments on the LIDC-IDRI dataset demonstrate exceptional performance, achieving 98.09 accuracy and 0.9961 AUC.
arXiv Detail & Related papers (2025-07-27T11:03:07Z)
- Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning [51.525891360380285]
HDMIL is a hierarchical distillation multi-instance learning framework that achieves fast and accurate classification by eliminating irrelevant patches.
HDMIL consists of two key components: the dynamic multi-instance network (DMIN) and the lightweight instance pre-screening network (LIPN).
arXiv Detail & Related papers (2025-02-28T15:10:07Z)
- Is an Ultra Large Natural Image-Based Foundation Model Superior to a Retina-Specific Model for Detecting Ocular and Systemic Diseases? [15.146396276161937]
RETFound and DINOv2 models were evaluated for ocular disease detection and systemic disease prediction tasks.
RETFound achieved superior performance over all DINOv2 models in predicting heart failure, infarction, and ischaemic stroke.
arXiv Detail & Related papers (2025-02-10T09:31:39Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
- Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic Image Classification [61.656149405657246]
Domain adaptation is effective in image classification tasks where obtaining sufficient label data is challenging.
We propose a novel method, named SELDA, for stacking ensemble learning via extending three domain adaptation methods.
The experimental results using Age-Related Eye Disease Study (AREDS) benchmark ophthalmic dataset demonstrate the effectiveness of the proposed model.
arXiv Detail & Related papers (2022-09-27T14:19:00Z)
- Multi-Scale Input Strategies for Medulloblastoma Tumor Classification using Deep Transfer Learning [59.30734371401316]
Medulloblastoma is the most common malignant brain cancer among children.
CNN has shown promising results for MB subtype classification.
We study the impact of tile size and input strategy.
arXiv Detail & Related papers (2021-09-14T09:42:37Z)
- Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images is of great help in estimating intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z)
- Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were obtained in sub-fractures with the largest and richest dataset ever.
arXiv Detail & Related papers (2021-08-07T10:12:42Z)
- Explainability Guided Multi-Site COVID-19 CT Classification [79.4957965474334]
The limited number of supervised positive cases, the lack of region-based supervision, and the variability across acquisition sites are addressed.
Compared to the current state of the art, we obtain an increase of five percent in the F1 score on a site with a relatively high number of cases, and a gap twice as large for a site with much fewer training images.
arXiv Detail & Related papers (2021-03-25T08:56:08Z)
- Bilateral Asymmetry Guided Counterfactual Generating Network for Mammogram Classification [48.4619620405991]
Mammogram benign or malignant classification with only image-level labels is challenging due to the absence of lesion annotations.
Motivated by the symmetric prior, we explore a counterfactual question: how would the features have behaved if there were no lesions in the image?
We derive a new theoretical result for counterfactual generation based on the symmetric prior.
arXiv Detail & Related papers (2020-09-30T03:15:30Z)
- StyPath: Style-Transfer Data Augmentation For Robust Histology Image Classification [6.690876060631452]
We propose a novel pipeline to build robust deep neural networks for AMR classification based on StyPath.
Each image was generated in 1.84 ± 0.03 seconds using a single GTX TITAN V and PyTorch.
Our results imply that our style-transfer augmentation technique improves histological classification performance.
arXiv Detail & Related papers (2020-07-09T18:02:49Z)
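Several of the MIDOG 2025 entries above report balanced accuracy, which for a binary task is the mean of sensitivity and specificity. As a minimal sketch, the helper below recomputes it from the sensitivity and specificity quoted in the "Mix, Align, Distil" entry, and from raw confusion-matrix counts. The function names and the example counts are hypothetical and chosen only for illustration.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Binary balanced accuracy: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


def balanced_accuracy_from_counts(tp: int, fn: int, tn: int, fp: int) -> float:
    """Balanced accuracy computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive (atypical) class
    specificity = tn / (tn + fp)  # recall on the negative (normal) class
    return balanced_accuracy(sensitivity, specificity)


# Values quoted in the "Mix, Align, Distil" entry above.
reported = balanced_accuracy(0.8873, 0.8651)
print(round(reported, 4))  # 0.8762

# Hypothetical imbalanced counts (not from any paper): 100 positives, 1000 negatives.
tp, fn, tn, fp = 80, 20, 900, 100
plain_accuracy = (tp + tn) / (tp + fn + tn + fp)
print(round(plain_accuracy, 4))                                  # 0.8909
print(round(balanced_accuracy_from_counts(tp, fn, tn, fp), 4))   # 0.85
```

The recomputed 0.8762 agrees with the balanced accuracy that entry reports. The second example shows why the metric suits a low-prevalence class such as AMFs: plain accuracy is dominated by the majority class, while balanced accuracy weights both classes equally.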