Are Vision Foundation Models Foundational for Electron Microscopy Image Segmentation?
- URL: http://arxiv.org/abs/2602.08505v1
- Date: Mon, 09 Feb 2026 10:55:18 GMT
- Title: Are Vision Foundation Models Foundational for Electron Microscopy Image Segmentation?
- Authors: Caterina Fuster-Barceló, Virginie Uhlmann
- Abstract summary: We study the problem of mitochondria segmentation in electron microscopy (EM) images using two popular public datasets (Lucchi++ and VNC). We observe that training on a single EM dataset yields good segmentation performance (quantified as foreground Intersection-over-Union). Training on multiple EM datasets leads to severe performance degradation for all models considered.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although vision foundation models (VFMs) are increasingly reused for biomedical image analysis, it remains unclear whether the latent representations they provide are general enough to support effective transfer and reuse across heterogeneous microscopy image datasets. Here, we study this question for the problem of mitochondria segmentation in electron microscopy (EM) images, using two popular public EM datasets (Lucchi++ and VNC) and three recent representative VFMs (DINOv2, DINOv3, and OpenCLIP). We evaluate two practical model adaptation regimes: a frozen-backbone setting in which only a lightweight segmentation head is trained on top of the VFM, and parameter-efficient fine-tuning (PEFT) via Low-Rank Adaptation (LoRA) in which the VFM is fine-tuned in a targeted manner to a specific dataset. Across all backbones, we observe that training on a single EM dataset yields good segmentation performance (quantified as foreground Intersection-over-Union), and that LoRA consistently improves in-domain performance. In contrast, training on multiple EM datasets leads to severe performance degradation for all models considered, with only marginal gains from PEFT. Exploration of the latent representation space through various techniques (PCA, Fréchet Dinov2 distance, and linear probes) reveals a pronounced and persistent domain mismatch between the two considered EM datasets in spite of their visual similarity, which is consistent with the observed failure of paired training. These results suggest that, while VFMs can deliver competitive results for EM segmentation within a single domain under lightweight adaptation, current PEFT strategies are insufficient to obtain a single robust model across heterogeneous EM datasets without additional domain-alignment mechanisms.
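The Fréchet DINOv2 distance used in the latent-space analysis compares two datasets' feature distributions under a Gaussian assumption, in the same way as FID-style metrics. A minimal sketch of that computation, assuming features have already been extracted with a frozen backbone (the `frechet_distance` helper and its `(n_samples, dim)` inputs are illustrative, not the authors' code):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between two sets of backbone embeddings.

    Each input is an (n_samples, dim) array; each distribution is
    modelled as a Gaussian fitted to the embeddings.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # drop tiny imaginary residue from sqrtm
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```

A large value for visually similar datasets is exactly the kind of persistent domain mismatch the abstract describes.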
Related papers
- AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors [6.6016630449883955]
AnomalyVFM is a framework that turns any pretrained VFM into a strong zero-shot anomaly detector. Our approach combines a robust three-stage synthetic dataset generation scheme with a parameter-efficient adaptation mechanism. It achieves an average image-level AUROC of 94.1% across 9 diverse datasets, surpassing previous methods by a significant 3.3 percentage points.
arXiv Detail & Related papers (2026-01-28T12:02:58Z) - Data Efficiency and Transfer Robustness in Biomedical Image Segmentation: A Study of Redundancy and Forgetting with Cellpose [2.37771211295026]
Generalist biomedical image segmentation models such as Cellpose are increasingly applied across diverse imaging modalities and cell types. In this study, we conduct a systematic empirical analysis of these challenges using Cellpose as a case study. To assess data redundancy, we propose a simple dataset quantization (DQ) strategy for constructing compact yet diverse training subsets. To examine catastrophic forgetting, we perform cross-domain finetuning experiments and observe significant degradation in source-domain performance.
arXiv Detail & Related papers (2025-11-06T20:45:56Z) - Can Diffusion Models Bridge the Domain Gap in Cardiac MR Imaging? [2.0202525145391093]
We propose a diffusion model (DM) trained on a source domain that generates synthetic cardiac MR images resembling a given reference. The synthetic data maintains spatial and structural fidelity, ensuring similarity to the source domain and compatibility with the segmentation mask. We explore domain generalisation, where domain-invariant segmentation models are trained on synthetic source-domain data, and domain adaptation, where we shift target-domain data towards the source domain using the DM.
arXiv Detail & Related papers (2025-08-08T13:57:48Z) - Re-Visible Dual-Domain Self-Supervised Deep Unfolding Network for MRI Reconstruction [48.30341580103962]
We propose a novel re-visible dual-domain self-supervised deep unfolding network to address these issues. We design a deep unfolding network based on the Chambolle and Pock Proximal Point Algorithm (DUN-CP-PPA) to achieve end-to-end reconstruction. Experiments conducted on the fastMRI and IXI datasets demonstrate that our method significantly outperforms state-of-the-art approaches in terms of reconstruction performance.
arXiv Detail & Related papers (2025-01-07T12:29:32Z) - ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation [49.42525661521625]
This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation.
It is tested over a wide range of EM images, covering five segmentation tasks and 10 datasets.
arXiv Detail & Related papers (2024-08-26T08:59:22Z) - SAMDA: Leveraging SAM on Few-Shot Domain Adaptation for Electronic Microscopy Segmentation [3.7562258027956186]
We present a new few-shot domain adaptation framework SAMDA.
It combines the Segment Anything Model (SAM) with nnUNet in the embedding space to achieve high transferability and accuracy.
arXiv Detail & Related papers (2024-03-12T02:28:29Z) - Transferring Ultrahigh-Field Representations for Intensity-Guided Brain Segmentation of Low-Field Magnetic Resonance Imaging [51.92395928517429]
The use of 7T MRI is limited by its high cost and lower accessibility compared to low-field (LF) MRI.
This study proposes a deep-learning framework that fuses the input LF magnetic resonance feature representations with the inferred 7T-like feature representations for brain image segmentation tasks.
arXiv Detail & Related papers (2024-02-13T12:21:06Z) - Learning Multiscale Consistency for Self-supervised Electron Microscopy Instance Segmentation [48.267001230607306]
We propose a pretraining framework that enhances multiscale consistency in EM volumes.
Our approach leverages a Siamese network architecture, integrating strong and weak data augmentations.
It effectively captures voxel and feature consistency, showing promise for learning transferable representations for EM analysis.
arXiv Detail & Related papers (2023-08-19T05:49:13Z) - Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN consists of filtering the signals in order to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
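As a rough illustration of the idea (a simplified sketch, not the authors' implementation), the following estimates each signal's PSD via the FFT, forms the Wasserstein barycenter of the PSDs (which has a closed form for centered Gaussian signals: the squared mean of the square-root PSDs), and filters every signal so its spectrum matches the barycenter. The `cmmn_adapt` name and the fixed-length, one-channel-per-row input are assumptions for the example:

```python
import numpy as np

def cmmn_adapt(signals: np.ndarray) -> np.ndarray:
    """Simplified CMMN-style spectral normalisation.

    `signals` is an (n_signals, n_samples) array. Each row is filtered
    in the frequency domain so that its power spectral density matches
    the Wasserstein barycenter of all rows' PSDs.
    """
    spec = np.fft.rfft(signals, axis=1)
    psd = np.abs(spec) ** 2
    bary = np.mean(np.sqrt(psd), axis=0) ** 2        # Wasserstein barycenter of PSDs
    gain = np.sqrt(bary / np.maximum(psd, 1e-12))    # per-frequency mapping filter
    return np.fft.irfft(spec * gain, n=signals.shape[1], axis=1)
```

After adaptation, every row has (numerically) the same PSD, which is the normalisation effect the paper exploits for cross-subject sleep EEG.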
arXiv Detail & Related papers (2023-05-30T08:24:01Z) - Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme.
Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor.
The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z) - Cross-Domain Segmentation with Adversarial Loss and Covariate Shift for Biomedical Imaging [2.1204495827342438]
This manuscript aims to implement a novel model that can learn robust representations from cross-domain data by encapsulating distinct and shared patterns from different modalities.
The tests on CT and MRI liver data acquired in routine clinical trials show that the proposed model outperforms all other baselines by a large margin.
arXiv Detail & Related papers (2020-06-08T07:35:55Z) - Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-weighting [86.33696045574692]
We propose a Cycle Consistency Panoptic Domain Adaptive Mask R-CNN (CyC-PDAM) architecture for unsupervised nuclei segmentation in histopathology images.
We first propose a nuclei inpainting mechanism to remove the auxiliary generated objects in the synthesized images.
Secondly, a semantic branch with a domain discriminator is designed to achieve panoptic-level domain adaptation.
arXiv Detail & Related papers (2020-05-05T11:08:26Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.