Related papers: MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation

URL: http://arxiv.org/abs/2510.09121v2
Date: Mon, 20 Oct 2025 09:26:24 GMT
Title: MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation
Authors: Dominik Winter, Mai Bui, Monica Azqueta Gavaldon, Nicolas Triltsch, Marco Rosati, Nicolas Brieu
Abstract summary: We introduce a Multimodal Semantic Diffusion Model for generating pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process with cellular/nuclear morphologies, MSDM generates datasets with desired morphological properties. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models.
Score: 0.3650448386461648
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Scarcity of annotated data, particularly for rare or atypical morphologies, presents significant challenges for cell and nuclei segmentation in computational pathology. While manual annotation is labor-intensive and costly, synthetic data offers a cost-effective alternative. We introduce a Multimodal Semantic Diffusion Model (MSDM) for generating realistic pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process with cellular/nuclear morphologies (using horizontal and vertical maps), RGB color characteristics, and BERT-encoded assay/indication metadata, MSDM generates datasets with desired morphological properties. These heterogeneous modalities are integrated via multi-head cross-attention, enabling fine-grained control over the generated images. Quantitative analysis demonstrates that synthetic images closely match real data, with low Wasserstein distances between embeddings of generated and real images under matching biological conditions. The incorporation of these synthetic samples, exemplified by columnar cells, significantly improves segmentation model accuracy on columnar cells. This strategy systematically enriches datasets, directly targeting model deficiencies. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models. Thereby, we pave the way for broader application of generative models in computational pathology.
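The abstract mentions conditioning on "horizontal and vertical maps" but does not define them. As a rough illustration only, the sketch below assumes a simplified HoVer-Net-style definition: for every pixel of a cell or nucleus instance, the horizontal (vertical) map stores the signed offset to that instance's centroid, scaled to [-1, 1]. The function name `hv_maps`, the list-of-rows mask format, and the normalization by the largest absolute offset are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def hv_maps(instance_mask):
    """Build simplified horizontal/vertical maps from an instance mask.

    instance_mask: list of rows of ints, 0 = background, k > 0 = instance id.
    Returns (h_map, v_map) as lists of rows of floats in [-1, 1].
    Assumed definition (hypothetical, HoVer-Net-like): signed offset of each
    pixel to its instance centroid, divided by the largest absolute offset
    within that instance.
    """
    height = len(instance_mask)
    width = len(instance_mask[0]) if height else 0

    # Group pixel coordinates by instance id, skipping background.
    pixels = defaultdict(list)
    for y, row in enumerate(instance_mask):
        for x, label in enumerate(row):
            if label != 0:
                pixels[label].append((y, x))

    h_map = [[0.0] * width for _ in range(height)]
    v_map = [[0.0] * width for _ in range(height)]

    for coords in pixels.values():
        # Centroid of the instance.
        cy = sum(y for y, _ in coords) / len(coords)
        cx = sum(x for _, x in coords) / len(coords)
        # Largest absolute offsets; fall back to 1.0 for one-pixel-wide shapes.
        max_dx = max(abs(x - cx) for _, x in coords) or 1.0
        max_dy = max(abs(y - cy) for y, _ in coords) or 1.0
        for y, x in coords:
            h_map[y][x] = (x - cx) / max_dx
            v_map[y][x] = (y - cy) / max_dy

    return h_map, v_map


# Toy mask with a single 2x3 instance.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
h, v = hv_maps(mask)
```

In this toy case the horizontal map runs from -1 at the left edge of the instance to 1 at its right edge, and the vertical map does the same from top to bottom. Maps of this kind encode instance shape and boundaries in a dense form that a generative model could be conditioned on.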

Related papers

SPATIA: Multimodal Model for Prediction and Generation of Spatial Cell Phenotypes [39.45743286683448]
We introduce SPATIA, a multi-scale generative and predictive model for spatial transcriptomics. SPATIA learns cell-level embeddings by fusing image-derived morphological tokens and transcriptomic vector tokens. We benchmark SPATIA against 13 existing models across 12 individual tasks.
arXiv Detail & Related papers (2025-07-07T06:54:02Z)
PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z)
HistoSmith: Single-Stage Histology Image-Label Generation via Conditional Latent Diffusion for Enhanced Cell Segmentation and Classification [0.19791587637442667]
This study introduces a novel single-stage approach for generating image-label pairs to augment histology datasets. Unlike state-of-the-art methods that utilize diffusion models with separate components for label and image generation, our approach employs a latent diffusion model. This model enables tailored data generation by conditioning on user-defined parameters such as cell types, quantities, and tissue types.
arXiv Detail & Related papers (2025-02-12T19:51:41Z)
MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize data for training segmentation models for underrepresented modalities. We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z)
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen [76.02070962797794]
This work introduces CellFlow for Generation (CFGen), a flow-based conditional generative model that preserves the inherent discreteness of single-cell data. CFGen generates whole-genome multi-modal single-cell data reliably, improving the recovery of crucial biological data characteristics.
arXiv Detail & Related papers (2024-07-16T14:05:03Z)
Practical Guidelines for Cell Segmentation Models Under Optical Aberrations in Microscopy [14.042884268397058]
This study evaluates cell image segmentation models under optical aberrations from fluorescence and bright field microscopy. We train and test several segmentation models, including the Otsu threshold method and Mask R-CNN with different network heads. In contrast, Cellpose 2.0 proves effective for complex cell images under similar conditions.
arXiv Detail & Related papers (2024-04-12T15:45:26Z)
Mixed Models with Multiple Instance Learning [51.440557223100164]
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL). Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
arXiv Detail & Related papers (2023-11-04T16:42:42Z)
NASDM: Nuclei-Aware Semantic Histopathology Image Generation Using Diffusion Models [3.2996723916635267]
First-of-its-kind nuclei-aware semantic tissue generation framework (NASDM). NASDM can synthesize realistic tissue samples given a semantic instance mask of up to six different nuclei types. These synthetic images are useful in applications in pathology, validation of models, and supplementation of existing nuclei segmentation datasets.
arXiv Detail & Related papers (2023-03-20T22:16:03Z)
AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images. AMIGO uses the cellular graph within the tissue to provide a single representation for a patient. We show that our model is strongly robust to missing information, to the extent that it can achieve the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
Semi-Supervised Segmentation of Mitochondria from Electron Microscopy Images Using Spatial Continuity [3.631638087834872]
We propose a semi-supervised deep learning model that segments mitochondria by leveraging the spatial continuity of their structural, morphological, and contextual information. Our model achieves performance similar to that of state-of-the-art fully supervised models but requires only 20% of their annotated training data.
arXiv Detail & Related papers (2022-06-06T06:52:19Z)
Enforcing Morphological Information in Fully Convolutional Networks to Improve Cell Instance Segmentation in Fluorescence Microscopy Images [1.408123603417833]
We propose a novel cell instance segmentation approach based on the well-known U-Net architecture. To enforce the learning of morphological information per pixel, a deep distance transformer (DDT) acts as a back-bone model. The obtained results suggest a performance boost over traditional U-Net architectures.
arXiv Detail & Related papers (2021-06-10T15:54:38Z)
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape. The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.