Controllable Latent Space Augmentation for Digital Pathology
- URL: http://arxiv.org/abs/2508.14588v1
- Date: Wed, 20 Aug 2025 10:11:48 GMT
- Title: Controllable Latent Space Augmentation for Digital Pathology
- Authors: Sofiène Boutaj, Marin Scalbert, Pierre Marza, Florent Couzinie-Devy, Maria Vakalopoulou, Stergios Christodoulidis
- Abstract summary: HistAug is a fast and efficient generative model for controllable augmentations in the latent space for digital pathology. Our method allows the processing of a large number of patches in a single forward pass efficiently.
- Score: 2.2062051154292157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Whole slide image (WSI) analysis in digital pathology presents unique challenges due to the gigapixel resolution of WSIs and the scarcity of dense supervision signals. While Multiple Instance Learning (MIL) is a natural fit for slide-level tasks, training robust models requires large and diverse datasets. Even though image augmentation techniques could be utilized to increase data variability and reduce overfitting, implementing them effectively is not a trivial task. Traditional patch-level augmentation is prohibitively expensive due to the large number of patches extracted from each WSI, and existing feature-level augmentation methods lack control over transformation semantics. We introduce HistAug, a fast and efficient generative model for controllable augmentations in the latent space for digital pathology. By conditioning on explicit patch-level transformations (e.g., hue, erosion), HistAug generates realistic augmented embeddings while preserving initial semantic information. Our method allows the processing of a large number of patches in a single forward pass efficiently, while at the same time consistently improving MIL model performance. Experiments across multiple slide-level tasks and diverse organs show that HistAug outperforms existing methods, particularly in low-data regimes. Ablation studies confirm the benefits of learned transformations over noise-based perturbations and highlight the importance of uniform WSI-wise augmentation. Code is available at https://github.com/MICS-Lab/HistAug.
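The distinction the abstract draws between uniform WSI-wise augmentation and per-patch augmentation can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: `sample_transform`, `toy_generator`, and `augment_bag` are made-up names, the parameter ranges are arbitrary, and the toy generator is a simple shift standing in for a learned conditional model. It is not the HistAug architecture or its API; see the authors' repository for the actual implementation.

```python
import random


def sample_transform(rng):
    """Sample hypothetical transformation parameters (ranges are assumptions)."""
    return {"hue": rng.uniform(-0.5, 0.5), "erosion": rng.uniform(0.0, 1.0)}


def toy_generator(embedding, params):
    """Toy stand-in for a learned conditional generator (NOT the HistAug model).

    Shifts every coordinate of the embedding by an amount that depends
    on the transformation parameters it is conditioned on.
    """
    shift = 0.1 * params["hue"] - 0.05 * params["erosion"]
    return [x + shift for x in embedding]


def augment_bag(bag, rng, wsi_wise=True):
    """Augment all patch embeddings of one slide (a MIL "bag").

    wsi_wise=True:  one set of parameters is shared by every patch in the slide.
    wsi_wise=False: each patch gets independently sampled parameters.
    """
    shared = sample_transform(rng) if wsi_wise else None
    augmented = []
    for embedding in bag:
        params = shared if wsi_wise else sample_transform(rng)
        augmented.append(toy_generator(embedding, params))
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)
    # Three patches with 4-dimensional embeddings, standing in for encoder output.
    bag = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
    uniform = augment_bag(bag, rng, wsi_wise=True)
    independent = augment_bag(bag, rng, wsi_wise=False)
    print("WSI-wise shift per patch:   ", [u[0] - b[0] for u, b in zip(uniform, bag)])
    print("Patch-wise shift per patch: ", [i[0] - b[0] for i, b in zip(independent, bag)])
```

With `wsi_wise=True` every patch in the slide receives the same transformation, which is the setting the abstract's ablation highlights as important; with `wsi_wise=False` the parameters vary from patch to patch.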
Related papers
- Granular-ball Guided Masking: Structure-aware Data Augmentation [97.18560547134587]
Granular-ball Guided Masking (GBGM) is a structure-aware augmentation strategy guided by Granular-ball Computing (GBC). GBGM adaptively preserves semantically rich, structurally important regions while suppressing redundant areas through a coarse-to-fine hierarchical masking process. Experiments on multiple benchmarks demonstrate consistent improvements in classification accuracy and masked image reconstruction.
arXiv Detail & Related papers (2025-12-24T07:15:33Z)
- Reducing Variability of Multiple Instance Learning Methods for Digital Pathology [2.9284034606635267]
Digital pathology has revolutionized the field by enabling the digitization of tissue samples into whole slide images (WSIs). WSIs are often divided into smaller patches with a global label. MIL methods have emerged as a suitable solution for WSI classification.
arXiv Detail & Related papers (2025-06-30T22:10:24Z)
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training [54.074020740827855]
We find that SSHPE can be boosted from two cores: advanced data augmentations and concise consistency training ways. This simple and compact design is interpretable, and easily benefits from newly found augmentations. We extensively validate the superiority and versatility of our approach on conventional human body images, overhead fisheye images, and human hand images.
arXiv Detail & Related papers (2024-02-18T12:27:59Z)
- A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z)
- Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification [10.243293283318415]
Multiple Instance Learning (MIL) has shown promising results in digital Pathology Whole Slide Image (WSI) classification.
We propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory.
Our framework is evaluated on five pathology WSI datasets on various WSI heads.
arXiv Detail & Related papers (2023-03-15T08:41:57Z)
- AugDiff: Diffusion based Feature Augmentation for Multiple Instance Learning in Whole Slide Image [15.180437840817788]
Multiple Instance Learning (MIL), a powerful strategy for weakly supervised learning, is able to perform various prediction tasks on gigapixel Whole Slide Images (WSIs).
We introduce the Diffusion Model (DM) into MIL for the first time and propose a feature augmentation framework called AugDiff.
We conduct extensive experiments over three distinct cancer datasets, two different feature extractors, and three prevalent MIL algorithms to evaluate the performance of AugDiff.
arXiv Detail & Related papers (2023-03-11T10:36:27Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- Local Magnification for Data and Feature Augmentation [53.04028225837681]
We propose an easy-to-implement and model-free data augmentation method called Local Magnification (LOMA).
LOMA generates additional training data by randomly magnifying a local area of the image.
Experiments show that our proposed LOMA, though straightforward, can be combined with standard data augmentation to significantly improve the performance on image classification and object detection.
arXiv Detail & Related papers (2022-11-15T02:51:59Z)
- Embedding Space Augmentation for Weakly Supervised Learning in Whole-Slide Images [3.858809922365453]
Multiple Instance Learning (MIL) is a widely employed framework for learning on gigapixel whole-slide images (WSIs) from WSI-level annotations.
We present EmbAugmenter, a data augmentation generative adversarial network (DA-GAN) that can synthesize data augmentations in the embedding space rather than in the pixel space.
Our approach outperforms MIL without augmentation and is on par with traditional patch-level augmentation for MIL training while being substantially faster.
arXiv Detail & Related papers (2022-10-31T02:06:39Z)
- Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST).
CST embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.