Related papers: Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization

Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization

URL: http://arxiv.org/abs/2407.02900v1
Date: Wed, 3 Jul 2024 08:20:27 GMT
Title: Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization
Authors: Sebastian Doerrich, Francesco Di Salvo, Christian Ledig,
Abstract summary: We propose a novel generative method for domain generalization in histopathology images. Our method employs a generative, self-supervised Vision Transformer to dynamically extract characteristics of image patches. Experiments conducted on two distinct histopathology datasets demonstrate the effectiveness of our proposed approach.
Score: 0.13108652488669734
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite notable advancements, the integration of deep learning (DL) techniques into impactful clinical applications, particularly in the realm of digital histopathology, has been hindered by challenges associated with achieving robust generalization across diverse imaging domains and characteristics. Traditional mitigation strategies in this field such as data augmentation and stain color normalization have proven insufficient in addressing this limitation, necessitating the exploration of alternative methodologies. To this end, we propose a novel generative method for domain generalization in histopathology images. Our method employs a generative, self-supervised Vision Transformer to dynamically extract characteristics of image patches and seamlessly infuse them into the original images, thereby creating novel, synthetic images with diverse attributes. By enriching the dataset with such synthesized images, we aim to enhance its holistic nature, facilitating improved generalization of DL models to unseen domains. Extensive experiments conducted on two distinct histopathology datasets demonstrate the effectiveness of our proposed approach, outperforming the state of the art substantially, on the Camelyon17-wilds challenge dataset (+2%) and on a second epithelium-stroma dataset (+26%). Furthermore, we emphasize our method's ability to readily scale with increasingly available unlabeled data samples and more complex, higher parametric architectures. Source code is available at https://github.com/sdoerrich97/vits-are-generative-models .

Related papers

Unleashing the Potential of Synthetic Images: A Study on Histopathology Image Classification [0.12499537119440242]
Histopathology image classification is crucial for the accurate identification and diagnosis of various diseases. We show that synthetic images can effectively augment existing datasets, ultimately improving the performance of the downstream histopathology image classification task.
arXiv Detail & Related papers (2024-09-24T12:02:55Z)
A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance. We propose a simple yet effective data augmentation approach by leveraging advancements in generative models. Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability. We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images. Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models. In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques. We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
Generalizing Across Domains in Diabetic Retinopathy via Variational Autoencoders [0.0]
Domain generalization for Diabetic Retinopathy classification allows a model to adeptly classify retinal images. In this study, we explore the inherent capacity of variational autoencoders to disentangle the latent space of fundus images.
arXiv Detail & Related papers (2023-09-20T13:29:22Z)
ViT-DAE: Transformer-driven Diffusion Autoencoder for Histopathology Image Analysis [4.724009208755395]
We present ViT-DAE, which integrates vision transformers (ViT) and diffusion autoencoders for high-quality histopathology image synthesis. Our approach outperforms recent GAN-based and vanilla DAE methods in generating realistic images.
arXiv Detail & Related papers (2023-04-03T15:00:06Z)
Unsupervised Domain Transfer with Conditional Invertible Neural Networks [83.90291882730925]
We propose a domain transfer approach based on conditional invertible neural networks (cINNs) Our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood. Our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks.
arXiv Detail & Related papers (2023-03-17T18:00:27Z)
AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images. AMIGO uses the celluar graph within the tissue to provide a single representation for a patient. We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
Unified Framework for Histopathology Image Augmentation and Classification via Generative Models [6.404713841079193]
We propose an innovative unified framework that integrates the data generation and model training stages into a unified process. Our approach utilizes a pure Vision Transformer (ViT)-based conditional Generative Adversarial Network (cGAN) model to simultaneously handle both image synthesis and classification. Our experiments show that our unified synthetic augmentation framework consistently enhances the performance of histopathology image classification models.
arXiv Detail & Related papers (2022-12-20T03:40:44Z)
Histopathology DatasetGAN: Synthesizing Large-Resolution Histopathology Datasets [0.0]
Histopathology datasetGAN (HDGAN) is a framework for image generation and segmentation that scales well to large-resolution histopathology images. We make several adaptations from the original framework, including updating the generative backbone, selectively extracting latent features from the generator, and switching to memory-mapped arrays. We evaluate HDGAN on a thrombotic microangiopathy high-resolution tile dataset, demonstrating strong performance on the high-resolution image-annotation generation task.
arXiv Detail & Related papers (2022-07-06T14:33:50Z)
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape. The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.