AugDiff: Diffusion based Feature Augmentation for Multiple Instance
Learning in Whole Slide Image
- URL: http://arxiv.org/abs/2303.06371v1
- Date: Sat, 11 Mar 2023 10:36:27 GMT
- Authors: Zhuchen Shao, Liuxi Dai, Yifeng Wang, Haoqian Wang, Yongbing Zhang
- Abstract summary: Multiple Instance Learning (MIL), a powerful strategy for weakly supervised learning, is able to perform various prediction tasks on gigapixel Whole Slide Images (WSIs).
We introduce the Diffusion Model (DM) into MIL for the first time and propose a feature augmentation framework called AugDiff.
We conduct extensive experiments over three distinct cancer datasets, two different feature extractors, and three prevalent MIL algorithms to evaluate the performance of AugDiff.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multiple Instance Learning (MIL), a powerful strategy for weakly supervised
learning, is able to perform various prediction tasks on gigapixel Whole Slide
Images (WSIs). However, the tens of thousands of patches in WSIs usually incur
a vast computational burden for image augmentation, limiting the MIL model's
improvement in performance. Currently, feature augmentation-based MIL
frameworks are a promising solution, but existing methods such as Mixup often
produce unrealistic features. To explore a more efficient and practical
augmentation method, we introduce the Diffusion Model (DM) into MIL for the
first time and propose a feature augmentation framework called AugDiff.
Specifically, we employ the generation diversity of DM to improve the quality
of feature augmentation and the step-by-step generation property to control the
retention of semantic information. We conduct extensive experiments over three
distinct cancer datasets, two different feature extractors, and three prevalent
MIL algorithms to evaluate the performance of AugDiff. Ablation studies and
visualizations further verify its effectiveness. Moreover, we highlight
AugDiff's higher-quality augmented features compared with image augmentation
and its superiority over self-supervised learning. Its generalization to
external datasets indicates broader applicability.
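The abstract's core mechanism, partially noising instance features with a diffusion forward process and then denoising them back, where the number of noising steps trades semantic retention against generation diversity, can be sketched roughly as follows. Note this is a minimal illustration: the noise schedule, step counts, and the `denoise_fn` callable (a stand-in for a trained denoising network) are assumptions, not AugDiff's actual implementation.

```python
import numpy as np

def augment_features(features, denoise_fn, k_steps=5, total_steps=50, rng=None):
    """Diffusion-style feature augmentation sketch: noise instance features
    for k of T forward steps, then denoise back. A small k retains more of
    the original semantics; a larger k yields more diverse augmentations."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Illustrative linear beta schedule, as in standard DDPM formulations
    betas = np.linspace(1e-4, 0.02, total_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    # Forward process: jump directly to step k via the closed form q(x_k | x_0)
    noise = rng.standard_normal(features.shape)
    x = (np.sqrt(alpha_bar[k_steps - 1]) * features
         + np.sqrt(1.0 - alpha_bar[k_steps - 1]) * noise)
    # Reverse process: step back to x_0, delegated to the (hypothetical) denoiser
    for t in range(k_steps - 1, -1, -1):
        x = denoise_fn(x, t)
    return x
```

With a trained denoiser, the returned features stay close to the originals for small `k_steps` while still differing from them, which is the semantics-preserving diversity the abstract describes.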
Related papers
- LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving [52.83707400688378]
LargeAD is a versatile and scalable framework designed for large-scale 3D pretraining across diverse real-world driving datasets.
Our framework leverages VFMs to extract semantically rich superpixels from 2D images, which are aligned with LiDAR point clouds to generate high-quality contrastive samples.
Our approach delivers significant performance improvements over state-of-the-art methods in both linear probing and fine-tuning tasks for both LiDAR-based segmentation and object detection.
arXiv Detail & Related papers (2025-01-07T18:59:59Z) - SGIA: Enhancing Fine-Grained Visual Classification with Sequence Generative Image Augmentation [16.642582574494742]
We introduce Sequence Generative Image Augmentation (SGIA) for augmenting Fine-Grained Visual Classification (FGVC) datasets.
Our method features a unique Bridging Transfer Learning process, designed to minimize the domain gap between real and synthetically augmented data.
Our work sets a new benchmark and outperforms the previous state-of-the-art models in classification accuracy by 0.5% for the CUB-200-2011 dataset.
arXiv Detail & Related papers (2024-12-09T01:39:46Z) - MAT: Multi-Range Attention Transformer for Efficient Image Super-Resolution [14.265237560766268]
A flexible integration of attention across diverse spatial extents can yield significant performance enhancements.
We introduce Multi-Range Attention Transformer (MAT) tailored for Super Resolution (SR) tasks.
MAT adeptly captures dependencies across various spatial ranges, improving the diversity and efficacy of its feature representations.
arXiv Detail & Related papers (2024-11-26T08:30:31Z) - Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look [28.350278251132078]
We propose a unified framework to conduct data augmentation in the feature space, known as feature augmentation.
This strategy is domain-agnostic, which augments similar features to the original ones and thus improves the data diversity.
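A common way to realize feature-space augmentation of this kind is to perturb an embedding and interpolate it toward a semantically similar one, producing features close to the original. The function below is a hedged sketch of that idea; the function name, defaults, and the interpolation-plus-noise recipe are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def feature_augment(z, z_pos, noise_std=0.1, mix_low=0.7, rng=None):
    """Feature-space augmentation sketch: mix an embedding with a positive
    view's embedding and add Gaussian noise, keeping the result near z."""
    if rng is None:
        rng = np.random.default_rng(0)
    lam = rng.uniform(mix_low, 1.0)       # high lambda keeps z dominant
    mixed = lam * z + (1.0 - lam) * z_pos  # interpolation in feature space
    noisy = mixed + noise_std * rng.standard_normal(z.shape)
    return noisy / np.linalg.norm(noisy)   # re-normalize for cosine-based losses
```

Because the augmentation operates on embeddings rather than pixels, it is cheap and domain-agnostic, which is the point the summary above makes.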
arXiv Detail & Related papers (2024-10-16T09:25:11Z) - MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct [148.39859547619156]
We propose MMEvol, a novel multimodal instruction data evolution framework.
MMEvol iteratively improves data quality through a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution.
Our approach reaches state-of-the-art (SOTA) performance on nine tasks while using significantly less data than prior state-of-the-art models.
arXiv Detail & Related papers (2024-09-09T17:44:00Z) - A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
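The background-augmentation step reduces to a mask-based composite: keep the foreground object and swap in a synthesized background. The sketch below illustrates that composite; `gen_background` is a hypothetical stand-in for a diffusion-model sampler, not the paper's actual pipeline.

```python
import numpy as np

def background_augment(img, fg_mask, gen_background):
    """Background augmentation sketch: given a boolean foreground mask,
    retain foreground pixels and replace the background with a generated
    image of the same shape (gen_background is a placeholder sampler)."""
    bg = gen_background(img.shape)
    # Broadcast a 2-D mask over channels for color images
    mask = fg_mask[..., None] if img.ndim == 3 else fg_mask
    return np.where(mask, img, bg)
```

Varying the generated background while holding the object fixed is what drives the robustness gains the summary describes.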
arXiv Detail & Related papers (2024-08-01T07:40:00Z) - RangeAugment: Efficient Online Augmentation with Range Learning [54.61514286212455]
RangeAugment efficiently learns the range of magnitudes for individual as well as composite augmentation operations.
We show that RangeAugment achieves competitive performance to state-of-the-art automatic augmentation methods with 4-5 times fewer augmentation operations.
arXiv Detail & Related papers (2022-12-20T18:55:54Z) - Local Magnification for Data and Feature Augmentation [53.04028225837681]
We propose an easy-to-implement and model-free data augmentation method called Local Magnification (LOMA)
LOMA generates additional training data by randomly magnifying a local area of the image.
Experiments show that our proposed LOMA, though straightforward, can be combined with standard data augmentation to significantly improve the performance on image classification and object detection.
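The local-magnification idea can be sketched as resampling a window so that pixels are pulled from closer to a chosen center, effectively zooming that region in place. This nearest-neighbor version is a minimal illustration; the paper's exact warp and sampling scheme may differ.

```python
import numpy as np

def local_magnify(img, center, radius, scale=1.5):
    """LOMA-style augmentation sketch: magnify a local square window
    around `center` by `scale`, leaving the rest of the image intact."""
    h, w = img.shape[:2]
    out = img.copy()
    cy, cx = center
    for y in range(max(0, cy - radius), min(h, cy + radius)):
        for x in range(max(0, cx - radius), min(w, cx + radius)):
            # Pull the source pixel from nearer the center => magnification
            sy = int(round(cy + (y - cy) / scale))
            sx = int(round(cx + (x - cx) / scale))
            out[y, x] = img[np.clip(sy, 0, h - 1), np.clip(sx, 0, w - 1)]
    return out
```

Randomizing `center`, `radius`, and `scale` per sample yields the model-free data augmentation the summary describes, and it composes naturally with standard augmentations.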
arXiv Detail & Related papers (2022-11-15T02:51:59Z) - Learning Representational Invariances for Data-Efficient Action Recognition [52.23716087656834]
We show that our data augmentation strategy leads to promising performance on the Kinetics-100, UCF-101, and HMDB-51 datasets.
We also validate our data augmentation strategy in the fully supervised setting and demonstrate improved performance.
arXiv Detail & Related papers (2021-03-30T17:59:49Z) - MA3: Model Agnostic Adversarial Augmentation for Few Shot learning [5.854757988966379]
In this paper, we explore the domain of few-shot learning with a novel augmentation technique.
Our technique is fully differentiable, which enables its extension to versatile datasets and base models.
We obtain an improvement of nearly 4% by adding our augmentation module without making any change in network architectures.
arXiv Detail & Related papers (2020-04-10T16:35:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.