Learning from the Right Patches: A Two-Stage Wavelet-Driven Masked Autoencoder for Histopathology Representation Learning
- URL: http://arxiv.org/abs/2511.06958v2
- Date: Wed, 19 Nov 2025 09:42:12 GMT
- Title: Learning from the Right Patches: A Two-Stage Wavelet-Driven Masked Autoencoder for Histopathology Representation Learning
- Authors: Raneen Younis, Louay Hamdi, Lukas Chavez, Zahra Ahmadi
- Abstract summary: Whole-slide images are central to digital pathology, yet their extreme size and scarce annotations make self-supervised learning essential. We present a framework that brings structure and biological relevance into MAE-based learning through a wavelet-informed patch selection strategy. We show that WISE-MAE achieves competitive representation quality and downstream classification performance while maintaining efficiency under weak supervision.
- Score: 4.281508114645598
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Whole-slide images are central to digital pathology, yet their extreme size and scarce annotations make self-supervised learning essential. Masked Autoencoders (MAEs) with Vision Transformer backbones have recently shown strong potential for histopathology representation learning. However, conventional random patch sampling during MAE pretraining often includes irrelevant or noisy regions, limiting the model's ability to capture meaningful tissue patterns. In this paper, we present a lightweight and domain-adapted framework that brings structure and biological relevance into MAE-based learning through a wavelet-informed patch selection strategy. WISE-MAE applies a two-step coarse-to-fine process: wavelet-based screening at low magnification to locate structurally rich regions, followed by high-resolution extraction for detailed modeling. This approach mirrors the diagnostic workflow of pathologists and improves the quality of learned representations. Evaluations across multiple cancer datasets, including lung, renal, and colorectal tissues, show that WISE-MAE achieves competitive representation quality and downstream classification performance while maintaining efficiency under weak supervision.
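
The two-step coarse-to-fine selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single-level Haar transform, the tile size, the top-k rule, and the `scale` factor between magnifications are all assumptions chosen to keep the example self-contained. It scores each tile of a low-magnification grayscale thumbnail by its wavelet detail energy, then maps the highest-scoring tiles to high-resolution coordinates.

```python
# Illustrative sketch only: names, tile size, and scoring rule are assumptions,
# not the WISE-MAE implementation.

def haar_detail_energy(tile):
    """Mean squared single-level 2D Haar detail coefficients of a grayscale tile.

    `tile` is a list of equal-length rows with even height and width.
    """
    energy = 0.0
    count = 0
    for r in range(0, len(tile) - 1, 2):
        for c in range(0, len(tile[0]) - 1, 2):
            a, b = tile[r][c], tile[r][c + 1]
            d, e = tile[r + 1][c], tile[r + 1][c + 1]
            row_diff = (a + b - d - e) / 2.0  # top row minus bottom row
            col_diff = (a - b + d - e) / 2.0  # left column minus right column
            diag = (a - b - d + e) / 2.0      # diagonal detail
            energy += row_diff ** 2 + col_diff ** 2 + diag ** 2
            count += 1
    return energy / count if count else 0.0


def select_patches(thumbnail, tile=4, top_k=2, scale=16):
    """Stage 1: rank low-magnification tiles by detail energy.
    Stage 2: return (x, y, size) boxes of the top_k tiles in
    high-resolution coordinates, assuming a fixed downsample `scale`."""
    height, width = len(thumbnail), len(thumbnail[0])
    scored = []
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            block = [row[x:x + tile] for row in thumbnail[y:y + tile]]
            scored.append((haar_detail_energy(block), x, y))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(x * scale, y * scale, tile * scale) for _, x, y in scored[:top_k]]


# Toy thumbnail: blank background with one textured tile in the lower right.
thumbnail = [[255] * 8 for _ in range(8)]
for y in range(4, 8):
    for x in range(4, 8):
        thumbnail[y][x] = 255 if (x + y) % 2 == 0 else 0

boxes = select_patches(thumbnail, tile=4, top_k=1, scale=16)
print(boxes)  # [(64, 64, 64)]
```

In practice the thumbnail would come from a low-magnification level of the slide and the returned boxes would be read from the full-resolution level. A wavelet library and a multi-level decomposition could replace the hand-written Haar step.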
Related papers
- A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping [1.927195358774599]
Pretraining on large-scale, in-domain datasets grants histopathology foundation models (FM) the ability to learn task-agnostic data representations. In computational pathology, automated whole slide image analysis requires multiple instance learning (MIL) frameworks due to the gigapixel scale of the slides. Our work presents a novel benchmark for evaluating histopathology FMs as patch-level feature extractors within a MIL classification framework.
arXiv Detail & Related papers (2025-06-23T14:12:16Z) - PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images [7.048241543461529]
We propose a novel framework called Multi-Resolution Prompt-guided Hybrid Embedding (MR-PHE) to address these challenges in zero-shot histopathology image classification. We introduce a hybrid embedding strategy that integrates global image embeddings with weighted patch embeddings. A similarity-based patch weighting mechanism assigns attention-like weights to patches based on their relevance to class embeddings.
arXiv Detail & Related papers (2025-03-13T12:18:37Z) - Pathological Prior-Guided Multiple Instance Learning For Mitigating Catastrophic Forgetting in Breast Cancer Whole Slide Image Classification [50.899861205016265]
We propose a new framework PaGMIL to mitigate catastrophic forgetting in breast cancer WSI classification. Our framework introduces two key components into the common MIL model architecture. We evaluate the continual learning performance of PaGMIL across several public breast cancer datasets.
arXiv Detail & Related papers (2025-03-08T04:51:58Z) - Enhancing Weakly-Supervised Histopathology Image Segmentation with Knowledge Distillation on MIL-Based Pseudo-Labels [8.934328206473456]
We propose a novel distillation framework for histopathology image segmentation.
This framework introduces an iterative fusion-knowledge distillation strategy, enabling the student model to learn directly from the teacher's comprehensive outcomes.
arXiv Detail & Related papers (2024-07-14T17:15:47Z) - Enhancing AI Diagnostics: Autonomous Lesion Masking via Semi-Supervised Deep Learning [1.4053129774629076]
This study presents an unsupervised domain adaptation method aimed at autonomously generating image masks outlining regions of interest (ROIs) for differentiating breast lesions in breast ultrasound (US) imaging.
Our semi-supervised learning approach utilizes a primitive model trained on a small public breast US dataset with true annotations.
This model is then iteratively refined for the domain adaptation task, generating pseudo-masks for our private, unannotated breast US dataset.
arXiv Detail & Related papers (2024-04-18T18:25:00Z) - Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning [48.02011627390706]
We train an attention-based MIL and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation.
With a novel attention guiding loss, this leads to an accuracy boost of the trained models with few regions annotated for each class.
It may in the future serve as an important contribution to train MIL models in the clinically relevant context of cancer classification in histopathology.
arXiv Detail & Related papers (2023-03-02T15:18:58Z) - Orientation-Shared Convolution Representation for CT Metal Artifact Learning [63.67718355820655]
During X-ray computed tomography (CT) scanning, metallic implants carried by patients often lead to adverse artifacts.
Existing deep-learning-based methods have gained promising reconstruction performance.
We propose an orientation-shared convolution representation strategy to adapt the physical prior structures of artifacts.
arXiv Detail & Related papers (2022-12-26T13:56:12Z) - Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners [11.162001837248166]
We propose a self-supervised learning model, the global contrast-masked autoencoder (GCMAE), which can train the encoder to have the ability to represent local-global features of pathological images.
The ability of the GCMAE to learn migratable representations was demonstrated through extensive experiments using a total of three different disease-specific hematoxylin and eosin (HE)-stained pathology datasets.
arXiv Detail & Related papers (2022-05-18T16:28:56Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)