Self-Supervised Vision Transformers Learn Visual Concepts in
Histopathology
- URL: http://arxiv.org/abs/2203.00585v1
- Date: Tue, 1 Mar 2022 16:14:41 GMT
- Title: Self-Supervised Vision Transformers Learn Visual Concepts in
Histopathology
- Authors: Richard J. Chen, Rahul G. Krishnan
- Abstract summary: We conduct a search for good representations in pathology by training a variety of self-supervised models with validation on a variety of weakly-supervised and patch-level tasks.
Our key finding is in discovering that Vision Transformers using DINO-based knowledge distillation are able to learn data-efficient and interpretable features in histology images.
- Score: 5.164102666113966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tissue phenotyping is a fundamental task in learning objective
characterizations of histopathologic biomarkers within the tumor-immune
microenvironment in cancer pathology. However, whole-slide imaging (WSI) is a
complex computer vision problem in which: 1) WSIs have enormous image
resolutions, which precludes large-scale pixel-level efforts in data curation,
and 2) the diversity of morphological phenotypes results in inter- and
intra-observer variability in
tissue labeling. To address these limitations, current efforts have proposed
using pretrained image encoders (transfer learning from ImageNet,
self-supervised pretraining) to extract morphological features from
pathology images, but these encoders have not been extensively validated. In
this work, we conduct a
search for good representations in pathology by training a variety of
self-supervised models with validation on a variety of weakly-supervised and
patch-level tasks. Our key finding is in discovering that Vision Transformers
using DINO-based knowledge distillation are able to learn data-efficient and
interpretable features in histology images wherein the different attention
heads learn distinct morphological phenotypes. We make evaluation code and
pretrained weights publicly available at:
https://github.com/Richarizardd/Self-Supervised-ViT-Path.
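The abstract names DINO-based knowledge distillation but does not describe its mechanics. As a rough illustration only, the sketch below shows the general shape of a DINO-style self-distillation objective: a student is trained to match a centered, temperature-sharpened teacher distribution, and the teacher is an exponential moving average of the student. The function names, the toy logits, and the default temperatures and momenta are assumptions chosen for illustration; none of them is taken from this paper or its repository.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def softmax(logits, temperature):
    """Temperature-scaled softmax, computed through log_softmax."""
    return [math.exp(v) for v in log_softmax(logits, temperature)]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy from the centered, sharpened teacher to the student.

    The temperatures are illustrative defaults (assumption), not values
    confirmed by the source.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = softmax(centered, teacher_temp)
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def update_center(center, teacher_logits, momentum=0.9):
    """Running mean of teacher outputs, used to discourage collapse."""
    return [momentum * c + (1.0 - momentum) * t
            for c, t in zip(center, teacher_logits)]


def ema_update(teacher_params, student_params, momentum=0.996):
    """Teacher weights follow the student as an exponential moving average."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


# Toy example: two views of one patch, represented by hypothetical logits.
teacher_view = [2.0, 0.0, -1.0]
center = [0.0, 0.0, 0.0]
loss_agree = dino_loss([2.0, 0.0, -1.0], teacher_view, center)
loss_disagree = dino_loss([-1.0, 0.0, 2.0], teacher_view, center)
```

In this toy setting the loss is small when the student ranks the outputs like the teacher and large when it ranks them in reverse, which is the signal that drives the student toward view-consistent features.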
Related papers
- Progressive Retinal Image Registration via Global and Local Deformable Transformations [49.032894312826244]
We propose a hybrid registration framework called HybridRetina.
We use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation.
Experiments on two widely-used datasets, FIRE and FLoRI21, show that our proposed HybridRetina significantly outperforms some state-of-the-art methods.
arXiv Detail & Related papers (2024-09-02T08:43:50Z)
- Advancing Medical Image Segmentation: Morphology-Driven Learning with Diffusion Transformer [4.672688418357066]
We propose a novel Transformer Diffusion (DTS) model for robust segmentation in the presence of noise.
Our model, which analyzes the morphological representation of images, shows better results than the previous models in various medical imaging modalities.
arXiv Detail & Related papers (2024-08-01T07:35:54Z)
- GPC: Generative and General Pathology Image Classifier [2.6954348706500766]
We propose a task-agnostic generative and general pathology image classifier, called GPC.
GPC maps pathology images into a high-dimensional feature space and generates pertinent class labels as texts.
We evaluate GPC on six datasets for four different pathology image classification tasks.
arXiv Detail & Related papers (2024-07-12T06:54:31Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- Towards a Visual-Language Foundation Model for Computational Pathology [5.72536252929528]
We introduce CONtrastive learning from Captions for Histopathology (CONCH).
CONCH is a visual-language foundation model developed using diverse sources of histopathology images, biomedical text, and task-agnostic pretraining.
It is evaluated on a suite of 13 diverse benchmarks, achieving state-of-the-art performance on histology image classification, segmentation, captioning, text-to-image and image-to-text retrieval.
arXiv Detail & Related papers (2023-07-24T16:13:43Z)
- Deepfake histological images for enhancing digital pathology [0.40631409309544836]
We develop a generative adversarial network model that synthesizes pathology images constrained by class labels.
We investigate the ability of this framework in synthesizing realistic prostate and colon tissue images.
We extend the approach to significantly more complex images from colon biopsies and show that the complex microenvironment in such tissues can also be reproduced.
arXiv Detail & Related papers (2022-06-16T17:11:08Z)
- Learning multi-scale functional representations of proteins from single-cell microscopy data [77.34726150561087]
We show that simple convolutional networks trained on localization classification can learn protein representations that encapsulate diverse functional information.
We also propose a robust evaluation strategy to assess quality of protein representations across different scales of biological function.
arXiv Detail & Related papers (2022-05-24T00:00:07Z)
- Intelligent Masking: Deep Q-Learning for Context Encoding in Medical Image Analysis [48.02011627390706]
We develop a novel self-supervised approach that occludes targeted regions to improve the pre-training procedure.
We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
arXiv Detail & Related papers (2022-03-25T19:05:06Z)
- Learning domain-agnostic visual representation for computational pathology using medically-irrelevant style transfer augmentation [4.538771844947821]
STRAP (Style TRansfer Augmentation for histoPathology) is a form of data augmentation based on random style transfer from artistic paintings.
Style transfer replaces the low-level texture content of images with the uninformative style of randomly selected artistic paintings.
We demonstrate that STRAP leads to state-of-the-art performance, particularly in the presence of domain shifts.
arXiv Detail & Related papers (2021-02-02T18:50:16Z)
- Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)