A Dual-branch Self-supervised Representation Learning Framework for
Tumour Segmentation in Whole Slide Images
- URL: http://arxiv.org/abs/2303.11019v1
- Date: Mon, 20 Mar 2023 10:57:28 GMT
- Authors: Hao Wang, Euijoon Ahn, Jinman Kim
- Abstract summary: Self-supervised learning (SSL) has emerged as an alternative solution to reduce the annotation overheads in whole slide images.
These SSL approaches are not designed for handling multi-resolution WSIs, which limits their performance in learning discriminative image features.
We propose a Dual-branch SSL Framework for WSI tumour segmentation (DSF-WSI) that can effectively learn image features from multi-resolution WSIs.
- Score: 12.961686610789416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised deep learning methods have achieved considerable success in
medical image analysis, owing to the availability of large-scale and
well-annotated datasets. However, creating such datasets for whole slide images
(WSIs) in histopathology is a challenging task due to their gigapixel size. In
recent years, self-supervised learning (SSL) has emerged as an alternative
solution to reduce the annotation overheads in WSIs, as it does not require
labels for training. These SSL approaches, however, are not designed for
handling multi-resolution WSIs, which limits their performance in learning
discriminative image features. In this paper, we propose a Dual-branch SSL
Framework for WSI tumour segmentation (DSF-WSI) that can effectively learn
image features from multi-resolution WSIs. Our DSF-WSI connects two branches
and jointly learns from low- and high-resolution WSIs in a self-supervised manner.
Moreover, we introduced a novel Context-Target Fusion Module (CTFM) and a
masked jigsaw pretext task to align the learnt multi-resolution features.
Furthermore, we designed a Dense SimSiam Learning (DSL) strategy to maximise
the similarity of different views of WSIs, enabling the learnt representations
to be more efficient and discriminative. We evaluated our method using two
public datasets on breast and liver cancer segmentation tasks. The experiment
results demonstrated that our DSF-WSI can effectively extract robust and
efficient representations, which we validated through subsequent fine-tuning
and semi-supervised settings. Our proposed method achieved better accuracy than
other state-of-the-art approaches. Code is available at
https://github.com/Dylan-H-Wang/dsf-wsi.
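The Dense SimSiam Learning strategy builds on the SimSiam objective of maximising agreement between two augmented views. As a minimal NumPy sketch of that underlying objective (not the paper's dense, per-location implementation; the vectors and names here are illustrative):

```python
import numpy as np

def l2_normalize(v):
    return v / np.linalg.norm(v)

def simsiam_loss(p, z):
    # Negative cosine similarity between predictor output p and
    # projector output z; in SimSiam, z is treated as a constant
    # (stop-gradient) during backpropagation.
    return -float(np.dot(l2_normalize(p), l2_normalize(z)))

# Two augmented views of the same patch yield (p1, z1) and (p2, z2);
# the symmetric loss averages both cross-view directions.
p1, z1 = np.array([1.0, 0.0]), np.array([0.9, 0.1])
p2, z2 = np.array([0.0, 1.0]), np.array([0.1, 0.9])
loss = 0.5 * simsiam_loss(p1, z2) + 0.5 * simsiam_loss(p2, z1)
```

The loss is bounded in [-1, 0] and reaches -1 only when the two views' representations align perfectly, which is what "maximising the similarity of different views" amounts to.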
Related papers
- DINOv2 based Self Supervised Learning For Few Shot Medical Image
Segmentation [33.471116581196796]
Few-shot segmentation offers a promising solution by endowing models with the capacity to learn novel classes from limited labeled examples.
A leading method for FSS is ALPNet, which compares features between the query image and the few available support segmented images.
We present a novel approach to few-shot segmentation that not only enhances performance but also paves the way for more robust and adaptable medical image analysis.
arXiv Detail & Related papers (2024-03-05T19:13:45Z)
- Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation [11.637738540262797]
This study introduces Semi-Mamba-UNet, which integrates a purely visual Mamba-based encoder-decoder architecture with a conventional CNN-based UNet into a semi-supervised learning framework.
This innovative SSL approach leverages both networks to generate pseudo-labels and cross-supervise one another at the pixel level simultaneously.
We introduce a self-supervised pixel-level contrastive learning strategy that employs a pair of projectors to enhance the feature learning capabilities further.
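The pixel-level cross-supervision described above can be sketched in NumPy: each network's hard predictions serve as pseudo-labels for the other. This is a minimal illustration under assumed toy logits, not the paper's implementation:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_pseudo_loss(student_logits, teacher_logits):
    # Argmax predictions of one network act as pseudo-labels for the
    # other; the loss is per-pixel cross-entropy against them.
    probs = softmax(student_logits)
    pseudo = teacher_logits.argmax(axis=-1)
    rows = np.arange(len(pseudo))
    return float(-np.log(probs[rows, pseudo] + 1e-12).mean())

# Flattened per-pixel logits over 3 classes from the two branches
# (CNN-based UNet and Mamba-based encoder-decoder, here just toy arrays).
logits_cnn = np.array([[2.0, 0.1, 0.1], [0.1, 1.5, 0.2]])
logits_vim = np.array([[1.8, 0.2, 0.1], [0.3, 1.9, 0.1]])
# Symmetric: each branch is supervised by the other's pseudo-labels.
total = cross_pseudo_loss(logits_cnn, logits_vim) + \
        cross_pseudo_loss(logits_vim, logits_cnn)
```

Because the two architectures have different inductive biases, their errors tend to differ, which is what makes the mutual pseudo-labelling informative.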
arXiv Detail & Related papers (2024-02-11T17:09:21Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the data-hungry nature of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
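The core target in codebook-assignment prediction is a soft assignment of patch features to codebook entries; a masked student is then trained to predict those assignments. A minimal NumPy sketch under a toy codebook (illustrative only, not MOCA's actual online codebook machinery):

```python
import numpy as np

def codebook_assignments(patch_feats, codebook, tau=0.1):
    # Soft assignment of visible-patch features to codebook entries;
    # a temperature tau sharpens the distribution. In MOCA-style
    # training, a student predicts such targets for masked patches.
    sims = patch_feats @ codebook.T / tau
    e = np.exp(sims - sims.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

codebook = np.eye(3)                     # 3 toy codebook vectors
feats = np.array([[2.0, 0.1, 0.0],
                  [0.0, 1.5, 0.2]])      # 2 toy patch features
assignments = codebook_assignments(feats, codebook)
```

Each row of `assignments` is a probability distribution over codebook entries, so the prediction task reduces to matching distributions (e.g. with cross-entropy) rather than regressing raw pixels.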
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding [20.2438336674081]
Contrastive Mask Image Distillation (CMID) is capable of learning representations with both global semantic separability and local spatial perceptibility.
CMID is compatible with both convolutional neural networks (CNN) and vision transformers (ViT)
Models pre-trained using CMID achieve better performance than other state-of-the-art SSL methods on multiple downstream tasks.
arXiv Detail & Related papers (2023-04-19T13:58:31Z)
- Localized Region Contrast for Enhancing Self-Supervised Learning in Medical Image Segmentation [27.82940072548603]
We propose a novel contrastive learning framework that integrates Localized Region Contrast (LRC) to enhance existing self-supervised pre-training methods for medical image segmentation.
Our approach involves identifying Super-pixels by Felzenszwalb's algorithm and performing local contrastive learning using a novel contrastive sampling loss.
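Given superpixel ids (e.g. from Felzenszwalb's algorithm), the local contrast idea can be sketched as pulling each pixel toward its own region's prototype and pushing it from the others. This is an InfoNCE-style approximation with toy features, not the paper's exact sampling loss:

```python
import numpy as np

def l2n(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def local_region_contrast(pixel_feats, region_ids, tau=0.1):
    # Region prototypes are mean-pooled pixel features per superpixel;
    # each pixel's positive is its own region's prototype, and all
    # other prototypes serve as negatives.
    feats = l2n(pixel_feats)
    labels = np.unique(region_ids)
    protos = l2n(np.stack([feats[region_ids == r].mean(axis=0)
                           for r in labels]))
    sims = feats @ protos.T / tau                    # (pixels, regions)
    pos = np.searchsorted(labels, region_ids)        # own-region column
    log_prob = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(feats)), pos].mean())

feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
regions = np.array([0, 0, 1, 1])   # toy superpixel ids
loss = local_region_contrast(feats, regions)
```

In practice the superpixel ids would come from `skimage.segmentation.felzenszwalb` applied to the input image, and the features from the encoder's dense output.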
arXiv Detail & Related papers (2023-04-06T22:43:13Z)
- De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
- Consistency Regularisation in Varying Contexts and Feature Perturbations for Semi-Supervised Semantic Segmentation of Histology Images [14.005379068469361]
We present a consistency based semi-supervised learning (SSL) approach that can help mitigate this challenge.
SSL models can also be susceptible to changing contexts and feature perturbations, exhibiting poor generalisation due to the limited training data.
We show that cross-consistency training makes the encoder features invariant to different perturbations and improves the prediction confidence.
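The cross-consistency idea can be sketched as follows: the main decoder's prediction is treated as the target, and auxiliary decoders that see perturbed encoder features are penalised for deviating from it. A minimal NumPy illustration with toy logits (not the paper's implementation):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_consistency_loss(main_logits, perturbed_logits_list):
    # MSE between each perturbed decoder's prediction and the main
    # decoder's prediction; minimising it makes the encoder features
    # invariant to the applied perturbations.
    target = softmax(main_logits)
    return float(np.mean([((softmax(p) - target) ** 2).mean()
                          for p in perturbed_logits_list]))

main = np.array([[2.0, 0.1], [0.2, 1.7]])
aux = [main + np.array([[0.1, -0.1], [0.0, 0.2]]),   # perturbation 1
       main + np.array([[-0.2, 0.1], [0.1, 0.0]])]   # perturbation 2
loss = cross_consistency_loss(main, aux)
```

The loss is zero exactly when every perturbed branch reproduces the main prediction, i.e. when the representation is fully perturbation-invariant.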
arXiv Detail & Related papers (2023-01-30T18:21:57Z)
- Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several attentive LR representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods with a variety of different settings.
arXiv Detail & Related papers (2022-03-19T09:19:55Z)
- Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
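A power-normalized second-order representation can be sketched in NumPy: pool local descriptors into an autocorrelation matrix, then apply an element-wise power to dampen bursty co-occurrences. This is a generic illustration of the statistic, not the paper's full multi-level pipeline:

```python
import numpy as np

def power_normalized_second_order(descriptors, gamma=0.5):
    # Second-order statistic (autocorrelation matrix) of the local
    # descriptors, followed by element-wise signed power normalization,
    # which suppresses the burstiness of frequently co-occurring features.
    M = descriptors.T @ descriptors / len(descriptors)   # (D, D)
    return np.sign(M) * np.abs(M) ** gamma

descs = np.random.default_rng(0).normal(size=(16, 4))   # 16 local features
rep = power_normalized_second_order(descs)              # (4, 4) image-level rep
```

The resulting matrix captures pairwise feature co-occurrences, which is what makes second-order pooling more expressive than plain average pooling for few-shot comparison.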
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
- Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition [141.24314054768922]
We propose a spatial-temporal multi-cue (STMC) network to solve the vision-based sequence learning problem.
To validate the effectiveness, we perform experiments on three large-scale CSLR benchmarks.
arXiv Detail & Related papers (2020-02-08T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.