AttenScribble: Attentive Similarity Learning for Scribble-Supervised
Medical Image Segmentation
- URL: http://arxiv.org/abs/2312.06614v1
- Date: Mon, 11 Dec 2023 18:42:18 GMT
- Title: AttenScribble: Attentive Similarity Learning for Scribble-Supervised
Medical Image Segmentation
- Authors: Mu Tian, Qinzhu Yang, Yi Gao
- Abstract summary: In this paper, we present a straightforward yet effective scribble-supervised learning framework.
We create a pluggable spatial self-attention module that can be attached on top of any internal feature layer of an arbitrary fully convolutional network (FCN) backbone.
This attentive similarity leads to a novel regularization loss that imposes consistency between segmentation prediction and visual affinity.
- Score: 5.8447004333496855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of deep networks in medical image segmentation relies heavily on
massive labeled training data. However, acquiring dense annotations is a
time-consuming process. Weakly-supervised methods normally employ less
expensive forms of supervision, among which scribbles have recently gained
popularity thanks to their flexibility. However, due to the lack of shape and
boundary information, it is extremely challenging to train a deep network on
scribbles that generalizes to unlabeled pixels. In this paper, we present a
straightforward yet effective scribble-supervised learning framework. Inspired
by recent advances in transformer-based segmentation, we create a pluggable
spatial self-attention module that can be attached on top of any internal
feature layer of an arbitrary fully convolutional network (FCN) backbone. The
module infuses global interaction while keeping the efficiency of convolutions.
Building on this module, we construct a similarity metric based on
normalized and symmetrized attention. This attentive similarity leads to a
novel regularization loss that imposes consistency between segmentation
prediction and visual affinity. This attentive similarity loss optimizes the
alignment of FCN encoders, attention mapping and model prediction. Ultimately,
the proposed FCN+Attention architecture can be trained end-to-end guided by a
combination of three learning objectives: a partial segmentation loss, a
customized masked conditional random field, and the proposed attentive
similarity loss. Extensive experiments on public datasets (ACDC and CHAOS)
show that our framework not only outperforms the existing state of the art but
also performs close to the fully-supervised benchmark. Code will be
available upon publication.
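The abstract names a similarity built from normalized and symmetrized attention, a loss tying predictions to that similarity, and a partial segmentation loss, but it gives no formulas. The following NumPy sketch is one plausible reading, not the authors' implementation: the function names, the scaled dot-product attention with shared queries and keys, the averaging used for symmetrization, and the exact form of the consistency penalty are all assumptions made for illustration. The masked conditional random field term is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attentive_similarity(features):
    """Symmetric pairwise similarity from self-attention (assumed form).

    features: (N, D) array, one row per spatial location.
    Returns an (N, N) symmetric matrix with entries in [0, 1].
    """
    d = features.shape[1]
    # Assumption: queries and keys are the features themselves.
    logits = features @ features.T / np.sqrt(d)
    attn = softmax(logits, axis=-1)  # normalized: each row sums to 1
    return 0.5 * (attn + attn.T)     # symmetrized by averaging


def similarity_consistency_loss(probs, sim):
    """Penalize similar locations that receive different predictions.

    probs: (N, C) per-location class probabilities.
    sim:   (N, N) similarity from attentive_similarity.
    """
    # Probability that two locations are assigned the same class.
    agreement = probs @ probs.T
    return float((sim * (1.0 - agreement)).sum() / sim.sum())


def partial_cross_entropy(probs, labels, ignore_index=-1):
    """Cross-entropy averaged over scribble-labeled locations only.

    labels: (N,) integer class ids, with ignore_index for unlabeled pixels.
    """
    mask = labels != ignore_index
    if not mask.any():
        return 0.0
    picked = probs[mask, labels[mask]]
    return float(-np.log(picked + 1e-12).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(6, 4))            # 6 locations, 4 channels
    probs = softmax(rng.normal(size=(6, 2)))   # 2 classes
    labels = np.array([0, -1, -1, 1, -1, -1])  # two scribbled locations
    sim = attentive_similarity(feats)
    total = partial_cross_entropy(probs, labels) + similarity_consistency_loss(probs, sim)
    print(total)
```

In this sketch the consistency term is zero when every pair of locations shares the same one-hot prediction and grows when highly similar locations disagree, which is the behavior the abstract describes as consistency between segmentation prediction and visual affinity.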
Related papers
- ResCLIP: Residual Attention for Training-free Dense Vision-language Inference [27.551367463011008]
Cross-correlation of self-attention in CLIP's non-final layers also exhibits localization properties.
We propose the Residual Cross-correlation Self-attention (RCS) module, which leverages the cross-correlation self-attention from intermediate layers to remold the attention in the final block.
The RCS module effectively reorganizes spatial information, unleashing the localization potential within CLIP for dense vision-language inference.
arXiv Detail & Related papers (2024-11-24T14:14:14Z)
- Unifying and Personalizing Weakly-supervised Federated Medical Image Segmentation via Adaptive Representation and Aggregation [1.121358474059223]
Federated learning (FL) enables multiple sites to collaboratively train powerful deep models without compromising data privacy and security.
Weakly supervised segmentation, which uses sparsely-grained supervision, is attracting increasing attention because of its great potential to reduce annotation costs.
We propose a novel personalized FL framework for medical image segmentation, named FedICRA, which uniformly leverages heterogeneous weak supervision.
arXiv Detail & Related papers (2023-04-12T06:32:08Z)
- Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding [61.30276576646909]
We propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu.
We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task.
arXiv Detail & Related papers (2022-10-06T10:18:16Z)
- FV-UPatches: Enhancing Universality in Finger Vein Recognition [0.6299766708197883]
We propose a universal learning-based framework, which achieves generalization while training with limited data.
The proposed framework shows application potential in other vein-based biometric recognition as well.
arXiv Detail & Related papers (2022-06-02T14:20:22Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Semi-supervised Left Atrium Segmentation with Mutual Consistency Training [60.59108570938163]
We propose a novel Mutual Consistency Network (MC-Net) for semi-supervised left atrium segmentation from 3D MR images.
Our MC-Net consists of one encoder and two slightly different decoders, and the prediction discrepancies of two decoders are transformed as an unsupervised loss.
We evaluate our MC-Net on the public Left Atrium (LA) database and it obtains impressive performance gains by exploiting the unlabeled data effectively.
arXiv Detail & Related papers (2021-03-04T09:34:32Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
- Learning and Exploiting Interclass Visual Correlations for Medical Image Classification [30.88175218665726]
We present the Class-Correlation Learning Network (CCL-Net) to learn interclass visual correlations from given training data.
Instead of letting the network directly learn the desired correlations, we propose to learn them implicitly via distance metric learning of class-specific embeddings.
An intuitive loss based on a geometrical explanation of correlation is designed for bolstering learning of the interclass correlations.
arXiv Detail & Related papers (2020-07-13T13:31:38Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.