Related papers: AttenScribble: Attentive Similarity Learning for Scribble-Supervised Medical Image Segmentation

URL: http://arxiv.org/abs/2312.06614v1
Date: Mon, 11 Dec 2023 18:42:18 GMT
Title: AttenScribble: Attentive Similarity Learning for Scribble-Supervised Medical Image Segmentation
Authors: Mu Tian, Qinzhu Yang, Yi Gao
Abstract summary: In this paper, we present a straightforward yet effective scribble-supervised learning framework. We create a pluggable spatial self-attention module that can be attached on top of any internal feature layer of an arbitrary fully convolutional network (FCN) backbone. This attentive similarity leads to a novel regularization loss that imposes consistency between segmentation prediction and visual affinity.
Score: 5.8447004333496855
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The success of deep networks in medical image segmentation relies heavily on massive labeled training data. However, acquiring dense annotations is a time-consuming process. Weakly-supervised methods normally employ less expensive forms of supervision, among which scribbles have recently gained popularity thanks to their flexibility. However, due to the lack of shape and boundary information, it is extremely challenging to train a deep network on scribbles that generalizes to unlabeled pixels. In this paper, we present a straightforward yet effective scribble-supervised learning framework. Inspired by recent advances in transformer-based segmentation, we create a pluggable spatial self-attention module that can be attached on top of any internal feature layer of an arbitrary fully convolutional network (FCN) backbone. The module infuses global interaction while keeping the efficiency of convolutions. Derived from this module, we construct a similarity metric based on normalized and symmetrized attention. This attentive similarity leads to a novel regularization loss that imposes consistency between segmentation prediction and visual affinity. This attentive similarity loss optimizes the alignment of FCN encoders, attention mapping and model prediction. Ultimately, the proposed FCN+Attention architecture can be trained end-to-end guided by a combination of three learning objectives: a partial segmentation loss, a customized masked conditional random field and the proposed attentive similarity loss. Extensive experiments on public datasets (ACDC and CHAOS) show that our framework not only outperforms the existing state of the art, but also delivers performance close to the fully-supervised benchmark. Code will be available upon publication.
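
Illustrative sketch (not the authors' code, which the abstract says is yet to be released): the NumPy snippet below shows one plausible reading of two of the three objectives named in the abstract. The function names, the use of raw features as both queries and keys (no learned projections), the pairwise agreement term, and the `ignore_index` convention for unlabeled pixels are all assumptions made for illustration. The masked conditional random field term is omitted because the abstract gives no detail on it.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_similarity(features):
    """Hypothetical similarity from normalized, symmetrized self-attention.

    features: (N, C) array, one row per spatial location.
    Assumption: queries and keys are the raw features.
    """
    n, c = features.shape
    logits = features @ features.T / np.sqrt(c)  # scaled dot-product scores
    attn = softmax(logits, axis=-1)              # each row sums to 1
    return 0.5 * (attn + attn.T)                 # symmetrize

def similarity_consistency_loss(probs, sim):
    """Penalize location pairs that look similar but are predicted differently.

    probs: (N, K) per-location class probabilities.
    sim:   (N, N) symmetric similarity.
    Assumption: agreement is the inner product of the two probability vectors.
    """
    agreement = probs @ probs.T
    return float((sim * (1.0 - agreement)).sum() / sim.sum())

def partial_cross_entropy(probs, labels, ignore_index=-1):
    """Cross-entropy over scribble-labeled locations only.

    labels: (N,) integer array; ignore_index marks unlabeled locations.
    """
    mask = labels != ignore_index
    if not mask.any():
        return 0.0
    picked = probs[mask, labels[mask]]
    return float(-np.log(picked + 1e-8).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(16, 8))               # 16 locations, 8 channels
    probs = softmax(rng.normal(size=(16, 3)))      # 3 classes
    labels = np.full(16, -1)
    labels[[0, 5, 9]] = [0, 1, 2]                  # three scribbled locations
    sim = attentive_similarity(feats)
    total = partial_cross_entropy(probs, labels) + similarity_consistency_loss(probs, sim)
    print(total)
```

In this sketch the two terms are simply summed; how the actual framework weights its three objectives is not stated in the abstract.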

Related papers

Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning [81.02648336552421]
We propose a Multi-Constraint Consistency Learning approach to facilitate the staged enhancement of the encoder and decoder. Self-adaptive feature masking and noise injection are designed in an instance-specific manner to perturb the features for robust learning of the decoder. Experimental results on Pascal VOC2012 and Cityscapes datasets demonstrate that our proposed MCCL achieves new state-of-the-art performance.
arXiv Detail & Related papers (2025-03-23T03:21:33Z)
DAM-Seg: Anatomically accurate cardiac segmentation using Dense Associative Networks [3.776159955137874]
We propose a novel transformer-based architecture that leverages dense associative networks to learn and retain specific patterns inherent to cardiac inputs. Our approach restricts the network to a limited set of patterns. During forward propagation, a weighted sum of these patterns is used to enforce anatomical correctness in the output. Experimental results indicate that our model consistently outperforms baseline approaches across all metrics.
arXiv Detail & Related papers (2025-02-21T01:15:10Z)
HELPNet: Hierarchical Perturbations Consistency and Entropy-guided Ensemble for Scribble Supervised Medical Image Segmentation [4.034121387622003]
We propose HELPNet, a novel scribble-based weakly supervised segmentation framework. HELPNet integrates three modules to bridge the gap between annotation efficiency and segmentation performance. HELPNet significantly outperforms state-of-the-art methods for scribble-based weakly supervised segmentation.
arXiv Detail & Related papers (2024-12-25T01:52:01Z)
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference [27.551367463011008]
Cross-correlation of self-attention in CLIP's non-final layers also exhibits localization properties. We propose the Residual Cross-correlation Self-attention (RCS) module, which leverages the cross-correlation self-attention from intermediate layers to remold the attention in the final block. The RCS module effectively reorganizes spatial information, unleashing the localization potential within CLIP for dense vision-language inference.
arXiv Detail & Related papers (2024-11-24T14:14:14Z)
Unifying and Personalizing Weakly-supervised Federated Medical Image Segmentation via Adaptive Representation and Aggregation [1.121358474059223]
Federated learning (FL) enables multiple sites to collaboratively train powerful deep models without compromising data privacy and security. Weakly supervised segmentation, which uses sparsely-grained supervision, is attracting increasing attention owing to its great potential to reduce annotation costs. We propose a novel personalized FL framework for medical image segmentation, named FedICRA, which uniformly leverages heterogeneous weak supervision.
arXiv Detail & Related papers (2023-04-12T06:32:08Z)
Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding [61.30276576646909]
We propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu. We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task.
arXiv Detail & Related papers (2022-10-06T10:18:16Z)
FV-UPatches: Enhancing Universality in Finger Vein Recognition [0.6299766708197883]
We propose a universal learning-based framework, which achieves generalization while training with limited data. The proposed framework shows application potential in other vein-based biometric recognition as well.
arXiv Detail & Related papers (2022-06-02T14:20:22Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to improve the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
Semi-supervised Left Atrium Segmentation with Mutual Consistency Training [60.59108570938163]
We propose a novel Mutual Consistency Network (MC-Net) for semi-supervised left atrium segmentation from 3D MR images. Our MC-Net consists of one encoder and two slightly different decoders, and the prediction discrepancies between the two decoders are transformed into an unsupervised loss. We evaluate our MC-Net on the public Left Atrium (LA) database, where it obtains impressive performance gains by exploiting the unlabeled data effectively.
arXiv Detail & Related papers (2021-03-04T09:34:32Z)
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning. Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector. We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
Learning and Exploiting Interclass Visual Correlations for Medical Image Classification [30.88175218665726]
We present the Class-Correlation Learning Network (CCL-Net) to learn interclass visual correlations from given training data. Instead of letting the network directly learn the desired correlations, we propose to learn them implicitly via distance metric learning of class-specific embeddings. An intuitive loss based on a geometrical explanation of correlation is designed for bolstering learning of the interclass correlations.
arXiv Detail & Related papers (2020-07-13T13:31:38Z)
Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm. We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data. Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performance on the Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.