PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image
Segmentation
- URL: http://arxiv.org/abs/2011.12640v1
- Date: Wed, 25 Nov 2020 11:03:11 GMT
- Title: PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image
Segmentation
- Authors: Yutong Xie, Jianpeng Zhang, Zehui Liao, Yong Xia, and Chunhua Shen
- Abstract summary: We propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
- Score: 87.50205728818601
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: It has been widely recognized that the success of deep learning in image
segmentation relies overwhelmingly on vast amounts of densely annotated
training data, which, however, are difficult to obtain due to the tremendous
labor and expertise required, particularly for annotating 3D medical images.
Although self-supervised learning (SSL) has shown great potential to address
this issue, most SSL approaches focus only on image-level global consistency,
but ignore the local consistency which plays a pivotal role in capturing
structural information for dense prediction tasks such as segmentation. In this
paper, we propose a Prior-Guided Local (PGL) self-supervised model that learns
the region-wise local consistency in the latent feature space. Specifically, we
use the spatial transformations, which produce different augmented views of the
same image, as a prior to deduce the spatial relation between the two views;
this relation is then used to align the feature maps of the same local region
extracted from each view. Next, we construct a local consistency loss to minimize
the voxel-wise discrepancy between the aligned feature maps. Thus, our PGL
model learns the distinctive representations of local regions, and hence is
able to retain structural information. This ability is conducive to downstream
segmentation tasks. We conducted an extensive evaluation on four public
computerized tomography (CT) datasets that cover 11 kinds of major human organs
and two tumors. The results indicate that using a pre-trained PGL model to
initialize a downstream network leads to a substantial performance improvement
over both random initialization and the initialization with global
consistency-based models. Code and pre-trained weights will be made available
at: https://git.io/PGL.
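The core idea above, aligning feature-map crops that cover the same local region in two augmented views and penalizing their voxel-wise discrepancy, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name, crop-box convention, and mean-squared-error form are assumptions for clarity.

```python
import numpy as np

def local_consistency_loss(feat_a, feat_b, box_a, box_b):
    """Voxel-wise discrepancy between feature-map crops that cover the
    same local region in two augmented views of one image.

    feat_a, feat_b : (C, D, H, W) feature maps from the two views.
    box_a, box_b   : (z0, z1, y0, y1, x0, x1) overlap region in each
                     view's feature-map coordinates, derived from the
                     known spatial transformations (the "prior").
    """
    crop_a = feat_a[:, box_a[0]:box_a[1], box_a[2]:box_a[3], box_a[4]:box_a[5]]
    crop_b = feat_b[:, box_b[0]:box_b[1], box_b[2]:box_b[3], box_b[4]:box_b[5]]
    # Mean squared voxel-wise discrepancy over the aligned crops.
    return float(np.mean((crop_a - crop_b) ** 2))

# Two "views" that share an identical 4x4x4 sub-region at known offsets:
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 4, 4, 4))
fa = rng.normal(size=(8, 6, 6, 6)); fa[:, 0:4, 0:4, 0:4] = shared
fb = rng.normal(size=(8, 6, 6, 6)); fb[:, 2:6, 2:6, 2:6] = shared
loss = local_consistency_loss(fa, fb, (0, 4, 0, 4, 0, 4), (2, 6, 2, 6, 2, 6))
print(loss)  # 0.0: the correctly aligned crops are identical
```

Misaligning the boxes (ignoring the spatial prior) yields a nonzero loss, which is exactly the signal the model uses to learn region-distinctive features.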
Related papers
- Locality Alignment Improves Vision-Language Models [55.275235524659905]
Vision language models (VLMs) have seen growing adoption in recent years, but many still struggle with basic spatial reasoning errors.
We propose a new efficient post-training stage for ViTs called locality alignment.
We show that locality-aligned backbones improve performance across a range of benchmarks.
arXiv Detail & Related papers (2024-10-14T21:01:01Z)
- Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation [21.203307064937142]
We present a keypoint-augmented fusion layer that extracts representations preserving both short- and long-range self-attention.
In particular, we augment the CNN feature map at multiple scales by incorporating an additional input that learns long-range spatial self-attention.
Our method further outperforms existing SSL methods by producing more robust self-attention.
arXiv Detail & Related papers (2023-10-02T22:31:30Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, with 10-34% relative improvement across various labeled training data sampling ratios.
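The dual-encoder contrastive objective described above pairs each image embedding with the embedding of its own geo-location against other locations in the batch. A hedged NumPy sketch of such a symmetric-batch contrastive (InfoNCE-style) loss, with hypothetical names and a made-up temperature, not CSP's actual code:

```python
import numpy as np

def info_nce(img_emb, loc_emb, temperature=0.1):
    """Contrastive objective pairing each image embedding with the
    embedding of its own geo-location (the diagonal), against all
    other locations in the batch (off-diagonal negatives).

    img_emb, loc_emb : (N, d) L2-normalized embeddings produced by the
                       two encoders of a dual-encoder model.
    """
    logits = img_emb @ loc_emb.T / temperature           # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # cross-entropy on diagonal

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))
x /= np.linalg.norm(x, axis=1, keepdims=True)
# Correctly matched image/location pairs score a lower loss than
# mismatched (shuffled) pairs:
matched = info_nce(x, x)
shuffled = info_nce(x, x[::-1].copy())
print(matched < shuffled)  # True
```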
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
- Conditioning Covert Geo-Location (CGL) Detection on Semantic Class Information [5.660207256468971]
The task of identifying potential hideouts, termed Covert Geo-Location (CGL) detection, was proposed by Saha et al.
No attempts were made to utilize semantic class information, which is crucial for detecting obscured regions.
In this paper, we propose a multitask-learning-based approach to achieve two goals: i) extraction of features carrying semantic class information; ii) robust training of the common encoder, exploiting large standard annotated datasets as the training set for the auxiliary task (semantic segmentation).
arXiv Detail & Related papers (2022-11-27T07:21:59Z)
- Keep Your Friends Close & Enemies Farther: Debiasing Contrastive Learning with Spatial Priors in 3D Radiology Images [11.251405818285331]
We propose a 3D contrastive framework (Spade) that leverages extracted correspondences to select more effective positive & negative samples for representation learning.
Compared to recent state-of-the-art approaches, Spade shows notable improvements on three downstream segmentation tasks.
arXiv Detail & Related papers (2022-11-16T03:36:06Z)
- PA-Seg: Learning from Point Annotations for 3D Medical Image Segmentation using Contextual Regularization and Cross Knowledge Distillation [14.412073730567137]
We propose to annotate a segmentation target with only seven points in 3D medical images, and design a two-stage weakly supervised learning framework PA-Seg.
In the first stage, we employ geodesic distance transform to expand the seed points to provide more supervision signal.
In the second stage, we use predictions obtained by the model pre-trained in the first stage as pseudo labels.
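The first-stage idea, expanding sparse seed points into a larger supervision region via a distance transform, can be illustrated with a plain Euclidean distance transform. Note this is a simplification: PA-Seg uses a geodesic distance transform, which also follows image intensities; the function name and radius threshold here are hypothetical.

```python
import numpy as np

def expand_seeds(shape, seeds, radius):
    """Expand sparse point annotations into a supervision mask by
    labeling every voxel within `radius` of a seed point.

    Illustration only: a plain Euclidean distance is used here, whereas
    a geodesic distance transform would also account for image content.
    """
    coords = np.stack(np.meshgrid(*[np.arange(s) for s in shape],
                                  indexing="ij"), axis=-1)   # (D, H, W, 3)
    seeds = np.asarray(seeds, dtype=float)                   # (K, 3)
    # Distance from every voxel to its nearest seed point.
    dists = np.linalg.norm(coords[..., None, :] - seeds, axis=-1).min(axis=-1)
    return dists <= radius

# One seed point at the center of an 8x8x8 volume:
mask = expand_seeds((8, 8, 8), [(4, 4, 4)], radius=2.0)
print(int(mask.sum()))  # 33 voxels now carry a supervision signal
```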
arXiv Detail & Related papers (2022-08-11T07:00:33Z)
- Learning Where to Learn in Cross-View Self-Supervised Learning [54.14989750044489]
Self-supervised learning (SSL) has made enormous progress and largely narrowed the gap with supervised methods.
Current methods simply adopt uniform aggregation of pixels for embedding.
We present a new approach, Learning Where to Learn (LEWEL), to adaptively aggregate spatial information of features.
arXiv Detail & Related papers (2022-03-28T17:02:42Z)
- Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
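The idea of mimicking the local structure of a source representation space with a contrastive loss can be sketched as follows. This is a hedged NumPy illustration of one plausible formulation (nearest source-space neighbor as the positive in a softmax over target-space similarities), not CNA's actual objective; all names are hypothetical.

```python
import numpy as np

def neighborhood_alignment_loss(src, tgt, temperature=0.1):
    """Encourage the target space to preserve the source space's local
    neighborhoods: each point's nearest source-space neighbor serves as
    the positive in a softmax over target-space similarities.

    src, tgt : (N, d) L2-normalized embeddings of the same N samples in
               the source (reference) and target (learned) spaces.
    """
    n = len(src)
    sim_src = src @ src.T
    np.fill_diagonal(sim_src, -np.inf)       # exclude self-matches
    pos = sim_src.argmax(axis=1)             # nearest neighbor in source space

    sim_tgt = tgt @ tgt.T / temperature
    np.fill_diagonal(sim_tgt, -np.inf)
    sim_tgt -= sim_tgt.max(axis=1, keepdims=True)
    log_probs = sim_tgt - np.log(np.exp(sim_tgt).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(n), pos]))

rng = np.random.default_rng(2)
src = rng.normal(size=(8, 16))
src /= np.linalg.norm(src, axis=1, keepdims=True)
# A target space identical to the source preserves its topology exactly,
# so it never scores worse than a randomly re-indexed one:
aligned = neighborhood_alignment_loss(src, src)
shuffled = neighborhood_alignment_loss(src, rng.permutation(src))
print(aligned <= shuffled)  # True
```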
arXiv Detail & Related papers (2022-01-06T04:58:31Z)
- Self-Supervised Learning for Fine-Grained Visual Categorization [0.0]
We study the usefulness of SSL for Fine-Grained Visual Categorization (FGVC).
FGVC aims to distinguish objects of visually similar sub-categories within a general category.
Our baseline achieves 86.36% top-1 classification accuracy on the CUB-200-2011 dataset.
arXiv Detail & Related papers (2021-05-18T19:16:05Z)
- Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization [54.00111565818903]
Cross-view geo-localization is to spot images of the same geographic target from different platforms.
Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center.
We introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information.
arXiv Detail & Related papers (2020-08-26T16:06:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.