A Weakly Supervised Convolutional Network for Change Segmentation and
Classification
- URL: http://arxiv.org/abs/2011.03577v1
- Date: Fri, 6 Nov 2020 20:20:45 GMT
- Title: A Weakly Supervised Convolutional Network for Change Segmentation and
Classification
- Authors: Philipp Andermatt, Radu Timofte
- Abstract summary: We present W-CDNet, a novel weakly supervised change detection network that can be trained with image-level semantic labels.
W-CDNet can be trained with two different types of datasets, either containing changed image pairs only or a mixture of changed and unchanged image pairs.
- Score: 91.3755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully supervised change detection methods require pixel-level
labels, which are difficult to procure, while weakly supervised approaches can
be trained with image-level labels. However, most of these approaches require a
combination of changed and unchanged image pairs for training. Thus, these
methods cannot directly be used for datasets where only changed image pairs are
available. We
present W-CDNet, a novel weakly supervised change detection network that can be
trained with image-level semantic labels. Additionally, W-CDNet can be trained
with two different types of datasets, either containing changed image pairs
only or a mixture of changed and unchanged image pairs. Since we use
image-level semantic labels for training, we simultaneously create a change
mask and label the changed object for single-label images. W-CDNet employs a
W-shaped siamese U-Net to extract feature maps from an image pair, which are
then compared to create a raw change mask. The core part of our model,
the Change Segmentation and Classification (CSC) module, learns an accurate
change mask at a hidden layer by using a custom Remapping Block and then
segmenting the current input image with the change mask. The segmented image is
used to predict the image-level semantic label. The correct label can only be
predicted if the change mask actually marks relevant change. This forces the
model to learn an accurate change mask. We demonstrate the segmentation and
classification performance of our approach and achieve top results on AICD and
HRSCD, two public aerial imaging change detection datasets, as well as on a Food
Waste change detection dataset. Our code is available at
https://github.com/PhiAbs/W-CDNet .
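The pipeline described in the abstract can be sketched in a few lines. This is an illustrative numpy sketch, not the authors' implementation: the function names, the feature-difference metric, and the threshold are all hypothetical stand-ins for the learned siamese U-Net comparison and the CSC module's Remapping Block, but they show the core idea of gating the current image with a change mask before classification.

```python
import numpy as np

def raw_change_mask(feat_before, feat_after, threshold=0.5):
    """Per-pixel feature difference, normalized to [0, 1] and thresholded.

    Stand-in for the learned comparison of siamese U-Net feature maps.
    """
    diff = np.linalg.norm(feat_after - feat_before, axis=-1)
    if diff.max() > 0:
        diff = diff / diff.max()
    return (diff > threshold).astype(np.float32)

def segment_with_mask(image, mask):
    """Keep only the changed region of the current image (CSC-style gating).

    A classifier downstream can only predict the correct image-level label
    if the mask actually covers the relevant change.
    """
    return image * mask[..., None]

rng = np.random.default_rng(0)
feat_t0 = rng.random((8, 8, 16))          # features of the "before" image
feat_t1 = feat_t0.copy()
feat_t1[2:5, 2:5] += 2.0                  # simulate a localized change

mask = raw_change_mask(feat_t0, feat_t1)  # 1.0 inside the changed block
image_t1 = rng.random((8, 8, 3))
segmented = segment_with_mask(image_t1, mask)
# Only the changed 3x3 block of the current image survives the gating.
```

In the real model the mask is learned end-to-end: the image-level classification loss on the segmented image is what forces the hidden-layer change mask to become accurate.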
Related papers
- MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification [29.15203530375882]
Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature.
We propose MaskCD to detect changed areas by adaptively generating categorized masks from input image pairs.
It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals.
arXiv Detail & Related papers (2024-04-18T11:05:15Z) - Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning [28.3763053922823]
Methods for Remote Sensing Image Change Captioning (RSICC) perform well in simple scenes but exhibit poorer performance in complex scenes.
We believe pixel-level CD is significant for describing the differences between images through language.
Our method achieves state-of-the-art performance and validates that learning pixel-level CD pseudo-labels significantly contributes to change captioning.
arXiv Detail & Related papers (2023-12-23T17:58:48Z) - Exploring Effective Priors and Efficient Models for Weakly-Supervised Change Detection [9.229278131265124]
Weakly-supervised change detection (WSCD) aims to detect pixel-level changes with only image-level annotations.
We propose two components: a Dilated Prior (DP) decoder and a Label Gated (LG) constraint.
Our proposed TransWCD and TransWCD-DL achieve significant +6.33% and +9.55% F1 score improvements over the state-of-the-art methods on the WHU-CD dataset.
arXiv Detail & Related papers (2023-07-20T13:16:10Z) - Distilling Self-Supervised Vision Transformers for Weakly-Supervised
Few-Shot Classification & Segmentation [58.03255076119459]
We address the task of weakly-supervised few-shot image classification and segmentation, by leveraging a Vision Transformer (ViT)
Our proposed method takes token representations from the self-supervised ViT and leverages their correlations, via self-attention, to produce classification and segmentation predictions.
Experiments on Pascal-5i and COCO-20i demonstrate significant performance gains in a variety of supervision settings.
arXiv Detail & Related papers (2023-07-07T06:16:43Z) - ConvMAE: Masked Convolution Meets Masked Autoencoders [65.15953258300958]
Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potential of ViT.
Our ConvMAE framework demonstrates that multi-scale hybrid convolution-transformer architectures can learn more discriminative representations via the masked auto-encoding scheme.
Based on our pretrained ConvMAE models, ConvMAE-Base improves ImageNet-1K finetuning accuracy by 1.4% compared with MAE-Base.
arXiv Detail & Related papers (2022-05-08T15:12:19Z) - DSNet: A Dual-Stream Framework for Weakly-Supervised Gigapixel Pathology
Image Analysis [78.78181964748144]
We present a novel weakly-supervised framework for classifying whole slide images (WSIs)
WSIs are commonly processed by patch-wise classification with patch-level labels.
With image-level labels only, patch-wise classification would be sub-optimal due to inconsistency between the patch appearance and image-level label.
arXiv Detail & Related papers (2021-09-13T09:10:43Z) - Box-Adapt: Domain-Adaptive Medical Image Segmentation using Bounding
Box Supervision [52.45336255472669]
We propose a weakly supervised domain adaptation setting for deep learning.
Box-Adapt fully explores the fine-grained segmentation mask in the source domain and the weak bounding box in the target domain.
We demonstrate the effectiveness of our method in the liver segmentation task.
arXiv Detail & Related papers (2021-08-19T01:51:04Z) - General Multi-label Image Classification with Transformers [30.58248625606648]
We propose the Classification Transformer (C-Tran) to exploit the complex dependencies among visual features and labels.
A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels.
Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome.
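The ternary label-state encoding mentioned above can be illustrated concretely. This is a hypothetical numpy sketch, not the authors' code: each label is marked positive, negative, or unknown/masked, and during label mask training a fraction of known states is hidden for the model to recover.

```python
import numpy as np

# Ternary label states: a label can be known-present, known-absent,
# or masked out (unknown) during training.
UNKNOWN, NEGATIVE, POSITIVE = 0, -1, 1

def mask_labels(labels, mask_ratio, rng):
    """Randomly hide a fraction of known label states as UNKNOWN.

    The model is then trained to predict the hidden states from the
    image and the remaining known labels.
    """
    states = labels.copy()
    hide = rng.random(labels.shape) < mask_ratio
    states[hide] = UNKNOWN
    return states

rng = np.random.default_rng(1)
labels = np.array([POSITIVE, NEGATIVE, POSITIVE, NEGATIVE, POSITIVE])
states = mask_labels(labels, mask_ratio=0.4, rng=rng)
# Every entry of `states` either matches the ground truth or is UNKNOWN.
```

Exposing partial label knowledge this way lets one model handle fully labeled, partially labeled, and unlabeled inference settings with the same encoding.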
arXiv Detail & Related papers (2020-11-27T23:20:35Z) - Unsupervised Self-training Algorithm Based on Deep Learning for Optical
Aerial Images Change Detection [17.232244800511523]
We present a novel unsupervised self-training algorithm (USTA) for optical aerial images change detection.
The whole process of the algorithm is an unsupervised process without manually marked labels.
Experimental results on real datasets demonstrate the competitive performance of our proposed method.
arXiv Detail & Related papers (2020-10-15T01:51:46Z) - RGB-based Semantic Segmentation Using Self-Supervised Depth Pre-Training [77.62171090230986]
We propose an easily scalable and self-supervised technique that can be used to pre-train any semantic RGB segmentation method.
In particular, our pre-training approach makes use of automatically generated labels that can be obtained using depth sensors.
We show how our proposed self-supervised pre-training with HN-labels can be used to replace ImageNet pre-training.
arXiv Detail & Related papers (2020-02-06T11:16:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.