Differencing based Self-supervised pretraining for Scene Change
Detection
- URL: http://arxiv.org/abs/2208.05838v1
- Date: Thu, 11 Aug 2022 14:06:32 GMT
- Title: Differencing based Self-supervised pretraining for Scene Change
Detection
- Authors: Vijaya Raghavan T. Ramkumar, Elahe Arani, Bahram Zonooz
- Abstract summary: Scene change detection (SCD) identifies changes by comparing scenes captured at different times.
Deep neural network based solutions require a large quantity of annotated data which is tedious and expensive to obtain.
We propose a novel Differencing self-supervised pretraining (DSP) method that uses feature differencing to learn discriminatory representations.
- Score: 12.525959293825318
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Scene change detection (SCD), a crucial perception task, identifies changes
by comparing scenes captured at different times. SCD is challenging due to
noisy changes in illumination, seasonal variations, and perspective differences
across a pair of views. Deep neural network based solutions require a large
quantity of annotated data which is tedious and expensive to obtain. On the
other hand, transfer learning from large datasets induces domain shift. To
address these challenges, we propose a novel Differencing
self-supervised pretraining (DSP) method that uses feature differencing to
learn discriminatory representations corresponding to the changed regions while
simultaneously tackling the noisy changes by enforcing temporal invariance
across views. Our experimental results on SCD datasets demonstrate the
effectiveness of our method, particularly its robustness to differences in
camera viewpoints and lighting conditions. Without using any additional data,
DSP surpasses both self-supervised Barlow Twins pretraining and standard
ImageNet pretraining, which uses more than a million additional labeled images. Our
results also demonstrate the robustness of DSP to natural corruptions,
distribution shift, and learning under limited labeled data.
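The abstract's core idea, applying a self-supervised objective to differenced features while also enforcing invariance across the two capture times, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the absolute-difference operator, the reuse of a Barlow Twins style cross-correlation loss for both terms, the weights `lam` and `alpha`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def cross_correlation(a, b, eps=1e-8):
    """Standardise two (N, D) feature batches per dimension and return
    their (D, D) cross-correlation matrix."""
    a = (a - a.mean(axis=0)) / (a.std(axis=0) + eps)
    b = (b - b.mean(axis=0)) / (b.std(axis=0) + eps)
    return a.T @ b / a.shape[0]


def barlow_twins_loss(a, b, lam=5e-3):
    """Barlow Twins style loss: push the cross-correlation matrix
    towards the identity (diagonal -> 1, off-diagonal -> 0)."""
    c = cross_correlation(a, b)
    diag = np.diag(c)
    on_diag = ((diag - 1.0) ** 2).sum()
    off_diag = (c ** 2).sum() - (diag ** 2).sum()
    return on_diag + lam * off_diag


def differencing_loss(f_t0_v1, f_t1_v1, f_t0_v2, f_t1_v2, lam=5e-3, alpha=1.0):
    """Hypothetical combined objective.

    f_t0_v* / f_t1_v* are encoder features of the scene at times t0 and t1
    under two augmented views (v1, v2), each of shape (N, D).
    """
    # Assumed differencing operator: element-wise absolute difference.
    d1 = np.abs(f_t0_v1 - f_t1_v1)
    d2 = np.abs(f_t0_v2 - f_t1_v2)
    # Term 1: difference features should agree across augmented views.
    diff_term = barlow_twins_loss(d1, d2, lam)
    # Term 2 (assumed form): features of the two capture times should agree.
    temporal_term = (barlow_twins_loss(f_t0_v1, f_t1_v1, lam)
                     + barlow_twins_loss(f_t0_v2, f_t1_v2, lam))
    return diff_term + alpha * temporal_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = [rng.normal(size=(256, 8)) for _ in range(4)]
    print(differencing_loss(*feats))
```

In practice the features would come from a trainable encoder and the loss would be minimised by gradient descent in an autodiff framework; plain NumPy is used here only to keep the arithmetic visible.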
Related papers
- ZeroSCD: Zero-Shot Street Scene Change Detection [2.3020018305241337]
Scene Change Detection is a challenging task in computer vision and robotics.
Traditional change detection methods rely on training models that take these image pairs as input and estimate the changes.
We propose ZeroSCD, a zero-shot scene change detection framework that eliminates the need for training.
arXiv Detail & Related papers (2024-09-23T17:53:44Z)
- BD-MSA: Body decouple VHR Remote Sensing Image Change Detection method
guided by multi-scale feature information aggregation [4.659935767219465]
The purpose of remote sensing image change detection (RSCD) is to detect differences between bi-temporal images taken at the same place.
Deep learning has been extensively applied to RSCD tasks, yielding strong recognition results.
arXiv Detail & Related papers (2024-01-09T02:53:06Z)
- Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing [58.48890547818074]
We present a powerful modification of Contrastive Denoising Score (CUT) for latent diffusion models (LDM).
Our approach enables zero-shot image-to-image translation and neural field (NeRF) editing, achieving structural correspondence between the input and output.
arXiv Detail & Related papers (2023-11-30T15:06:10Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Deep Metric Learning for Unsupervised Remote Sensing Change Detection [60.89777029184023]
Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs).
The performance of existing RS-CD methods is attributed to training on large annotated datasets.
This paper proposes an unsupervised CD method based on deep metric learning that can deal with both of these issues.
arXiv Detail & Related papers (2023-03-16T17:52:45Z)
- Sketched Multi-view Subspace Learning for Hyperspectral Anomalous Change
Detection [12.719327447589345]
A sketched multi-view subspace learning model is proposed for anomalous change detection.
The proposed model preserves major information from the image pairs and improves computational complexity.
Experiments are conducted on a benchmark hyperspectral remote sensing dataset and a natural hyperspectral dataset.
arXiv Detail & Related papers (2022-10-09T14:08:17Z)
- Semantic-aware Dense Representation Learning for Remote Sensing Image
Change Detection [20.761672725633936]
Training deep learning-based change detection models depends heavily on labeled data.
A recent trend is to use remote sensing (RS) data to obtain in-domain representations via supervised or self-supervised learning (SSL).
We propose dense semantic-aware pre-training for RS image CD via sampling multiple class-balanced points.
arXiv Detail & Related papers (2022-05-27T06:08:33Z)
- Revisiting Consistency Regularization for Semi-supervised Change
Detection in Remote Sensing Images [60.89777029184023]
We propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss.
Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can approach the performance of supervised CD.
arXiv Detail & Related papers (2022-04-18T17:59:01Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision
Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable sensors as teacher modalities and RGB videos as the student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.