Environmental Change Detection: Toward a Practical Task of Scene Change Detection
- URL: http://arxiv.org/abs/2506.11481v1
- Date: Fri, 13 Jun 2025 06:09:43 GMT
- Title: Environmental Change Detection: Toward a Practical Task of Scene Change Detection
- Authors: Kyusik Cho, Suhan Woo, Hongje Seong, Euntai Kim
- Abstract summary: We propose a novel framework that jointly understands spatial environments and detects changes. We address viewpoint misalignment and limited field-of-view coverage by leveraging multiple reference candidates and aggregating semantically rich representations for change detection. We evaluate our framework on three standard benchmark sets reconstructed for ECD and significantly outperform a naive combination of state-of-the-art methods.
- Score: 23.79599379113436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans do not memorize everything. Instead, they recognize scene changes by consulting past images. However, available past (i.e., reference) images typically capture viewpoints near the present (i.e., query) scene rather than the identical view. Despite this practical limitation, conventional Scene Change Detection (SCD) has been formalized under an idealized setting in which reference images with matching viewpoints are available for every query. In this paper, we push this problem toward a practical task and introduce Environmental Change Detection (ECD). A key aspect of ECD is to avoid unrealistically aligned query-reference pairs and to rely solely on environmental cues. Inspired by real-world practices, we provide these cues through a large-scale database of uncurated images. To address this new task, we propose a novel framework that jointly understands spatial environments and detects changes. The main idea is that matching at the same spatial locations between a query and a reference may lead to a suboptimal solution due to viewpoint misalignment and limited field-of-view (FOV) coverage. We address this limitation by leveraging multiple reference candidates and aggregating semantically rich representations for change detection. We evaluate our framework on three standard benchmark sets reconstructed for ECD, significantly outperforming a naive combination of state-of-the-art methods while achieving performance comparable to the oracle setting. The code will be released upon acceptance.
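Since the paper's code is not yet released, the following is a minimal sketch, in PyTorch, of the general idea described in the abstract: retrieve several reference candidates for a query from an uncurated database, then aggregate their features with cross-attention before predicting a change mask. All module names, dimensions, and the retrieval/aggregation details are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): multi-reference aggregation for change detection.
import torch
import torch.nn as nn
import torch.nn.functional as F


def retrieve_candidates(query_desc, db_descs, k=4):
    """Pick the k most similar database images by cosine similarity of
    global descriptors (e.g., from an off-the-shelf place-recognition model)."""
    sims = F.cosine_similarity(query_desc.unsqueeze(0), db_descs, dim=1)
    return sims.topk(k).indices  # indices of the reference candidates


class MultiReferenceChangeHead(nn.Module):
    """Cross-attend query patch features to features from multiple
    reference candidates, then predict a per-pixel change mask."""

    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mask_head = nn.Sequential(
            nn.Conv2d(2 * dim, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, 1, 1),
        )

    def forward(self, q_feat, ref_feats):
        # q_feat: (B, C, H, W) query features; ref_feats: (B, K, C, H, W) candidates
        B, C, H, W = q_feat.shape
        q_tokens = q_feat.flatten(2).transpose(1, 2)          # (B, HW, C)
        r_tokens = ref_feats.flatten(3).permute(0, 1, 3, 2)   # (B, K, HW, C)
        r_tokens = r_tokens.reshape(B, -1, C)                 # (B, K*HW, C)
        agg, _ = self.attn(q_tokens, r_tokens, r_tokens)      # aggregate over all candidates
        agg = self.norm(agg).transpose(1, 2).reshape(B, C, H, W)
        fused = torch.cat([q_feat, agg], dim=1)               # query + aggregated reference
        return self.mask_head(fused)                          # (B, 1, H, W) change logits


# Toy usage with random tensors standing in for backbone features and descriptors.
db = F.normalize(torch.randn(100, 128), dim=1)   # global descriptors of the uncurated database
qd = F.normalize(torch.randn(128), dim=0)
cand_idx = retrieve_candidates(qd, db, k=4)       # which references to aggregate

head = MultiReferenceChangeHead()
q = torch.randn(1, 256, 32, 32)
refs = torch.randn(1, 4, 256, 32, 32)
change_logits = head(q, refs)  # upsample + sigmoid for a change probability map
```

The cross-attention step lets the query pool evidence from whichever candidate views actually cover each region, which is one plausible way to realize the "multiple reference candidates" idea without assuming any single reference shares the query viewpoint.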
Related papers
- VISTA: Monocular Segmentation-Based Mapping for Appearance and View-Invariant Global Localization [0.2356141385409842]
VISTA is a novel open-set, monocular global localization framework. It exploits geometric consistencies between environment maps to align reference frames. We evaluate VISTA on seasonal and oblique-angle aerial datasets, achieving up to a 69% improvement in recall over baseline methods.
arXiv Detail & Related papers (2025-07-15T18:38:35Z) - Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation [15.50183955507315]
We propose a self-supervision-motivated CD framework with geometric estimation, called "MatchCD". The proposed MatchCD framework utilizes the zero-shot capability to optimize the encoder with self-supervised contrastive representation. Unlike conventional change detection, which requires segmenting the full-frame image into small patches, our MatchCD framework can directly process the original large-scale image.
arXiv Detail & Related papers (2025-04-19T14:05:39Z) - Show Me What and Where has Changed? Question Answering and Grounding for Remote Sensing Change Detection [82.65760006883248]
We introduce a new task named Change Detection Question Answering and Grounding (CDQAG).
CDQAG extends the traditional change detection task by providing interpretable textual answers and intuitive visual evidence.
We construct the first CDQAG benchmark dataset, termed QAG-360K, comprising over 360K triplets of questions, textual answers, and corresponding high-quality visual masks.
arXiv Detail & Related papers (2024-10-31T11:20:13Z) - Towards Generalizable Scene Change Detection [4.527270266697462]
Current state-of-the-art Scene Change Detection approaches are unreliable under unseen environments and different temporal conditions. We propose the Generalizable Scene Change Detection Framework (GeSCF) to address unseen-domain performance and temporal consistency. GeSCF achieves an average performance gain of 19.2% on existing SCD datasets and 30.0% on the ChangeVPR dataset, nearly doubling the performance of prior art.
arXiv Detail & Related papers (2024-09-10T04:45:25Z) - Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP. VOP processes co-visible image sections by obtaining patch-level embeddings using a Vision Transformer backbone. Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z) - Are Local Features All You Need for Cross-Domain Visual Place Recognition? [13.519413608607781]
Visual Place Recognition aims to predict the coordinates of an image based solely on visual clues.
Despite recent advances, recognizing the same place when the query comes from a significantly different distribution is still a major hurdle for state-of-the-art retrieval methods.
In this work we explore whether re-ranking methods based on spatial verification can tackle these challenges.
arXiv Detail & Related papers (2023-04-12T14:46:57Z) - Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z) - Robust Change Detection Based on Neural Descriptor Fields [53.111397800478294]
We develop an object-level online change detection approach that is robust to partially overlapping observations and noisy localization results.
By associating objects via shape code similarity and comparing local object-neighbor spatial layout, our proposed approach demonstrates robustness to low observation overlap and localization noises.
arXiv Detail & Related papers (2022-08-01T17:45:36Z) - NovelCraft: A Dataset for Novelty Detection and Discovery in Open Worlds [14.265615838391703]
The NovelCraft dataset contains episodic data of the images and symbolic world-states seen by an agent completing a pogo stick assembly task within a modified Minecraft environment.
Our visual novelty detection benchmark finds that methods that rank best on popular area-under-the-curve metrics may be outperformed by simpler alternatives.
Further multimodal novelty detection experiments suggest that methods that fuse both visual and symbolic information can improve time until detection as well as overall discrimination.
arXiv Detail & Related papers (2022-06-23T14:31:33Z) - Seeking Similarities over Differences: Similarity-based Domain Alignment for Adaptive Object Detection [86.98573522894961]
We propose a framework that generalizes the components commonly used by Unsupervised Domain Adaptation (UDA) algorithms for detection.
Specifically, we propose a novel UDA algorithm, ViSGA, that leverages the best design choices and introduces a simple but effective method to aggregate features at the instance level.
We show that both similarity-based grouping and adversarial training allow our model to focus on coarsely aligning feature groups, without being forced to match all instances across loosely aligned domains.
arXiv Detail & Related papers (2021-10-04T13:09:56Z) - Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z) - VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval [19.239311087570318]
Cross-view image geo-localization aims to determine the locations of street-view query images by matching them with GPS-tagged reference images from aerial views.
Recent works have achieved surprisingly high retrieval accuracy on city-scale datasets.
We propose a new large-scale benchmark -- VIGOR -- for cross-View Image Geo-localization beyond One-to-one Retrieval.
arXiv Detail & Related papers (2020-11-24T15:50:54Z) - Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation [169.82760468633236]
We propose to build the pixel-level cycle association between source and target pixel pairs.
Our method can be trained end-to-end in one stage and introduces no additional parameters.
arXiv Detail & Related papers (2020-10-31T00:11:36Z)