ViewDelta: Text-Prompted Change Detection in Unaligned Images
- URL: http://arxiv.org/abs/2412.07612v1
- Date: Tue, 10 Dec 2024 15:51:17 GMT
- Title: ViewDelta: Text-Prompted Change Detection in Unaligned Images
- Authors: Subin Varghese, Joshua Gao, Vedhus Hoskere
- Abstract: Detecting changes between images is a fundamental problem in computer vision with broad applications in situational awareness, infrastructure assessment, environmental monitoring, and industrial automation. Existing supervised models are typically limited to detecting specific types of changes, necessitating retraining for new tasks. To address these limitations with a single approach, we propose a novel change detection method that is the first to utilize unaligned images and textual prompts to output a binary segmentation of changes relevant to user-provided text. Our architecture not only enables flexible detection across diverse change detection use cases, but also yields state-of-the-art performance on established benchmarks. Additionally, we release an accompanying dataset comprising 100,311 pairs of images with text prompts and the corresponding change detection labels. We demonstrate the effectiveness of our method both quantitatively and qualitatively on datasets with a wide variety of viewpoints in indoor, outdoor, street-level, synthetic, and satellite images.
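The abstract describes an interface that takes two (possibly unaligned) images plus a text prompt and returns a binary change mask. The actual ViewDelta model and API are not described in this summary; the sketch below is purely hypothetical, illustrating only the expected inputs and outputs, with a naive per-pixel difference as a stand-in that ignores the prompt and assumes same-sized images.

```python
import numpy as np

def text_prompted_change_mask(image_a, image_b, prompt, model=None):
    """Hypothetical sketch of a text-prompted change detection call.

    image_a, image_b: H x W x 3 uint8 arrays. The paper's method does not
    require alignment; this toy fallback does assume equal shapes.
    prompt: free-form text describing which changes are of interest.
    Returns an H x W boolean mask of prompt-relevant changes.
    """
    if model is not None:
        # A real model would fuse both images together with the text prompt.
        return model(image_a, image_b, prompt)
    # Naive placeholder: per-pixel intensity difference; the prompt is ignored.
    diff = np.abs(image_a.astype(np.int16) - image_b.astype(np.int16)).sum(axis=-1)
    return diff > 30

# Toy usage with synthetic images.
a = np.zeros((8, 8, 3), dtype=np.uint8)
b = a.copy()
b[2:4, 5:7] = 255  # simulate a newly appeared object
mask = text_prompted_change_mask(a, b, "a new object appeared")
```

The binary-mask output matches the task framing in the abstract: a segmentation of only those changes relevant to the user's text, rather than all pixel differences.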
Related papers
- Show Me What and Where has Changed? Question Answering and Grounding for Remote Sensing Change Detection [82.65760006883248]
We introduce a new task named Change Detection Question Answering and Grounding (CDQAG)
CDQAG extends the traditional change detection task by providing interpretable textual answers and intuitive visual evidence.
We construct the first CDQAG benchmark dataset, termed QAG-360K, comprising over 360K triplets of questions, textual answers, and corresponding high-quality visual masks.
arXiv Detail & Related papers (2024-10-31T11:20:13Z) - ZeroSCD: Zero-Shot Street Scene Change Detection [2.3020018305241337]
Scene Change Detection is a challenging task in computer vision and robotics.
Traditional change detection methods rely on training models that take these image pairs as input and estimate the changes.
We propose ZeroSCD, a zero-shot scene change detection framework that eliminates the need for training.
arXiv Detail & Related papers (2024-09-23T17:53:44Z) - Zero-Shot Scene Change Detection [14.095215136905553]
Our method takes advantage of the change detection effect of the tracking model by inputting reference and query images instead of consecutive frames.
We extend our approach to video, leveraging rich temporal information to enhance the performance of scene change detection.
arXiv Detail & Related papers (2024-06-17T05:03:44Z) - Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates comparable performance with current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z) - Self-Pair: Synthesizing Changes from Single Source for Object Change Detection in Remote Sensing Imagery [6.586756080460231]
We train a change detector using two spatially unrelated images with corresponding semantic labels, such as buildings.
We show that manipulating the source image as an after-image is crucial to the performance of change detection.
Our method outperforms existing methods based on single-temporal supervision.
arXiv Detail & Related papers (2022-12-20T13:26:42Z) - The Change You Want to See [91.3755431537592]
Given two images of the same scene, being able to automatically detect the changes in them has practical applications in a variety of domains.
We tackle the change detection problem with the goal of detecting "object-level" changes in an image pair despite differences in their viewpoint and illumination.
arXiv Detail & Related papers (2022-09-28T18:10:09Z) - Simple Open-Vocabulary Object Detection with Vision Transformers [51.57562920090721]
We propose a strong recipe for transferring image-text models to open-vocabulary object detection.
We use a standard Vision Transformer architecture with minimal modifications, contrastive image-text pre-training, and end-to-end detection fine-tuning.
We provide the adaptation strategies and regularizations needed to attain very strong performance on zero-shot text-conditioned and one-shot image-conditioned object detection.
arXiv Detail & Related papers (2022-05-12T17:20:36Z) - ObjectFormer for Image Manipulation Detection and Localization [118.89882740099137]
We propose ObjectFormer to detect and localize image manipulations.
We extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings.
We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-03-28T12:27:34Z) - Frugal Learning of Virtual Exemplars for Label-Efficient Satellite Image Change Detection [12.18340575383456]
In this paper, we devise a novel interactive satellite image change detection algorithm based on active learning.
The proposed framework is iterative and relies on a question and answer model which asks the oracle (user) questions about the most informative display.
The contribution of our framework resides in a novel display model which selects the most representative and diverse virtual exemplars.
arXiv Detail & Related papers (2022-03-22T09:29:42Z) - Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z) - Unsupervised Change Detection in Satellite Images with Generative Adversarial Network [20.81970476609318]
We propose a novel change detection framework that uses a Generative Adversarial Network (GAN) to generate better coregistered images.
The optimized GAN model produces better coregistered images in which changes can be easily spotted, and the change map is then obtained through a comparison strategy.
arXiv Detail & Related papers (2020-09-08T10:26:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.