The Change You Want to See
- URL: http://arxiv.org/abs/2209.14341v1
- Date: Wed, 28 Sep 2022 18:10:09 GMT
- Title: The Change You Want to See
- Authors: Ragav Sachdeva, Andrew Zisserman
- Abstract summary: Given two images of the same scene, being able to automatically detect the changes in them has practical applications in a variety of domains.
We tackle the change detection problem with the goal of detecting "object-level" changes in an image pair despite differences in their viewpoint and illumination.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We live in a dynamic world where things change all the time. Given two images
of the same scene, being able to automatically detect the changes in them has
practical applications in a variety of domains. In this paper, we tackle the
change detection problem with the goal of detecting "object-level" changes in
an image pair despite differences in their viewpoint and illumination. To this
end, we make the following four contributions: (i) we propose a scalable
methodology for obtaining a large-scale change detection training dataset by
leveraging existing object segmentation benchmarks; (ii) we introduce a
novel co-attention-based architecture that is able to implicitly determine
correspondences between an image pair and find changes in the form of bounding
box predictions; (iii) we contribute four evaluation datasets that cover a
variety of domains and transformations, including synthetic image changes, real
surveillance images of a 3D scene, and synthetic 3D scenes with camera motion;
(iv) we evaluate our model on these four datasets and demonstrate zero-shot and
beyond training transformation generalization.
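The co-attention idea in contribution (ii) — letting each spatial feature of one image attend over all features of the other to establish implicit correspondences — can be sketched in a few lines. This is a minimal, illustrative numpy version under assumed shapes; the function name and the concatenation-based change head are not the paper's actual implementation.

```python
import numpy as np

def co_attend(feat_a, feat_b):
    """Soft cross-image attention: every spatial feature of image A
    attends over all spatial features of image B.

    feat_a: (N, D) flattened features of image A
    feat_b: (M, D) flattened features of image B
    returns: (N, 2*D) A-features concatenated with B-features aligned to A
    """
    scores = feat_a @ feat_b.T / np.sqrt(feat_a.shape[1])  # (N, M) similarities
    scores -= scores.max(axis=1, keepdims=True)            # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)                # softmax over B locations
    aligned_b = attn @ feat_b                              # (N, D) soft correspondences
    # A downstream head would compare feat_a with aligned_b to localize changes.
    return np.concatenate([feat_a, aligned_b], axis=1)

rng = np.random.default_rng(0)
fa, fb = rng.normal(size=(16, 8)), rng.normal(size=(20, 8))
out = co_attend(fa, fb)
print(out.shape)  # (16, 16)
```

Because the attention is computed between the two images rather than within one, the model can align features despite viewpoint differences before deciding what has changed.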
Related papers
- ViewDelta: Text-Prompted Change Detection in Unaligned Images [0.0]
We propose a novel change detection method that is the first to utilize unaligned images and textual prompts to output a binary segmentation of changes relevant to user-provided text.
Our architecture not only enables flexible detection across diverse change detection use cases, but also yields state-of-the-art performance on established benchmarks.
arXiv Detail & Related papers (2024-12-10T15:51:17Z)
- Multi-View Pose-Agnostic Change Localization with Zero Labels [4.997375878454274]
We propose a label-free, pose-agnostic change detection method that integrates information from multiple viewpoints.
With as few as 5 images of the post-change scene, our approach can learn additional change channels in a 3D Gaussian Splatting (3DGS) representation.
Our change-aware 3D scene representation additionally enables the generation of accurate change masks for unseen viewpoints.
arXiv Detail & Related papers (2024-12-05T06:28:54Z)
- Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms [27.882122236282054]
We present a novel method for scene change detection that leverages the robust feature extraction capabilities of a visual foundation model, DINOv2.
We evaluate our approach on two benchmark datasets, VL-CMU-CD and PSCD, along with their viewpoint-varied versions.
Our experiments demonstrate significant improvements in F1-score, particularly in scenarios involving geometric changes between image pairs.
arXiv Detail & Related papers (2024-09-25T11:55:27Z)
- Zero-Shot Scene Change Detection [14.095215136905553]
Our method takes advantage of the change detection effect of the tracking model by inputting reference and query images instead of consecutive frames.
We extend our approach to video, leveraging rich temporal information to enhance the performance of scene change detection.
arXiv Detail & Related papers (2024-06-17T05:03:44Z)
- Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers [48.74331852418905]
Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model.
Due to this task's complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging.
We introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers.
arXiv Detail & Related papers (2024-03-11T10:48:56Z)
- ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes [64.57705752579207]
We evaluate the resilience of vision-based models against diverse object-to-background context variations.
We harness the generative capabilities of text-to-image, image-to-text, and image-to-segment models to automatically generate object-to-background changes.
arXiv Detail & Related papers (2024-03-07T17:48:48Z)
- Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving [73.3702076688159]
We propose a novel contrastive learning algorithm, Cohere3D, to learn coherent instance representations in a long-term input sequence.
We evaluate our algorithm by finetuning the pretrained model on various downstream perception, prediction, and planning tasks.
arXiv Detail & Related papers (2024-02-23T19:43:01Z)
- The Change You Want to See (Now in 3D) [65.61789642291636]
The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene.
We contribute a change detection model that is trained entirely on synthetic data and is class-agnostic.
We release a new evaluation dataset consisting of real-world image pairs with human-annotated differences.
arXiv Detail & Related papers (2023-08-21T01:59:45Z)
- Learning Transformations To Reduce the Geometric Shift in Object Detection [60.20931827772482]
We tackle geometric shifts emerging from variations in the image capture process.
We introduce a self-training approach that learns a set of geometric transformations to minimize these shifts.
We evaluate our method on two different shifts, i.e., a camera's field of view (FoV) change and a viewpoint change.
arXiv Detail & Related papers (2023-01-13T11:55:30Z)
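The geometric shifts discussed in the last entry above (field-of-view and viewpoint changes) are commonly modeled as planar homographies. As a generic illustration only — not the paper's learned transformation set — the sketch below warps 2D points with a 3x3 homography:

```python
import numpy as np

def apply_homography(points, H):
    """Warp 2D points with a 3x3 homography, one standard way to model
    viewpoint and FoV shifts between two views of a planar scene."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # to homogeneous coords
    warped = pts @ H.T                                    # apply projective map
    return warped[:, :2] / warped[:, 2:3]                 # back to Euclidean

corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
zoom = np.diag([2.0, 2.0, 1.0])  # a pure scaling roughly mimics an FoV change
print(apply_homography(corners, zoom))  # each corner scaled by 2
```

Self-training approaches of this kind would estimate such a transformation from the data and apply it before (or inside) the detector to cancel the shift.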
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.