The Change You Want to See
- URL: http://arxiv.org/abs/2209.14341v1
- Date: Wed, 28 Sep 2022 18:10:09 GMT
- Title: The Change You Want to See
- Authors: Ragav Sachdeva, Andrew Zisserman
- Abstract summary: Given two images of the same scene, being able to automatically detect the changes in them has practical applications in a variety of domains.
We tackle the change detection problem with the goal of detecting "object-level" changes in an image pair despite differences in their viewpoint and illumination.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We live in a dynamic world where things change all the time. Given two images
of the same scene, being able to automatically detect the changes in them has
practical applications in a variety of domains. In this paper, we tackle the
change detection problem with the goal of detecting "object-level" changes in
an image pair despite differences in their viewpoint and illumination. To this
end, we make the following four contributions: (i) we propose a scalable
methodology for obtaining a large-scale change detection training dataset by
leveraging existing object segmentation benchmarks; (ii) we introduce a
co-attention based novel architecture that is able to implicitly determine
correspondences between an image pair and find changes in the form of bounding
box predictions; (iii) we contribute four evaluation datasets that cover a
variety of domains and transformations, including synthetic image changes, real
surveillance images of a 3D scene, and synthetic 3D scenes with camera motion;
(iv) we evaluate our model on these four datasets and demonstrate zero-shot and
beyond training transformation generalization.
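As a rough illustration of contribution (ii), the sketch below shows one way a co-attention block could let the features of each image attend over the other, so correspondences between the pair are resolved implicitly before a detection head predicts change bounding boxes. This is a minimal, assumed PyTorch sketch, not the authors' released architecture; the class name `CoAttention`, the feature dimensions, and the shared attention weights are illustrative choices only.

```python
# Hedged sketch of a co-attention block for an image pair (illustrative only).
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # One multi-head attention module applied in both directions (a shared-weight
        # choice made here for brevity; a real model might use separate modules).
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats_a: torch.Tensor, feats_b: torch.Tensor):
        # feats_a, feats_b: (B, N, dim) flattened patch features of the two images.
        # Each image queries the other, so unchanged regions find strong
        # correspondences while changed regions attend to nothing consistent.
        a_from_b, _ = self.attn(query=feats_a, key=feats_b, value=feats_b)
        b_from_a, _ = self.attn(query=feats_b, key=feats_a, value=feats_a)
        return self.norm(feats_a + a_from_b), self.norm(feats_b + b_from_a)

if __name__ == "__main__":
    f1 = torch.randn(2, 196, 256)   # e.g. 14x14 feature maps, flattened
    f2 = torch.randn(2, 196, 256)
    g1, g2 = CoAttention()(f1, f2)
    print(g1.shape, g2.shape)       # torch.Size([2, 196, 256]) twice
```

In a full model, the co-attended features of each image would then feed a detection head that outputs the bounding boxes of changed regions, as described in the abstract.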
Related papers
- Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms [27.882122236282054]
We present a novel method for scene change detection that leverages the robust feature extraction capabilities of a visual foundational model, DINOv2.
We evaluate our approach on two benchmark datasets, VL-CMU-CD and PSCD, along with their viewpoint-varied versions.
Our experiments demonstrate significant improvements in F1-score, particularly in scenarios involving geometric changes between image pairs.
arXiv Detail & Related papers (2024-09-25T11:55:27Z)
- Zero-Shot Scene Change Detection [14.095215136905553]
Our method takes advantage of the change detection effect of the tracking model by inputting reference and query images instead of consecutive frames.
We extend our approach to video to exploit rich temporal information, enhancing scene change detection performance.
arXiv Detail & Related papers (2024-06-17T05:03:44Z)
- Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers [50.576354045312115]
Direct image-to-graph transformation is a challenging task that solves object detection and relationship prediction in a single model.
We introduce a set of methods enabling cross-domain and cross-dimension transfer learning for image-to-graph transformers.
We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we pretrain our models on 2D satellite images before applying them to vastly different target domains in 2D and 3D.
arXiv Detail & Related papers (2024-03-11T10:48:56Z)
- ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes [64.57705752579207]
We evaluate the resilience of vision-based models against diverse object-to-background context variations.
We harness the generative capabilities of text-to-image, image-to-text, and image-to-segment models to automatically generate object-to-background changes.
arXiv Detail & Related papers (2024-03-07T17:48:48Z)
- Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving [73.3702076688159]
We propose a novel contrastive learning algorithm, Cohere3D, to learn coherent instance representations in a long-term input sequence.
We evaluate our algorithm by finetuning the pretrained model on various downstream perception, prediction, and planning tasks.
arXiv Detail & Related papers (2024-02-23T19:43:01Z)
- Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments [20.890476387720483]
MoRE is a novel approach for multi-object relocalization and reconstruction in evolving environments.
We view these environments as "living scenes" and consider the problem of transforming scans taken at different points in time into a 3D reconstruction of the object instances.
arXiv Detail & Related papers (2023-12-14T17:09:57Z)
- The Change You Want to See (Now in 3D) [65.61789642291636]
The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene.
We contribute a change detection model that is trained entirely on synthetic data and is class-agnostic.
We release a new evaluation dataset consisting of real-world image pairs with human-annotated differences.
arXiv Detail & Related papers (2023-08-21T01:59:45Z)
- Learning Transformations To Reduce the Geometric Shift in Object Detection [60.20931827772482]
We tackle geometric shifts emerging from variations in the image capture process.
We introduce a self-training approach that learns a set of geometric transformations to minimize these shifts.
We evaluate our method on two different shifts, i.e., a camera's field of view (FoV) change and a viewpoint change.
arXiv Detail & Related papers (2023-01-13T11:55:30Z)
- Supervising Remote Sensing Change Detection Models with 3D Surface Semantics [1.8782750537161614]
We propose Contrastive Surface-Image Pretraining (CSIP) for joint learning using optical RGB and above ground level (AGL) map pairs.
We then evaluate these pretrained models on several building segmentation and change detection datasets to show that our method does, in fact, extract features relevant to downstream applications.
arXiv Detail & Related papers (2022-02-26T23:35:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all listed summaries) and is not responsible for any consequences of its use.