TINYCD: A (Not So) Deep Learning Model For Change Detection
- URL: http://arxiv.org/abs/2207.13159v1
- Date: Tue, 26 Jul 2022 19:28:48 GMT
- Title: TINYCD: A (Not So) Deep Learning Model For Change Detection
- Authors: Andrea Codegoni, Gabriele Lombardi and Alessandro Ferrari
- Abstract summary: The aim of change detection (CD) is to detect changes that have occurred in the same area by comparing two images of that place taken at different times.
Recent developments in the field of deep learning enabled researchers to achieve outstanding performance in this area.
We propose a novel model, called TinyCD, which proves to be both lightweight and effective.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The aim of change detection (CD) is to detect changes that have occurred in the same area by comparing two images of that place taken at different times. The challenging part of CD is to keep track of the changes the user wants to highlight, such as new buildings, while ignoring changes due to external factors such as environment, lighting conditions, fog, or the seasons. Recent developments in the field of deep learning have enabled researchers to achieve outstanding performance in this area. In particular, different mechanisms of space-time attention make it possible to exploit the spatial features extracted by the models and to correlate them temporally across the two available images. The downside is that the models have become increasingly complex and large, often unfeasible for edge applications. These are limitations when the models must be deployed in industrial settings or in applications requiring real-time performance. In this work we propose a novel model, called TinyCD, which proves to be both lightweight and effective, achieving performance comparable or even superior to the current state of the art with 13-150X fewer parameters. Our approach exploits the importance of low-level features when comparing images: we use only a few backbone blocks, a strategy that keeps the number of network parameters low. To compose the features extracted from the two images, we introduce a novel mixing block, economical in its parameter count, capable of cross-correlating features in both the space and time domains. Finally, to fully exploit the information contained in the computed features, we define a PW-MLP block that performs pixel-wise classification. Source code, models and results are available here:
https://github.com/AndreaCodegoni/Tiny_model_4_CD
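To make the pipeline concrete, below is a minimal PyTorch sketch of the three-stage design the abstract describes: a shallow Siamese backbone of only a few blocks, a parameter-cheap mixing block that interleaves the bitemporal features and mixes them with a grouped convolution, and a PW-MLP head built from 1x1 convolutions. Module names, layer sizes, and the exact mixing strategy here are illustrative assumptions, not the authors' implementation; the real code is in the repository linked above.

```python
# Minimal sketch of a TinyCD-style pipeline. All layer choices are
# illustrative assumptions, not the authors' actual architecture.
import torch
import torch.nn as nn


class MixingBlock(nn.Module):
    """Interleave the two temporal feature maps channel-wise, then mix them
    with a grouped convolution so each filter sees both time steps."""
    def __init__(self, channels: int):
        super().__init__()
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=3,
                             padding=1, groups=channels)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) x 2 -> (B, C, 2, H, W) -> interleaved (B, 2C, H, W)
        mixed = torch.stack((f1, f2), dim=2).flatten(1, 2)
        return self.mix(mixed)


class PWMLP(nn.Module):
    """Pixel-wise MLP: 1x1 convolutions act as a shared MLP per pixel."""
    def __init__(self, channels: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, kernel_size=1),  # one change logit per pixel
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class TinyCDSketch(nn.Module):
    def __init__(self, channels: int = 24):
        super().__init__()
        # Toy stand-in for "only a few backbone blocks" (low-level features).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.mixing = MixingBlock(channels)
        self.classifier = PWMLP(channels)

    def forward(self, img_t1: torch.Tensor, img_t2: torch.Tensor) -> torch.Tensor:
        # Siamese encoding: the same shallow backbone processes both images.
        f1, f2 = self.backbone(img_t1), self.backbone(img_t2)
        change = self.mixing(f1, f2)    # cross-correlate space and time
        return self.classifier(change)  # per-pixel change logits


model = TinyCDSketch()
t1 = torch.randn(1, 3, 256, 256)
t2 = torch.randn(1, 3, 256, 256)
print(model(t1, t2).shape)  # torch.Size([1, 1, 256, 256])
```

Note that the 1x1 convolutions in the head are equivalent to applying the same small MLP independently at every pixel, which is what makes the classification pixel-wise while keeping the parameter count tiny.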
Related papers
- Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model [62.337749660637755]
We present change data generators based on generative models which are cheap and automatic.
Changen2 is a generative change foundation model that can be trained at scale via self-supervision.
The resulting model possesses inherent zero-shot change detection capabilities and excellent transferability.
arXiv Detail & Related papers (2024-06-26T01:03:39Z)
- Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion [35.88039888482076]
We introduce the first Differentiable Augmentation Search method (DAS) to generate variations of images that can be processed as videos.
DAS is extremely fast and flexible, allowing the search on very large search spaces in less than a GPU day.
We leverage DAS to guide the reshaping of the spatial receptive field by selecting task-dependent transformations.
arXiv Detail & Related papers (2024-03-22T13:27:57Z)
- One-Step Image Translation with Text-to-Image Models [35.0987002313882]
We introduce a general method for adapting a single-step diffusion model to new tasks and domains through adversarial learning objectives.
We consolidate various modules of the vanilla latent diffusion model into a single end-to-end generator network with small trainable weights.
Our model CycleGAN-Turbo outperforms existing GAN-based and diffusion-based methods for various scene translation tasks.
arXiv Detail & Related papers (2024-03-18T17:59:40Z)
- TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images [64.63004710817239]
We propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD.
It improves feature extraction from a global view and combines multi-level visual features in a pyramid manner.
Our proposed method achieves a new state-of-the-art performance on four optical and two SAR image CD benchmarks.
arXiv Detail & Related papers (2023-10-22T07:42:19Z)
- The Change You Want to See (Now in 3D) [65.61789642291636]
The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene.
We contribute a change detection model that is trained entirely on synthetic data and is class-agnostic.
We release a new evaluation dataset consisting of real-world image pairs with human-annotated differences.
arXiv Detail & Related papers (2023-08-21T01:59:45Z)
- Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation [52.923298434948606]
Low-light conditions not only hamper human visual experience but also degrade the model's performance on downstream vision tasks.
This paper tackles a more complicated scenario with broader applicability, i.e., zero-shot day-night domain adaptation.
We propose a similarity min-max paradigm that considers them under a unified framework.
arXiv Detail & Related papers (2023-07-17T18:50:15Z)
- Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers [0.11219061154635457]
Whole-Slide Imaging allows for the capture and digitization of high-resolution images of histological specimens.
The transformer architecture has been proposed as a possible candidate for effectively leveraging this high-resolution information.
We propose a novel cascaded cross-attention network (CCAN) based on the cross-attention mechanism that scales linearly with the number of extracted patches.
arXiv Detail & Related papers (2023-05-11T16:42:24Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
Applied to recent transformer-based image recognition models, the approach shows a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Efficient Transformer based Method for Remote Sensing Image Change Detection [17.553240434628087]
High-resolution remote sensing CD remains challenging due to the complexity of objects in the scene.
We propose a bitemporal image transformer (BiT) to efficiently and effectively model contexts within the spatial-temporal domain.
The BiT-based model significantly outperforms the purely convolutional baseline while requiring roughly three times fewer computations and model parameters.
arXiv Detail & Related papers (2021-02-27T13:08:46Z)
- Looking for change? Roll the Dice and demand Attention [0.0]
We propose a reliable deep learning framework for the task of semantic change detection in high-resolution aerial images.
Our framework consists of a new loss function, new attention modules, new feature extraction building blocks, and a new backbone architecture.
We validate our approach by showing excellent performance and achieving state-of-the-art scores (F1 and Intersection over Union, hereafter IoU) on two building change detection datasets.
arXiv Detail & Related papers (2020-09-04T08:30:25Z)