End-to-End Trainable Deep Active Contour Models for Automated Image
Segmentation: Delineating Buildings in Aerial Imagery
- URL: http://arxiv.org/abs/2007.11691v1
- Date: Wed, 22 Jul 2020 21:27:17 GMT
- Title: End-to-End Trainable Deep Active Contour Models for Automated Image
Segmentation: Delineating Buildings in Aerial Imagery
- Authors: Ali Hatamizadeh, Debleena Sengupta, Demetri Terzopoulos
- Abstract summary: Trainable Deep Active Contours (TDAC) is an automatic image segmentation framework that unites Convolutional Neural Networks (CNNs) and Active Contour Models (ACMs).
TDAC yields fast, accurate, and fully automatic simultaneous delineation of arbitrarily many buildings in the image.
- Score: 12.442780294349049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The automated segmentation of buildings in remote sensing imagery is a
challenging task that requires the accurate delineation of multiple building
instances over typically large image areas. Manual methods are often laborious
and current deep-learning-based approaches fail to delineate all building
instances and do so with adequate accuracy. As a solution, we present Trainable
Deep Active Contours (TDACs), an automatic image segmentation framework that
intimately unites Convolutional Neural Networks (CNNs) and Active Contour
Models (ACMs). The Eulerian energy functional of the ACM component includes
per-pixel parameter maps that are predicted by the backbone CNN, which also
initializes the ACM. Importantly, both the ACM and CNN components are fully
implemented in TensorFlow and the entire TDAC architecture is end-to-end
automatically differentiable and backpropagation trainable without user
intervention. TDAC yields fast, accurate, and fully automatic simultaneous
delineation of arbitrarily many buildings in the image. We validate the model
on two publicly available aerial image datasets for building segmentation, and
our results demonstrate that TDAC establishes a new state-of-the-art
performance.
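As a rough illustration of the idea behind TDAC (not the authors' TensorFlow implementation), the sketch below shows a single explicit update of a Chan-Vese-style level-set active contour in which the data-term weights are per-pixel parameter maps, as a backbone CNN might predict them. The function name, the smooth-Heaviside formulation, and the specific energy terms are assumptions for this sketch.

```python
import numpy as np

def levelset_step(phi, lambda1, lambda2, image, dt=0.1, eps=1.0):
    """One explicit gradient-descent update of a Chan-Vese-style level set.

    phi:              level-set function (H, W); contour is its zero set
    lambda1, lambda2: per-pixel parameter maps (H, W), e.g. CNN-predicted
    image:            grayscale image (H, W)
    """
    # Smooth Heaviside of phi and its derivative (smoothed delta function).
    H = 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))
    delta = (eps / np.pi) / (eps**2 + phi**2)
    # Mean intensities inside (c1) and outside (c2) the contour.
    c1 = (image * H).sum() / (H.sum() + 1e-8)
    c2 = (image * (1.0 - H)).sum() / ((1.0 - H).sum() + 1e-8)
    # Curvature regularizer: divergence of the normalized gradient of phi.
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx**2 + gy**2) + 1e-8
    curv = np.gradient(gx / norm, axis=1) + np.gradient(gy / norm, axis=0)
    # Per-pixel lambda maps let the data terms act differently at each pixel.
    force = delta * (curv
                     - lambda1 * (image - c1)**2
                     + lambda2 * (image - c2)**2)
    return phi + dt * force
```

Because every operation above is differentiable, the same update written in TensorFlow can sit on top of a CNN that predicts `lambda1`, `lambda2`, and the initial `phi`, making the whole pipeline trainable by backpropagation, which is the core of the TDAC design.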
Related papers
- RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model [22.56227565913003]
We propose a comprehensive remote sensing image building model, termed RSBuilding, developed from the perspective of the foundation model.
RSBuilding is designed to enhance cross-scene generalization and task understanding.
Our model was trained on a dataset comprising up to 245,000 images and validated on multiple building extraction and change detection datasets.
arXiv Detail & Related papers (2024-03-12T11:51:59Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expressive ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z)
- Efficient Semantic Segmentation on Edge Devices [7.5562201794440185]
This project analyzes current semantic segmentation models to explore the feasibility of applying these models for emergency response during catastrophic events.
We compare the performance of real-time semantic segmentation models with their non-real-time counterparts on aerial images captured under adverse conditions.
Furthermore, we train several models on the Flood-Net dataset, containing UAV images captured after Hurricane Harvey, and benchmark their execution on special classes such as flooded buildings vs. non-flooded buildings or flooded roads vs. non-flooded roads.
arXiv Detail & Related papers (2022-12-28T04:13:11Z)
- CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training [112.96224800952724]
We propose cascaded modulation GAN (CM-GAN) to generate plausible image structures when dealing with large holes in complex images.
In each decoder block, global modulation is first applied to perform coarse, semantic-aware structure synthesis; spatial modulation is then applied to the output of the global modulation to further adjust the feature map in a spatially adaptive fashion.
In addition, we design an object-aware training scheme to prevent the network from hallucinating new objects inside holes, fulfilling the needs of object removal tasks in real-world scenarios.
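The global-then-spatial cascade described in this abstract might be sketched, in simplified NumPy form, as follows. The function name, tensor shapes, and the use of plain multiplicative gains (rather than the learned affine modulations a real GAN decoder would use) are assumptions made for illustration.

```python
import numpy as np

def cascaded_modulation(feature, global_gain, spatial_gain):
    """Toy sketch of one cascaded-modulation step in a decoder block.

    feature:      feature map of shape (C, H, W)
    global_gain:  per-channel gains of shape (C,), derived from a global code
    spatial_gain: per-pixel gains of shape (H, W)
    """
    # Global modulation: one gain per channel, applied uniformly over space,
    # standing in for coarse, semantic-aware structure synthesis.
    g = feature * global_gain[:, None, None]
    # Spatial modulation: per-pixel gains refine the globally modulated
    # features in a spatially adaptive fashion.
    return g * spatial_gain[None, :, :]
```

The point of the cascade is the ordering: the global pass fixes coarse structure from image-level context, and the spatial pass then corrects it locally, which matters near hole boundaries in inpainting.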
arXiv Detail & Related papers (2022-03-22T16:13:27Z)
- DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation [99.88539409432916]
We study the unsupervised domain adaptation (UDA) process.
We propose a novel UDA method, DAFormer, based on the benchmark results.
DAFormer significantly improves the state-of-the-art performance by 10.8 mIoU for GTA->Cityscapes and 5.4 mIoU for Synthia->Cityscapes.
arXiv Detail & Related papers (2021-11-29T19:00:46Z)
- Contextual Pyramid Attention Network for Building Segmentation in Aerial Imagery [12.241693880896348]
Building extraction from aerial images has several applications in problems such as urban planning, change detection, and disaster management.
We propose to improve the segmentation of buildings of different sizes by capturing long-range dependencies using contextual pyramid attention (CPA).
Our method improves on current state-of-the-art methods by 1.8 points and on existing baselines by 12.6 points, without any post-processing.
arXiv Detail & Related papers (2020-04-15T11:36:26Z)
- Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation [34.72650269503811]
We propose a simple yet effective model generation strategy to form a sequence of models that only requires a set of specific-effect label images.
We show that the proposed method is capable of producing a series of continuous models and achieves better performance than that of several state-of-the-art methods for image smoothing.
arXiv Detail & Related papers (2020-01-12T04:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.