UV-SAM: Adapting Segment Anything Model for Urban Village Identification
- URL: http://arxiv.org/abs/2401.08083v2
- Date: Thu, 1 Feb 2024 08:05:26 GMT
- Title: UV-SAM: Adapting Segment Anything Model for Urban Village Identification
- Authors: Xin Zhang, Yu Liu, Yuming Lin, Qingmin Liao, Yong Li
- Abstract summary: Governments heavily depend on field survey methods to monitor the urban villages.
To accurately identify urban village boundaries from satellite images, we adapt the Segment Anything Model (SAM) to urban village segmentation, named UV-SAM.
UV-SAM first leverages a small-sized semantic segmentation model to produce mixed prompts for urban villages, including mask, bounding box, and image representations, which are then fed into SAM for fine-grained boundary identification.
- Score: 25.286722125746902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Urban villages, defined as informal residential areas in or around urban
centers, are characterized by inadequate infrastructures and poor living
conditions, closely related to the Sustainable Development Goals (SDGs) on
poverty, adequate housing, and sustainable cities. Traditionally, governments
heavily depend on field survey methods to monitor the urban villages, which
however are time-consuming, labor-intensive, and possibly delayed. Thanks to
widely available and timely updated satellite images, recent studies develop
computer vision techniques to detect urban villages efficiently. However,
existing studies either focus on simple urban village image classification or
fail to provide accurate boundary information. To accurately identify urban
village boundaries from satellite images, we harness the power of the vision
foundation model and adapt the Segment Anything Model (SAM) to urban village
segmentation, named UV-SAM. Specifically, UV-SAM first leverages a small-sized
semantic segmentation model to produce mixed prompts for urban villages,
including mask, bounding box, and image representations, which are then fed
into SAM for fine-grained boundary identification. Extensive experimental
results on two datasets in China demonstrate that UV-SAM outperforms existing
baselines, and identification results over multiple years show that both the
number and area of urban villages are decreasing over time, providing deeper
insights into the development trends of urban villages and shedding light on the
vision foundation models for sustainable cities. The dataset and codes of this
study are available at https://github.com/tsinghua-fib-lab/UV-SAM.
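The prompt-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes the small segmentation model outputs a 2-D array of per-pixel logits, and it derives only the mask and bounding-box prompts; the image-representation prompt is omitted. The names `mask_to_box` and `build_prompts` and the zero threshold are hypothetical choices made for illustration.

```python
import numpy as np


def mask_to_box(mask: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) around the True pixels of a
    binary mask, or None when the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def build_prompts(coarse_logits: np.ndarray, threshold: float = 0.0):
    """Turn coarse per-pixel logits into a mask prompt and a box prompt.

    Assumption: logits above `threshold` mark urban-village pixels.
    """
    mask = coarse_logits > threshold
    return {"mask": mask, "box": mask_to_box(mask)}


# Toy 6x6 logit map with one rectangular "urban village" blob
# covering rows 2-3 and columns 1-4.
logits = np.full((6, 6), -1.0)
logits[2:4, 1:5] = 2.0

prompts = build_prompts(logits)
print(prompts["box"])
```

In a full pipeline, the resulting mask and box would be passed to a promptable segmenter such as SAM to refine the boundary; that step is left out here.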
Related papers
- UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Region Profiling [26.693692853787756]
Urban region profiling aims to learn a low-dimensional representation of a given urban area.
Pretrained models, particularly those reliant on satellite imagery, face dual challenges.
Concentrating solely on macro-level patterns from satellite data may introduce bias.
The lack of interpretability in pretrained models limits their utility in providing transparent evidence for urban planning.
arXiv Detail & Related papers (2024-03-25T14:57:18Z)
- Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks [82.82866901799565]
We build a new set of multimodal remote sensing benchmark datasets (including hyperspectral, multispectral, SAR) for the study purpose of the cross-city semantic segmentation task.
Beyond the single city, we propose a high-resolution domain adaptation network, HighDAN, to promote the AI model's generalization ability from the multi-city environments.
HighDAN is capable of retaining the spatially topological structure of the studied urban scene well in a parallel high-to-low resolution fusion fashion.
arXiv Detail & Related papers (2023-09-26T23:55:39Z)
- Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark] [78.05103666987655]
This work addresses challenges in accessing and utilizing diverse urban spatial-temporal datasets.
We introduce atomic files, a unified storage format designed for urban spatial-temporal big data, and validate its effectiveness on 40 diverse datasets.
We conduct extensive experiments using diverse models and datasets, establishing a performance leaderboard and identifying promising research directions.
arXiv Detail & Related papers (2023-08-24T16:20:00Z)
- A Satellite Imagery Dataset for Long-Term Sustainable Development in United States Cities [15.862784224905095]
We develop a satellite imagery dataset using deep learning models for five sustainable development indicators.
The proposed dataset covers the 100 most populated U.S. cities and corresponding Census Block Groups from 2014 to 2023.
arXiv Detail & Related papers (2023-08-01T11:40:19Z)
- Graph-based Village Level Poverty Identification [52.12744462605759]
The development of the Web infrastructure and its modeling tools provides fresh approaches to identifying poor villages.
By modeling the village connections as a graph through the geographic distance, we show the correlation between village poverty status and its graph topological position.
We propose the first graph-based method to identify poor villages.
arXiv Detail & Related papers (2023-02-14T06:58:40Z)
- A Contextual Master-Slave Framework on Urban Region Graph for Urban Village Detection [68.84486900183853]
We build an urban region graph (URG) to model the urban area in a hierarchically structured way.
Then, we design a novel contextual master-slave framework to effectively detect the urban village from the URG.
The proposed framework can learn to balance the generality and specificity for UV detection in an urban area.
arXiv Detail & Related papers (2022-11-26T18:17:39Z)
- Spatial-Temporal Hypergraph Self-Supervised Learning for Crime Prediction [60.508960752148454]
This work proposes a Spatial-Temporal Hypergraph Self-Supervised Learning framework to tackle the label scarcity issue in crime prediction.
We propose the cross-region hypergraph structure learning to encode region-wise crime dependency under the entire urban space.
We also design the dual-stage self-supervised learning paradigm, to not only jointly capture local- and global-level spatial-temporal crime patterns, but also supplement the sparse crime representation by augmenting region self-discrimination.
arXiv Detail & Related papers (2022-04-18T23:46:01Z)
- Effective Urban Region Representation Learning Using Heterogeneous Urban Graph Attention Network (HUGAT) [0.0]
We propose heterogeneous urban graph attention network (HUGAT) for learning the representations of urban regions.
In our experiments on NYC data, HUGAT outperformed all the state-of-the-art models.
arXiv Detail & Related papers (2022-02-18T04:59:20Z)
- Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics.
We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form.
After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)
- A Novel CNN-LSTM-based Approach to Predict Urban Expansion [1.2233362977312943]
Time-series remote sensing data offer a rich source of information that can be used in a wide range of applications.
This paper addresses the challenge of using time-series satellite images to predict urban expansion.
We propose a novel two-step approach based on semantic image segmentation in order to predict urban expansion.
arXiv Detail & Related papers (2021-03-02T12:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.