Related papers: UV-SAM: Adapting Segment Anything Model for Urban Village Identification

URL: http://arxiv.org/abs/2401.08083v2
Date: Thu, 1 Feb 2024 08:05:26 GMT
Title: UV-SAM: Adapting Segment Anything Model for Urban Village Identification
Authors: Xin Zhang, Yu Liu, Yuming Lin, Qingmin Liao, Yong Li
Abstract summary: Governments heavily depend on field survey methods to monitor the urban villages. To accurately identify urban village boundaries from satellite images, we adapt the Segment Anything Model (SAM) to urban village segmentation, named UV-SAM. UV-SAM first leverages a small-sized semantic segmentation model to produce mixed prompts for urban villages, including mask, bounding box, and image representations, which are then fed into SAM for fine-grained boundary identification.
Score: 25.286722125746902
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Urban villages, defined as informal residential areas in or around urban centers, are characterized by inadequate infrastructure and poor living conditions, and are closely related to the Sustainable Development Goals (SDGs) on poverty, adequate housing, and sustainable cities. Traditionally, governments heavily depend on field survey methods to monitor urban villages, which, however, are time-consuming, labor-intensive, and possibly delayed. Thanks to widely available and timely updated satellite images, recent studies develop computer vision techniques to detect urban villages efficiently. However, existing studies either focus on simple urban village image classification or fail to provide accurate boundary information. To accurately identify urban village boundaries from satellite images, we harness the power of the vision foundation model and adapt the Segment Anything Model (SAM) to urban village segmentation, named UV-SAM. Specifically, UV-SAM first leverages a small-sized semantic segmentation model to produce mixed prompts for urban villages, including mask, bounding box, and image representations, which are then fed into SAM for fine-grained boundary identification. Extensive experimental results on two datasets in China demonstrate that UV-SAM outperforms existing baselines, and identification results over multiple years show that both the number and area of urban villages are decreasing over time, providing deeper insights into the development trends of urban villages and shedding light on vision foundation models for sustainable cities. The dataset and code of this study are available at https://github.com/tsinghua-fib-lab/UV-SAM.
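
The abstract describes a coarse segmentation model whose output is turned into prompts (mask, bounding box, image representations) for SAM. As a rough illustration of the mask and bounding-box part only, the sketch below derives both from a coarse probability map. It is not the authors' implementation: the function names `mask_to_box` and `build_prompts`, the 0.5 threshold, and the toy input are all assumptions made for illustration, and the image-representation prompt and the SAM call itself are omitted.

```python
import numpy as np


def mask_to_box(mask, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) enclosing all foreground pixels.

    Hypothetical helper: pixels with value >= threshold count as foreground.
    Returns None when the mask has no foreground at all.
    """
    binary = np.asarray(mask) >= threshold
    if not binary.any():
        return None
    rows = np.flatnonzero(binary.any(axis=1))  # row indices containing foreground
    cols = np.flatnonzero(binary.any(axis=0))  # column indices containing foreground
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])


def build_prompts(coarse_prob, threshold=0.5):
    """Bundle a binary mask prompt and a box prompt from a coarse probability map."""
    binary = np.asarray(coarse_prob) >= threshold
    return {
        "mask": binary.astype(np.uint8),
        "box": mask_to_box(coarse_prob, threshold),
    }


# Toy stand-in for the output of a small segmentation model (assumed data).
coarse = np.zeros((8, 8), dtype=np.float32)
coarse[2:5, 3:7] = 0.9  # one rectangular "urban village" region

prompts = build_prompts(coarse)
print(prompts["box"])  # (3, 2, 6, 4)
```

In a full pipeline, prompts like these would be passed to a promptable segmentation model for boundary refinement; how UV-SAM does that is documented in the repository linked above.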

Related papers

Urban delineation through the lens of commute networks: Leveraging graph embeddings to distinguish socioeconomic groups in cities [0.0]
We propose using commute networks sourced from the census for the purpose of urban delineation. We derive low-dimensional representations of granular urban areas using Graph Neural Network architecture. Experiments across the U.S. demonstrate the effectiveness of network embeddings in capturing significant socioeconomic disparities.
arXiv Detail & Related papers (2025-07-15T07:47:03Z)
Urban Forms Across Continents: A Data-Driven Comparison of Lausanne and Philadelphia [7.693465097015469]
This study presents a data-driven framework to identify and compare urban typologies across geographically and culturally distinct cities. We extracted multidimensional features related to topography, multimodality, green spaces, and points of interest for the cities of Lausanne, Switzerland, and Philadelphia, USA. The results reveal coherent and interpretable urban typologies within each city, with some cluster types emerging across both cities despite their differences in scale, density, and cultural context.
arXiv Detail & Related papers (2025-05-05T18:13:22Z)
Mapping Urban Villages in China: Progress and Challenges [20.708176590993975]
The shift toward high-quality urbanization has brought increased attention to the issue of "urban villages". There is a lack of available geospatial data on urban villages, making it crucial to prioritize urban village mapping. Future research can complement and further the current research in order to achieve large-area mapping across the whole nation.
arXiv Detail & Related papers (2025-03-18T12:13:55Z)
AerialGo: Walking-through City View Generation from Aerial Perspectives [48.53976414257845]
AerialGo is a framework that generates realistic walking-through city views from aerial images. By conditioning ground-view synthesis on accessible aerial data, AerialGo bypasses the privacy risks inherent in ground-level imagery. Experiments show that AerialGo significantly enhances ground-level realism and structural coherence.
arXiv Detail & Related papers (2024-11-29T08:14:07Z)
UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Region Profiling [26.693692853787756]
Urban region profiling aims to learn a low-dimensional representation of a given urban area. Pretrained models, particularly those reliant on satellite imagery, face dual challenges. Concentrating solely on macro-level patterns from satellite data may introduce bias. The lack of interpretability in pretrained models limits their utility in providing transparent evidence for urban planning.
arXiv Detail & Related papers (2024-03-25T14:57:18Z)
Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks [82.82866901799565]
We build a new set of multimodal remote sensing benchmark datasets (including hyperspectral, multispectral, SAR) for the study purpose of the cross-city semantic segmentation task. Beyond the single city, we propose a high-resolution domain adaptation network, HighDAN, to promote the AI model's generalization ability from the multi-city environments. HighDAN is capable of retaining the spatially topological structure of the studied urban scene well in a parallel high-to-low resolution fusion fashion.
arXiv Detail & Related papers (2023-09-26T23:55:39Z)
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark] [78.05103666987655]
This work addresses challenges in accessing and utilizing diverse urban spatial-temporal datasets. We introduce atomic files, a unified storage format designed for urban spatial-temporal big data, and validate its effectiveness on 40 diverse datasets. We conduct extensive experiments using diverse models and datasets, establishing a performance leaderboard and identifying promising research directions.
arXiv Detail & Related papers (2023-08-24T16:20:00Z)
A Satellite Imagery Dataset for Long-Term Sustainable Development in United States Cities [15.862784224905095]
We develop a satellite imagery dataset using deep learning models for five sustainable development indicators. The proposed dataset covers the 100 most populated U.S. cities and corresponding Census Block Groups from 2014 to 2023.
arXiv Detail & Related papers (2023-08-01T11:40:19Z)
Graph-based Village Level Poverty Identification [52.12744462605759]
The development of the Web infrastructure and its modeling tools provides fresh approaches to identifying poor villages. By modeling the village connections as a graph through the geographic distance, we show the correlation between village poverty status and its graph topological position. We propose the first graph-based method to identify poor villages.
arXiv Detail & Related papers (2023-02-14T06:58:40Z)
A Contextual Master-Slave Framework on Urban Region Graph for Urban Village Detection [68.84486900183853]
We build an urban region graph (URG) to model the urban area in a hierarchically structured way. Then, we design a novel contextual master-slave framework to effectively detect the urban village from the URG. The proposed framework can learn to balance the generality and specificity for UV detection in an urban area.
arXiv Detail & Related papers (2022-11-26T18:17:39Z)
Spatial-Temporal Hypergraph Self-Supervised Learning for Crime Prediction [60.508960752148454]
This work proposes a Spatial-Temporal Hypergraph Self-Supervised Learning framework to tackle the label scarcity issue in crime prediction. We propose the cross-region hypergraph structure learning to encode region-wise crime dependency under the entire urban space. We also design the dual-stage self-supervised learning paradigm, to not only jointly capture local- and global-level spatial-temporal crime patterns, but also supplement the sparse crime representation by augmenting region self-discrimination.
arXiv Detail & Related papers (2022-04-18T23:46:01Z)
Effective Urban Region Representation Learning Using Heterogeneous Urban Graph Attention Network (HUGAT) [0.0]
We propose heterogeneous urban graph attention network (HUGAT) for learning the representations of urban regions. In our experiments on NYC data, HUGAT outperformed all the state-of-the-art models.
arXiv Detail & Related papers (2022-02-18T04:59:20Z)
Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics. We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form. After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)
A Novel CNN-LSTM-based Approach to Predict Urban Expansion [1.2233362977312943]
Time-series remote sensing data offer a rich source of information that can be used in a wide range of applications. This paper addresses the challenge of using time-series satellite images to predict urban expansion. We propose a novel two-step approach based on semantic image segmentation in order to predict urban expansion.
arXiv Detail & Related papers (2021-03-02T12:58:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.