Related papers: Generalizable Slum Detection from Satellite Imagery with Mixture-of-Experts

URL: http://arxiv.org/abs/2511.10300v1
Date: Fri, 14 Nov 2025 01:44:21 GMT
Title: Generalizable Slum Detection from Satellite Imagery with Mixture-of-Experts
Authors: Sumin Lee, Sungwon Park, Jeasurk Yang, Jihee Kim, Meeyoung Cha
Abstract summary: GRAM is a two-phase test-time adaptation framework that enables robust slum segmentation without requiring labeled data from target regions. We use a million-scale satellite imagery dataset from 12 cities across four continents for source training. During adaptation, prediction consistency across experts filters out unreliable pseudo-labels, allowing the model to generalize effectively to previously unseen regions.
Score: 20.100765943688454
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Satellite-based slum segmentation holds significant promise in generating global estimates of urban poverty. However, the morphological heterogeneity of informal settlements presents a major challenge, hindering the ability of models trained on specific regions to generalize effectively to unseen locations. To address this, we introduce a large-scale high-resolution dataset and propose GRAM (Generalized Region-Aware Mixture-of-Experts), a two-phase test-time adaptation framework that enables robust slum segmentation without requiring labeled data from target regions. We compile a million-scale satellite imagery dataset from 12 cities across four continents for source training. Using this dataset, the model employs a Mixture-of-Experts architecture to capture region-specific slum characteristics while learning universal features through a shared backbone. During adaptation, prediction consistency across experts filters out unreliable pseudo-labels, allowing the model to generalize effectively to previously unseen regions. GRAM outperforms state-of-the-art baselines in low-resource settings such as African cities, offering a scalable and label-efficient solution for global slum mapping and data-driven urban planning.
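The abstract's idea of filtering pseudo-labels by prediction consistency across experts can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `consistent_pseudo_labels`, the 0.5 probability threshold, the `min_agreement` parameter, and the `IGNORE` value of 255 are all assumptions made for illustration. The sketch takes per-expert slum probability maps, assigns each pixel the majority-vote label, and masks out pixels where too few experts agree.

```python
import numpy as np

IGNORE = 255  # hypothetical marker for pixels excluded from the adaptation loss


def consistent_pseudo_labels(expert_probs, threshold=0.5, min_agreement=1.0):
    """Build pseudo-labels from per-expert slum probabilities.

    expert_probs: array of shape (E, H, W), one probability map per expert.
    Returns a uint8 (H, W) map with 1 = slum, 0 = non-slum, IGNORE = unreliable.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    votes = expert_probs >= threshold              # binary prediction per expert
    slum_frac = votes.mean(axis=0)                 # share of experts voting "slum"
    agreement = np.maximum(slum_frac, 1.0 - slum_frac)  # share in the majority
    labels = (slum_frac >= 0.5).astype(np.uint8)   # majority-vote label
    labels[agreement < min_agreement] = IGNORE     # drop inconsistent pixels
    return labels


# Three experts, one row of three pixels; the experts disagree on the last pixel.
probs = np.array([
    [[0.9, 0.1, 0.9]],
    [[0.8, 0.2, 0.2]],
    [[0.7, 0.3, 0.8]],
])
print(consistent_pseudo_labels(probs))
```

With the default `min_agreement=1.0`, only pixels on which every expert agrees keep a label; lowering it trades label coverage against label noise.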

Related papers

AINet: Anchor Instances Learning for Regional Heterogeneity in Whole Slide Image [61.54860340942449]
We introduce a novel concept of anchor instance (AI), a compact subset of instances that are representative within their regions (local) and discriminative at the bag (global) level. These AIs act as semantic references to guide interactions across regions, correcting non-discriminative patterns while preserving regional diversity. We develop a concise yet effective framework, AINet, which employs a simple predictor and surpasses state-of-the-art methods with substantially fewer FLOPs and parameters.
arXiv Detail & Related papers (2026-02-21T09:36:27Z)
Urban-R1: Reinforced MLLMs Mitigate Geospatial Biases for Urban General Intelligence [64.36291202666212]
Urban General Intelligence (UGI) refers to AI systems that can understand and reason about complex urban environments. Recent studies have built urban foundation models using supervised fine-tuning (SFT) of LLMs and MLLMs. We propose Urban-R1, a reinforcement learning-based post-training framework that aligns MLLMs with the objectives of UGI.
arXiv Detail & Related papers (2025-10-18T15:59:09Z)
DeepC4: Deep Conditional Census-Constrained Clustering for Large-scale Multitask Spatial Disaggregation of Urban Morphology [0.7237068561453082]
We present a novel deep learning-based spatial disaggregation approach that incorporates local census statistics as cluster-level constraints. Our work offers a new deep learning-based mapping technique for spatially auditing existing coarse-grained derived information at large scales.
arXiv Detail & Related papers (2025-07-30T10:25:39Z)
Synthetic Data Matters: Re-training with Geo-typical Synthetic Labels for Building Detection [13.550020274133866]
We propose re-training models at test time using synthetic data tailored to the target region's city layout. This method generates geo-typical synthetic data that closely replicates the urban structure of a target area. Experiments demonstrate significant performance enhancements, with median improvements of up to 12%, depending on the domain gap.
arXiv Detail & Related papers (2025-07-22T14:53:13Z)
TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation. We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities. Experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z)
Geographical Context Matters: Bridging Fine and Coarse Spatial Information to Enhance Continental Land Cover Mapping [2.9212099078191756]
BRIDGE-LC is a novel deep learning framework that integrates multi-scale geospatial information into the land cover classification process. Our results demonstrate that integrating geospatial information improves land cover mapping performance.
arXiv Detail & Related papers (2025-04-16T17:42:46Z)
CV-Cities: Advancing Cross-View Geo-Localization in Global Cities [3.074201632920997]
Cross-view geo-localization (CVGL) involves matching and retrieving satellite images to determine the geographic location of a ground image. This task faces significant challenges due to substantial viewpoint discrepancies, the complexity of localization scenarios, and the need for global localization. We propose a novel CVGL framework that integrates the foundational model DINOv2 with an advanced feature mixer.
arXiv Detail & Related papers (2024-11-19T11:41:22Z)
Cross Pseudo Supervision Framework for Sparsely Labelled Geospatial Images [0.0]
Land Use Land Cover (LULC) mapping is a vital tool for urban and resource planning. This study introduces a semi-supervised segmentation model for LULC prediction using high-resolution satellite images. We propose a modified Cross Pseudo Supervision framework to train image segmentation models on sparsely labelled data.
arXiv Detail & Related papers (2024-08-05T11:14:23Z)
Enhanced Urban Region Profiling with Adversarial Self-Supervised Learning for Robust Forecasting and Security [12.8405655328298]
Existing methods often struggle with issues such as noise, data incompleteness, and security vulnerabilities. This paper proposes a novel framework, Enhanced Urban Region Profiling with Adversarial Self-Supervised Learning (EUPAS). EUPAS ensures robust performance across various forecasting tasks such as crime prediction, check-in prediction, and land use classification.
arXiv Detail & Related papers (2024-02-02T06:06:45Z)
Recognize Any Regions [55.76437190434433]
RegionSpot integrates position-aware localization knowledge from a localization foundation model with semantic information from a ViL model. Experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives.
arXiv Detail & Related papers (2023-11-02T16:31:49Z)
Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks [82.82866901799565]
We build a new set of multimodal remote sensing benchmark datasets (including hyperspectral, multispectral, SAR) for the study purpose of the cross-city semantic segmentation task. Beyond the single city, we propose a high-resolution domain adaptation network, HighDAN, to promote the AI model's generalization ability from the multi-city environments. HighDAN is capable of retaining the spatially topological structure of the studied urban scene well in a parallel high-to-low resolution fusion fashion.
arXiv Detail & Related papers (2023-09-26T23:55:39Z)
Activation Regression for Continuous Domain Generalization with Applications to Crop Classification [48.795866501365694]
Geographic variance in satellite imagery impacts the ability of machine learning models to generalise to new regions. We model geographic generalisation in medium resolution Landsat-8 satellite imagery as a continuous domain adaptation problem. We develop a dataset spatially distributed across the entire continental United States.
arXiv Detail & Related papers (2022-04-14T15:41:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.