Related papers: C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features

C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features

URL: http://arxiv.org/abs/2502.04991v1
Date: Fri, 07 Feb 2025 15:11:31 GMT
Title: C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features
Authors: Chenxing Sun, Yongyang Xu, Xuwei Xu, Xixi Fan, Jing Bai, Xiechun Lu, Zhanlong Chen,
Abstract summary: This paper presents C2GM, a novel framework for generating multi-scale tile maps from remote sensing images.<n>We implement a conditional feature fusion encoder to extract object priors from remote sensing images and cascade reference double branch input.<n>C2GM consistently achieves the state-of-the-art (SOTA) performance on all metrics.
Score: 2.414525855161937
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-scale maps are essential representations of surveying and cartographic results, serving as fundamental components of geographic services. Current image generation networks can quickly produce map tiles from remote-sensing images. However, generative models designed for natural images often focus on texture features, neglecting the unique characteristics of remote-sensing features and the scale attributes of tile maps. This limitation in generative models impairs the accurate representation of geographic information, and the quality of tile map generation still needs improvement. Diffusion models have demonstrated remarkable success in various image generation tasks, highlighting their potential to address this challenge. This paper presents C2GM, a novel framework for generating multi-scale tile maps through conditional guided diffusion and multi-scale cascade generation. Specifically, we implement a conditional feature fusion encoder to extract object priors from remote sensing images and cascade reference double branch input, ensuring an accurate representation of complex features. Low-level generated tiles act as constraints for high-level map generation, enhancing visual continuity. Moreover, we incorporate map scale modality information using CLIP to simulate the relationship between map scale and cartographic generalization in tile maps. Extensive experimental evaluations demonstrate that C2GM consistently achieves the state-of-the-art (SOTA) performance on all metrics, facilitating the rapid and effective generation of multi-scale large-format maps for emergency response and remote mapping applications.

Related papers

EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation.<n>We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities.<n> experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z)
Map Feature Perception Metric for Map Generation Quality Assessment and Loss Optimization [2.311323886036968]
This study introduces a novel Map Feature Metric designed to evaluate global characteristics and spatial congruence between synthesized and target maps.<n>Our approach extracts elemental-level deep features that comprehensively encode cartographic structural integrity and topological relationships.
arXiv Detail & Related papers (2025-03-30T09:07:09Z)
Panoptic Diffusion Models: co-generation of images and segmentation maps [7.573297026523597]
We present Panoptic Diffusion Model (PDM), the first model designed to generate both images and panoptic segmentation maps concurrently.<n>PDM bridges the gap between image and text by constructing segmentation layouts that provide detailed, built-in guidance throughout the generation process.
arXiv Detail & Related papers (2024-12-04T00:42:15Z)
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction [75.93907511203317]
We propose MGMapNet (Multi-Granularity Map Network) to model map element with a multi-granularity representation. The proposed MGMapNet achieves state-of-the-art performance, surpassing MapTRv2 by 5.3 mAP on nuScenes and 4.4 mAP on Argoverse2 respectively.
arXiv Detail & Related papers (2024-10-10T09:05:23Z)
GenMapping: Unleashing the Potential of Inverse Perspective Mapping for Robust Online HD Map Construction [20.1127163541618]
We have designed a universal map generation framework, GenMapping. The framework is established with a triadic synergy architecture, including principal and dual auxiliary branches. A thorough array of experimental results shows that the proposed model surpasses current state-of-the-art methods in both semantic mapping and vectorized mapping, while also maintaining a rapid inference speed.
arXiv Detail & Related papers (2024-09-13T10:15:28Z)
HPix: Generating Vector Maps from Satellite Images [0.0]
We propose a novel method called HPix, which utilizes modified Generative Adversarial Networks (GANs) to generate vector tile map from satellite images. Through empirical evaluations, our proposed approach showcases its effectiveness in producing highly accurate and visually captivating vector tile maps. We further extend our study's application to include mapping of road intersections and building footprints cluster based on their area.
arXiv Detail & Related papers (2024-07-18T16:54:02Z)
DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model [15.803614800117781]
We propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks. By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced. Our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts.
arXiv Detail & Related papers (2024-05-03T11:16:27Z)
CartoMark: a benchmark dataset for map pattern recognition and 1 map content retrieval with machine intelligence [9.652629004863364]
We develop a large-scale benchmark dataset for map text annotation recognition, map scene classification, map super-resolution reconstruction, and map style transferring. These well-labelled datasets would facilitate the state-of-the-art machine intelligence technologies to conduct map feature detection, map pattern recognition and map content retrieval.
arXiv Detail & Related papers (2023-12-14T01:54:38Z)
LISNeRF Mapping: LiDAR-based Implicit Mapping via Semantic Neural Fields for Large-Scale 3D Scenes [2.822816116516042]
Large-scale semantic mapping is crucial for outdoor autonomous agents to fulfill high-level tasks such as planning and navigation. This paper proposes a novel method for large-scale 3D semantic reconstruction through implicit representations from posed LiDAR measurements alone.
arXiv Detail & Related papers (2023-11-04T03:55:38Z)
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation [68.42476385214785]
We propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance. SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to previous works. We also propose the Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA) mechanisms.
arXiv Detail & Related papers (2023-08-20T04:09:12Z)
DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature. It uses a single-scale feature map and global cross-attention calculations without specific locality constraints. We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z)
CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input. We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE) Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region. Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems. We present three novel scenarios for localization and mapping which require the continuous update of feature representations. Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification. By employing attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions. Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
Learning Lane Graph Representations for Motion Forecasting [92.88572392790623]
We construct a lane graph from raw map data to preserve the map structure. We exploit a fusion network consisting of four types of interactions, actor-to-lane, lane-to-lane, lane-to-actor and actor-to-actor. Our approach significantly outperforms the state-of-the-art on the large scale Argoverse motion forecasting benchmark.
arXiv Detail & Related papers (2020-07-27T17:59:49Z)
SMAPGAN: Generative Adversarial Network Based Semi-Supervised Styled Map Tiles Generating Method [0.0]
Traditional online map tiles, widely used on the Internet such as Google Map and Baidu Map, are rendered from vector data. We propose a semi-supervised Generation of styled map Tiles based on Generative Adversarial Network (SMAPGAN) model. Experimental results present that SMAPGAN outperforms state-of-the-art (SOTA) works according to mean squared error, structural similarity index, and ESSI.
arXiv Detail & Related papers (2020-01-21T04:13:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.