Related papers: Geodiffussr: Generative Terrain Texturing with Elevation Fidelity

URL: http://arxiv.org/abs/2511.23029v1
Date: Fri, 28 Nov 2025 09:52:44 GMT
Title: Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
Authors: Tai Inui, Alexander Matsumura, Edgar Simo-Serra
Abstract summary: We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps. The core mechanism is multi-scale content aggregation (MCA): DEM features are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-language captions.
Score: 48.82552523546255
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large-scale terrain generation remains a labor-intensive task in computer graphics. We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps while strictly adhering to a supplied Digital Elevation Map (DEM). The core mechanism is multi-scale content aggregation (MCA): DEM features from a pretrained encoder are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. Compared with a non-MCA baseline, MCA markedly improves visual fidelity and strengthens height-appearance coupling (FID $\downarrow$ 49.16%, LPIPS $\downarrow$ 32.33%, $Δ$dCor $\downarrow$ to 0.0016). To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-language captions that describe visible land cover. We position Geodiffussr as a strong baseline and step toward controllable 2.5D landscape generation for coarse-scale ideation and previz, complementary to physically based terrain and ecosystem simulators.

Related papers

(MGS)$^2$-Net: Unifying Micro-Geometric Scale and Macro-Geometric Structure for Cross-View Geo-Localization [6.842471990535349]
Cross-view geo-localization (CVGL) is pivotal for UAV navigation but remains brittle under the drastic geometric misalignment between oblique aerial views and orthographic satellite references. We propose (MGS)$^2$, a geometry-grounded framework to bridge this gap. Experiments demonstrate that (MGS)$^2$ achieves state-of-the-art performance, recording a Recall@1 of 97.5% on University-1652 and 97.02% on SUES-200.
arXiv Detail & Related papers (2026-02-11T10:03:31Z)
Exploring the Underwater World Segmentation without Extra Training [55.291219073365546]
We introduce AquaOV255, the first large-scale and fine-grained underwater segmentation dataset. We also present Earth2Ocean, a training-free OV segmentation framework.
arXiv Detail & Related papers (2025-11-11T07:22:56Z)
EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion [23.3834795181211]
We introduce Aerial-Earth3D, the largest 3D aerial dataset to date, consisting of 50k curated scenes (each measuring 600m x 600m) captured across the U.S. mainland. Each scene provides pose-annotated multi-view images, depth maps, normals, semantic segmentation, and camera poses, with explicit quality control to ensure terrain diversity. We propose EarthCrafter, a tailored framework for large-scale 3D Earth generation via sparse-decoupled latent diffusion.
arXiv Detail & Related papers (2025-07-22T12:46:48Z)
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation. We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities. Experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z)
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries. We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well trained on massive image collections. Experiments on the PointDA-10 and Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art UDA performance for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
LMSeg: An end-to-end geometric message-passing network on barycentric dual graphs for large-scale landscape mesh segmentation [5.698076346359676]
We introduce the BudjBim Wall dataset, a large-scale annotated mesh dataset from the UNESCO World Heritage-listed Budj Bim cultural landscape in Victoria, Australia. We propose LMSeg, a deep graph message-passing network for semantic segmentation of large-scale meshes. Experiments on three large-scale benchmarks (SUM, H3D, and BBW) show that LMSeg achieves 75.1% mIoU on SUM, 78.4% O.A. on H3D, and 62.4% mIoU on BBW, using only 2.4M lightweight parameters.
arXiv Detail & Related papers (2024-07-05T07:55:06Z)
Context and Geometry Aware Voxel Transformer for Semantic Scene Completion [7.147020285382786]
Vision-based Semantic Scene Completion (SSC) has gained much attention due to its widespread applications in various 3D perception tasks. Existing sparse-to-dense approaches typically employ shared context-independent queries across various input images. We introduce a neural network named CGFormer to achieve semantic scene completion.
arXiv Detail & Related papers (2024-05-22T14:16:30Z)
Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels [4.833320222969612]
Large-scale high-resolution (HR) land-cover mapping is a vital task to survey the Earth's surface and resolve many challenges facing humanity. We propose an efficient, weakly supervised framework (Paraformer) to guide large-scale HR land-cover mapping.
arXiv Detail & Related papers (2024-03-05T08:02:00Z)
Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology. Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
TerrainMesh: Metric-Semantic Terrain Reconstruction from Aerial Images Using Joint 2D-3D Learning [20.81202315793742]
This paper develops a joint 2D-3D learning approach to reconstruct a local metric-semantic mesh at each camera maintained by a visual odometry algorithm. The mesh can be assembled into a global environment model to capture the terrain topology and semantics during online operation.
arXiv Detail & Related papers (2022-04-23T05:18:39Z)
TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework. For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of alignments. TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras. For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation. We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.