Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
- URL: http://arxiv.org/abs/2511.23029v1
- Date: Fri, 28 Nov 2025 09:52:44 GMT
- Title: Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
- Authors: Tai Inui, Alexander Matsumura, Edgar Simo-Serra
- Abstract summary: We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps. The core mechanism is multi-scale content aggregation (MCA): DEM features are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-language captions.
- Score: 48.82552523546255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale terrain generation remains a labor-intensive task in computer graphics. We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps while strictly adhering to a supplied Digital Elevation Map (DEM). The core mechanism is multi-scale content aggregation (MCA): DEM features from a pretrained encoder are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. Compared with a non-MCA baseline, MCA markedly improves visual fidelity and strengthens height-appearance coupling (FID $\downarrow$ 49.16%, LPIPS $\downarrow$ 32.33%, $Δ$dCor $\downarrow$ to 0.0016). To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-language captions that describe visible land cover. We position Geodiffussr as a strong baseline and step toward controllable 2.5D landscape generation for coarse-scale ideation and previz, complementary to physically based terrain and ecosystem simulators.
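The abstract reports height-appearance coupling through a distance-correlation gap (ΔdCor) but does not define it here. The following is a minimal sketch, assuming ΔdCor is the absolute difference between the distance correlation of elevation with the generated texture and that of elevation with the reference image. The function names `distance_correlation` and `delta_dcor`, and the use of flattened per-pixel samples, are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation (Szekely-style, biased estimator).

    x and y hold n paired observations, e.g. subsampled DEM heights and
    per-pixel brightness values. Memory use is O(n^2), so subsample first.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)

    # Pairwise Euclidean distance matrices.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)

    # Double-centre each distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()    # squared distance covariance
    dvar_x2 = (A * A).mean()  # squared distance variance of x
    dvar_y2 = (B * B).mean()  # squared distance variance of y

    denom = np.sqrt(dvar_x2 * dvar_y2)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def delta_dcor(height, generated, reference):
    """Assumed definition: gap between generated and reference coupling."""
    return abs(
        distance_correlation(height, generated)
        - distance_correlation(height, reference)
    )
```

Under this assumed definition, a value near zero means the generated texture depends on elevation about as strongly as the real imagery does, which is how the reported 0.0016 would be read.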
Related papers
- (MGS)$^2$-Net: Unifying Micro-Geometric Scale and Macro-Geometric Structure for Cross-View Geo-Localization [6.842471990535349]
Cross-view geo-localization (CVGL) is pivotal for UAV navigation but remains brittle under the drastic geometric misalignment between oblique aerial views and orthographic satellite references. We propose (MGS)$^2$, a geometry-grounded framework to bridge this gap. Experiments demonstrate that (MGS)$^2$ achieves state-of-the-art performance, recording a Recall@1 of 97.5% on University-1652 and 97.02% on SUES-200.
arXiv Detail & Related papers (2026-02-11T10:03:31Z) - Exploring the Underwater World Segmentation without Extra Training [55.291219073365546]
We introduce AquaOV255, the first large-scale and fine-grained underwater segmentation dataset. We also present Earth2Ocean, a training-free OV segmentation framework.
arXiv Detail & Related papers (2025-11-11T07:22:56Z) - EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion [23.3834795181211]
We introduce Aerial-Earth3D, the largest 3D aerial dataset to date, consisting of 50k curated scenes (each measuring 600m x 600m) captured across the U.S. mainland. Each scene provides pose-annotated multi-view images, depth maps, normals, semantic segmentation, and camera poses, with explicit quality control to ensure terrain diversity. We propose EarthCrafter, a tailored framework for large-scale 3D Earth generation via sparse-decoupled latent diffusion.
arXiv Detail & Related papers (2025-07-22T12:46:48Z) - EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation. We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities. Experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z) - Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - LMSeg: An end-to-end geometric message-passing network on barycentric dual graphs for large-scale landscape mesh segmentation [5.698076346359676]
We introduce the BudjBim Wall dataset, a large-scale annotated mesh dataset from the UNESCO World Heritage-listed Budj Bim cultural landscape in Victoria, Australia. We propose LMSeg, a deep graph message-passing network for semantic segmentation of large-scale meshes. Experiments on three large-scale benchmarks (SUM, H3D, and BBW) show that LMSeg achieves 75.1% mIoU on SUM, 78.4% O.A. on H3D, and 62.4% mIoU on BBW, using only 2.4M lightweight parameters.
arXiv Detail & Related papers (2024-07-05T07:55:06Z) - Context and Geometry Aware Voxel Transformer for Semantic Scene Completion [7.147020285382786]
Vision-based Semantic Scene Completion (SSC) has gained much attention due to its widespread applications in various 3D perception tasks.
Existing sparse-to-dense approaches typically employ shared context-independent queries across various input images.
We introduce a neural network named CGFormer to achieve semantic scene completion.
arXiv Detail & Related papers (2024-05-22T14:16:30Z) - Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels [4.833320222969612]
Large-scale high-resolution (HR) land-cover mapping is a vital task to survey the Earth's surface and resolve many challenges facing humanity.
We propose an efficient, weakly supervised framework (Paraformer) to guide large-scale HR land-cover mapping.
arXiv Detail & Related papers (2024-03-05T08:02:00Z) - Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z) - TerrainMesh: Metric-Semantic Terrain Reconstruction from Aerial Images Using Joint 2D-3D Learning [20.81202315793742]
This paper develops a joint 2D-3D learning approach to reconstruct a local metric-semantic mesh at each camera maintained by a visual odometry algorithm.
The mesh can be assembled into a global environment model to capture the terrain topology and semantics during online operation.
arXiv Detail & Related papers (2022-04-23T05:18:39Z) - TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z) - OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.