MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data
- URL: http://arxiv.org/abs/2504.07210v2
- Date: Mon, 14 Apr 2025 17:25:41 GMT
- Title: MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data
- Authors: Paul Borne--Pons, Mikolaj Czerkawski, Rosalie Martin, Romain Rouffet,
- Abstract summary: We present MESA - a novel data-centric alternative to procedural terrain modeling.<n>MESA generates high-quality terrain samples from text descriptions using global remote sensing data.<n>The model's capabilities are demonstrated through extensive experiments, highlighting its ability to generate realistic and diverse terrain landscapes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Terrain modeling has traditionally relied on procedural techniques, which often require extensive domain expertise and handcrafted rules. In this paper, we present MESA - a novel data-centric alternative by training a diffusion model on global remote sensing data. This approach leverages large-scale geospatial information to generate high-quality terrain samples from text descriptions, showcasing a flexible and scalable solution for terrain generation. The model's capabilities are demonstrated through extensive experiments, highlighting its ability to generate realistic and diverse terrain landscapes. The dataset produced to support this work, the Major TOM Core-DEM extension dataset, is released openly as a comprehensive resource for global terrain data. The results suggest that data-driven models, trained on remote sensing data, can provide a powerful tool for realistic terrain modeling and generation.
Related papers
- TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data [3.674991996196602]
We introduce TerraMesh, a new globally diverse, multimodal dataset combining optical, radar, elevation, and land-cover modalities in a single format.
We provide detailed data processing steps, comprehensive statistics, and empirical evidence demonstrating improved model performance when pre-trained on TerraMesh.
The dataset will be made publicly available with a permissive license.
arXiv Detail & Related papers (2025-04-15T13:20:35Z) - TerraMind: Large-Scale Generative Multimodality for Earth Observation [3.5472166810202457]
We present TerraMind, the first any-to-any generative, multimodal foundation model for Earth observation.
Unlike other multimodal models, TerraMind is pretrained on dual-scale representations combining both token-level and pixel-level data.
arXiv Detail & Related papers (2025-04-15T13:17:39Z) - OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence [51.0456395687016]
multimodal large language models (LLMs) have opened new frontiers in artificial intelligence.<n>We propose a MLLM (OmniGeo) tailored to geospatial applications.<n>By combining the strengths of natural language understanding and spatial reasoning, our model enhances the ability of instruction following and the accuracy of GeoAI systems.
arXiv Detail & Related papers (2025-03-20T16:45:48Z) - EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision [72.84868704100595]
This paper presents a dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks.<n>The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic.<n>Accompanying the dataset is EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data.
arXiv Detail & Related papers (2025-01-14T13:42:22Z) - ImplicitTerrain: a Continuous Surface Model for Terrain Data Analysis [14.013976303831313]
ImplicitTerrain is an implicit neural representation (INR) approach for modeling high-resolution terrain continuously and differentiably.
Our experiments demonstrate superior surface fitting accuracy, effective topological feature retrieval, and various topographical feature extraction.
arXiv Detail & Related papers (2024-05-31T23:05:34Z) - UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction [93.77809355002591]
We introduce UniTraj, a comprehensive framework that unifies various datasets, models, and evaluation criteria.
We conduct extensive experiments and find that model performance significantly drops when transferred to other datasets.
We provide insights into dataset characteristics to explain these findings.
arXiv Detail & Related papers (2024-03-22T10:36:50Z) - T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified
Visual Modalities [69.16656086708291]
Diffusion Probabilistic Field (DPF) models the distribution of continuous functions defined over metric spaces.
We propose a new model comprising of a view-wise sampling algorithm to focus on local structure learning.
The model can be scaled to generate high-resolution data while unifying multiple modalities.
arXiv Detail & Related papers (2023-05-24T03:32:03Z) - Sky-image-based solar forecasting using deep learning with
multi-location data: training models locally, globally or via transfer
learning? [0.0]
One of the biggest challenges for training deep learning models is the availability of labeled datasets.
With more and more sky image datasets open sourced in recent years, the development of accurate and reliable solar forecasting methods has seen a huge growth in potential.
arXiv Detail & Related papers (2022-11-03T19:25:28Z) - Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep
Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z) - Deep Generative Framework for Interactive 3D Terrain Authoring and
Manipulation [4.202216894379241]
We propose a novel realistic terrain authoring framework powered by a combination of VAE and generative conditional GAN model.
Our framework is an example-based method that attempts to overcome the limitations of existing methods by learning a latent space from a real-world terrain dataset.
We also developed an interactive tool, that lets the user generate diverse terrains with minimalist inputs.
arXiv Detail & Related papers (2022-01-07T08:58:01Z) - Dataset Cartography: Mapping and Diagnosing Datasets with Training
Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.