Natural Language-Driven Global Mapping of Martian Landforms
- URL: http://arxiv.org/abs/2601.15949v1
- Date: Thu, 22 Jan 2026 13:38:13 GMT
- Title: Natural Language-Driven Global Mapping of Martian Landforms
- Authors: Yiran Wang, Shuoyuan Wang, Zhaoran Wei, Jiannan Zhao, Zhonghua Yao, Zejian Xie, Songxin Zhang, Jun Huang, Bingyi Jing, Hongxin Wei
- Abstract summary: MarScope is a vision-language framework enabling natural language-driven, label-free mapping of Martian landforms. It aligns planetary images and text in a shared semantic space, trained on over 200,000 curated image-text pairs. This framework transforms global geomorphic mapping on Mars by replacing pre-defined classifications with flexible semantic retrieval.
- Score: 25.54158424879149
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Planetary surfaces are typically analyzed using high-level semantic concepts in natural language, yet vast orbital image archives remain organized at the pixel level. This mismatch limits scalable, open-ended exploration of planetary surfaces. Here we present MarScope, a planetary-scale vision-language framework enabling natural language-driven, label-free mapping of Martian landforms. MarScope aligns planetary images and text in a shared semantic space, trained on over 200,000 curated image-text pairs. This framework transforms global geomorphic mapping on Mars by replacing pre-defined classifications with flexible semantic retrieval, enabling arbitrary user queries across the entire planet in 5 seconds with F1 scores up to 0.978. Applications further show that it extends beyond morphological classification to facilitate process-oriented analysis and similarity-based geomorphological mapping at a planetary scale. MarScope establishes a new paradigm where natural language serves as a direct interface for scientific discovery over massive geospatial datasets.
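The retrieval described above ranks image tiles against a natural-language query by embedding both in a shared semantic space and comparing similarity. A minimal CLIP-style sketch of that ranking step, using toy hand-written vectors in place of the (unavailable) MarScope encoders; all embeddings and tile names are hypothetical:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_emb, tile_embs, top_k=2):
    # Rank image tiles by similarity to a text query embedded
    # in the same shared semantic space.
    scored = sorted(tile_embs.items(),
                    key=lambda kv: cosine(query_emb, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]

# Toy embeddings standing in for encoder outputs (hypothetical).
query = [1.0, 0.0, 0.2]            # e.g. text embedding of "barchan dunes"
tiles = {
    "tile_A": [0.9, 0.1, 0.3],     # dune-like tile
    "tile_B": [-0.2, 1.0, 0.0],    # crater-like tile
    "tile_C": [0.8, 0.0, 0.1],     # dune-like tile
}
print(retrieve(query, tiles))      # dune-like tiles rank first
```

Planet-wide query latency then reduces to precomputing tile embeddings once and performing only the similarity ranking per query.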
Related papers
- MarsRetrieval: Benchmarking Vision-Language Models for Planetary-Scale Geospatial Retrieval on Mars [21.01507072531742]
We introduce MarsRetrieval, a retrieval benchmark for evaluating vision-language models for Martian geospatial discovery. We propose a unified retrieval-centric protocol to benchmark multimodal embedding architectures. Our evaluation shows MarsRetrieval is challenging: even strong foundation models often fail to capture domain-specific geomorphic distinctions.
arXiv Detail & Related papers (2026-02-15T02:41:56Z)
- Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks [7.399515278460871]
A key enabler of progress in other domains has been the availability of standardized benchmarks that support systematic evaluation. We introduce Mars-Bench, the first benchmark designed to systematically evaluate models across a broad range of Mars-related tasks. We provide standardized, ready-to-use datasets and baseline evaluations using models pre-trained on natural images, Earth satellite data, and state-of-the-art vision-language models.
arXiv Detail & Related papers (2025-10-28T02:34:08Z)
- Inpainting the Red Planet: Diffusion Models for the Reconstruction of Martian Environments in Virtual Reality [0.0]
Training was conducted on an augmented dataset of 12,000 Martian heightmaps derived from NASA's HiRISE survey. A non-homogeneous rescaling strategy captures terrain features across multiple scales before resizing to a fixed 128x128 model resolution. Results show that this approach consistently outperforms baseline methods in reconstruction accuracy (4-15% on RMSE) and perceptual similarity to the original data (29-81% on LPIPS).
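The reconstruction-accuracy figure quoted above is an RMSE comparison against the original heightmaps. As a reference, a minimal sketch of RMSE between a reconstructed and a reference heightmap, here flattened to lists of toy elevation values rather than real HiRISE data:

```python
import math

def rmse(reference, reconstruction):
    # Root-mean-square error between a reference heightmap and its
    # reconstruction, both flattened to equal-length elevation lists.
    sq_err = sum((x - y) ** 2 for x, y in zip(reference, reconstruction))
    return math.sqrt(sq_err / len(reference))

reference = [1.0, 2.0, 3.0, 4.0]        # toy elevations
reconstruction = [1.1, 1.9, 3.2, 3.8]   # toy inpainted elevations
print(round(rmse(reference, reconstruction), 3))
```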
arXiv Detail & Related papers (2025-10-16T15:02:05Z)
- TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
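Contrastive pretraining of this kind is commonly built on an InfoNCE-style objective, where each anchor's positive pair should out-score all other candidates in the batch. A toy single-anchor sketch with hypothetical similarity scores; this is an illustration of the general loss family, not TerraFM's actual objective:

```python
import math

def info_nce(sim_row, pos_index, temperature=0.1):
    # InfoNCE loss for one anchor: sim_row holds similarities to all
    # batch candidates; the positive pair should score highest.
    logits = [s / temperature for s in sim_row]
    m = max(logits)  # shift for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[pos_index]

# Candidate 0 is the true match (hypothetical similarity values).
aligned = info_nce([0.9, 0.1, 0.0], pos_index=0)
misaligned = info_nce([0.1, 0.9, 0.0], pos_index=0)
print(aligned < misaligned)  # True: lower loss when the positive wins
```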
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
- Mapping High-level Semantic Regions in Indoor Environments without Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
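The per-location distribution over region labels described above can be sketched as accumulating each projected observation's label scores into a grid cell and renormalizing. Cell coordinates, labels, and scores below are all hypothetical:

```python
from collections import Counter

def update_map(global_map, cell, label_scores):
    # Accumulate one observation's label scores into the distribution
    # stored at a global map cell, then renormalize to a probability.
    acc = global_map.setdefault(cell, Counter())
    acc.update(label_scores)  # Counter.update adds values per key
    total = sum(acc.values())
    return {label: score / total for label, score in acc.items()}

# Two hypothetical egocentric observations projected to grid cell (3, 4).
global_map = {}
update_map(global_map, (3, 4), {"kitchen": 0.7, "hallway": 0.3})
dist = update_map(global_map, (3, 4), {"kitchen": 0.9, "hallway": 0.1})
print(dist)  # repeated "kitchen" evidence dominates the cell
```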
arXiv Detail & Related papers (2024-03-11T18:09:50Z)
- Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching [60.645802236700035]
Navigating drones through natural language commands remains challenging due to the dearth of accessible multi-modal datasets.
We introduce GeoText-1652, a new natural language-guided geo-localization benchmark.
This dataset is systematically constructed through an interactive human-computer process.
arXiv Detail & Related papers (2023-11-21T17:52:30Z)
- Mapping "Brain Terrain" Regions on Mars using Deep Learning [0.0]
A set of critical areas may have seen cycles of ice thawing in the relatively recent past in response to periodic changes in the obliquity of Mars.
In this work, we use convolutional neural networks to detect surface regions containing "Brain Coral" terrain.
We use large images (100-1000 megapixels) from the Mars Reconnaissance Orbiter to search for these landforms at resolutions close to a few tens of centimeters per pixel.
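Running a CNN over 100-1000 megapixel images implies cutting each mosaic into overlapping model-sized windows. A sketch of that tiling arithmetic, clamping the last row and column so every pixel is covered; the tile and stride values are illustrative, not taken from the paper:

```python
def tile_offsets(size, tile, stride):
    # 1-D tile start positions with overlap; clamp the final tile
    # to the image edge so no pixels are skipped.
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)
    return starts

def tiles(width, height, tile=256, stride=192):
    # Top-left corners of overlapping tiles covering a large image.
    return [(x, y) for y in tile_offsets(height, tile, stride)
                   for x in tile_offsets(width, tile, stride)]

print(tile_offsets(11, 4, 3))   # last tile clamped to the edge
print(len(tiles(1000, 600)))    # tiles needed for a 1000x600 image
```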
arXiv Detail & Related papers (2023-11-21T02:24:52Z)
- Navigation with Large Language Models: Semantic Guesswork as a Heuristic for Planning [73.0990339667978]
Navigation in unfamiliar environments presents a major challenge for robots.
We use language models to bias exploration of novel real-world environments.
We evaluate LFG in challenging real-world environments and simulated benchmarks.
arXiv Detail & Related papers (2023-10-16T06:21:06Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- Self-Supervised Learning to Guide Scientifically Relevant Categorization of Martian Terrain Images [1.282755489335386]
We present a self-supervised method that can cluster sedimentary textures in images captured from the Mast camera onboard the Curiosity rover.
We then present a qualitative analysis of these clusters and describe their geologic significance via the creation of a set of granular terrain categories.
arXiv Detail & Related papers (2022-04-21T02:48:40Z)
- Towards Robust Monocular Visual Odometry for Flying Robots on Planetary Missions [49.79068659889639]
Ingenuity, which has just landed on Mars, will mark the beginning of a new era of exploration unhindered by traversability.
We present an advanced robust monocular odometry algorithm that uses efficient optical flow tracking.
We also present a novel approach to estimate the current risk of scale drift based on a principal component analysis of the relative translation information matrix.
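The scale-drift idea above rests on examining the spectrum of the relative translation information matrix: a small eigenvalue signals a direction in which translation is weakly constrained. It can be illustrated on a toy 2x2 symmetric matrix with closed-form eigenvalues; the risk proxy below is illustrative, not the paper's estimator:

```python
import math

def eig2x2_sym(a, b, c):
    # Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]].
    mean = (a + c) / 2.0
    delta = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + delta, mean - delta

def scale_drift_risk(info):
    # Illustrative risk proxy from the eigenvalue ratio of a toy
    # translation information matrix: a ratio near zero means one
    # direction is poorly constrained, i.e. higher drift risk.
    lam_max, lam_min = eig2x2_sym(*info)
    return 1.0 - lam_min / lam_max

well_constrained = (10.0, 0.0, 9.0)   # both directions informative
degenerate = (10.0, 0.0, 0.1)         # one direction weakly constrained
print(scale_drift_risk(well_constrained) < scale_drift_risk(degenerate))
```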
arXiv Detail & Related papers (2021-09-12T12:52:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.