MapSAM2: Adapting SAM2 for Automatic Segmentation of Historical Map Images and Time Series
- URL: http://arxiv.org/abs/2510.27547v1
- Date: Fri, 31 Oct 2025 15:25:40 GMT
- Title: MapSAM2: Adapting SAM2 for Automatic Segmentation of Historical Map Images and Time Series
- Authors: Xue Xia, Randall Balestriero, Tao Zhang, Yixin Zhou, Andrew Ding, Dev Saini, Lorenz Hurni
- Abstract summary: We present MapSAM2, a unified framework for automatically segmenting both historical map images and time series. For images, we process a set of tiles as a video, enabling the memory attention mechanism to incorporate contextual cues from similar tiles. For time series, we introduce the annotated Siegfried Building Time Series Dataset and, to reduce annotation costs, propose generating pseudo time series from single-year maps.
- Score: 20.190148795374153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Historical maps are unique and valuable archives that document geographic features across different time periods. However, automated analysis of historical map images remains a significant challenge due to their wide stylistic variability and the scarcity of annotated training data. Constructing linked spatio-temporal datasets from historical map time series is even more time-consuming and labor-intensive, as it requires synthesizing information from multiple maps. Such datasets are essential for applications such as dating buildings, analyzing the development of road networks and settlements, and studying environmental changes. We present MapSAM2, a unified framework for automatically segmenting both historical map images and time series. Built on a visual foundation model, MapSAM2 adapts to diverse segmentation tasks with few-shot fine-tuning. Our key innovation is to treat both historical map images and time series as videos. For images, we process a set of tiles as a video, enabling the memory attention mechanism to incorporate contextual cues from similar tiles, leading to improved geometric accuracy, particularly for areal features. For time series, we introduce the annotated Siegfried Building Time Series Dataset and, to reduce annotation costs, propose generating pseudo time series from single-year maps by simulating common temporal transformations. Experimental results show that MapSAM2 learns temporal associations effectively and can accurately segment and link buildings in time series under limited supervision or using pseudo videos. We will release both our dataset and code to support future research.
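The pseudo time series idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which temporal transformations are simulated, so the code below assumes only two: a building appearing at some frame and, optionally, being removed at a later one. The function name `make_pseudo_series` and its parameters are hypothetical and are not taken from the MapSAM2 code.

```python
import numpy as np


def make_pseudo_series(instance_mask, n_frames=4, seed=0):
    """Build a pseudo time series of instance masks from one labelled tile.

    Assumption (not from the paper): the only simulated changes are a
    building appearing at a random frame and possibly disappearing later.
    Instance IDs stay fixed across frames, so the temporal links are known.
    """
    rng = np.random.default_rng(seed)
    # Label 0 is treated as background; every other label is one building.
    ids = [i for i in np.unique(instance_mask) if i != 0]
    frames = np.zeros((n_frames,) + instance_mask.shape, dtype=instance_mask.dtype)
    for i in ids:
        start = rng.integers(0, n_frames)            # first frame with the building
        end = rng.integers(start + 1, n_frames + 1)  # exclusive end frame
        frames[start:end, instance_mask == i] = i
    return frames


# Toy tile with two square "buildings" labelled 1 and 2.
tile = np.zeros((4, 4), dtype=np.int64)
tile[0:2, 0:2] = 1
tile[2:4, 2:4] = 2
series = make_pseudo_series(tile, n_frames=5, seed=1)
print(series.shape)
```

Because each building keeps its label in every frame where it is present, the resulting stack can serve as a video-style training sample with known instance associations.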
Related papers
- Generalizable Multiscale Segmentation of Heterogeneous Map Collections [0.0]
Historical map collections are highly diverse in style, scale, and geographic focus.
Most work in map recognition focuses on specialist models tailored to homogeneous map series.
We introduce Semap, a new open benchmark dataset comprising 1,439 manually annotated patches designed to reflect the variety of historical map documents.
We present a segmentation framework that combines procedural data synthesis with multiscale integration to improve robustness and transferability.
arXiv Detail & Related papers (2026-03-05T10:40:01Z) - HisTrackMap: Global Vectorized High-Definition Map Construction via History Map Tracking [24.21124150354725]
We propose a novel end-to-end tracking framework for global map construction by temporally tracking map elements' historical trajectories.
We introduce a Map-Trajectory Prior Fusion module within this tracking framework, leveraging historical priors for tracked instances to improve temporal smoothness and continuity.
Substantial experiments on the nuScenes and Argoverse2 datasets demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods in both single-frame and temporal metrics.
arXiv Detail & Related papers (2025-03-10T10:44:43Z) - Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative [65.84249211767921]
Texts as Time Series (TaTS) can be plugged into any existing numerical-only time series models.
We show that TaTS can enhance predictive performance without modifying model architectures.
arXiv Detail & Related papers (2025-02-13T03:43:27Z) - Semantic Segmentation for Sequential Historical Maps by Learning from Only One Map [0.4915744683251151]
We propose an automated approach to digitization using deep-learning-based semantic segmentation.
A key challenge in this process is the lack of ground-truth annotations required for training deep neural networks.
We introduce a weakly-supervised age-tracing strategy for model fine-tuning.
arXiv Detail & Related papers (2025-01-03T14:55:22Z) - Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps [16.35356981558991]
We propose a novel approach that combines segmentation and association of geographic entities in historical maps using video instance segmentation (VIS).
To mitigate this challenge, we explore self-supervised learning (SSL) techniques to enhance VIS performance on historical maps.
arXiv Detail & Related papers (2024-11-26T13:31:51Z) - MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps [6.414068793245697]
We introduce MapSAM, a parameter-efficient fine-tuning strategy that adapts SAM into a prompt-free and versatile solution for historical map segmentation tasks.
Specifically, we employ Weight-Decomposed Low-Rank Adaptation (DoRA) to integrate domain-specific knowledge into the image encoder.
We develop an automatic prompt generation process, eliminating the need for manual input.
arXiv Detail & Related papers (2024-11-11T13:18:45Z) - Deep Time Series Models: A Comprehensive Survey and Benchmark [60.742416934632416]
Time series present unique challenges due to their intricate and dynamic nature.
Recent years have witnessed remarkable breakthroughs in the time series community.
We release Time Series Library (TSLib) as a fair benchmark of deep time series models for diverse analysis tasks.
arXiv Detail & Related papers (2024-07-18T08:31:55Z) - Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers [55.475142494272724]
Time series prediction is crucial for understanding and forecasting complex dynamics in various domains.
We introduce GridTST, a model that combines the benefits of two approaches using innovative multi-directional attentions.
The model consistently delivers state-of-the-art performance across various real-world datasets.
arXiv Detail & Related papers (2024-05-22T16:41:21Z) - Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical Maps [18.016789471815855]
Historical maps provide useful spatio-temporal information on the Earth's surface before modern earth observation techniques came into being.
Aleatoric uncertainty, also known as data-dependent uncertainty, is inherent in the drawing/fading defects of the original map sheets.
We propose a U-Net-based network that aggregates information over a larger range as well as through a temporal sequence.
arXiv Detail & Related papers (2023-10-19T09:49:58Z) - Novel Features for Time Series Analysis: A Complex Networks Approach [62.997667081978825]
Time series data are ubiquitous in several domains such as climate, economics and health care.
A recent conceptual approach relies on mapping time series to complex networks.
Network analysis can be used to characterize different types of time series.
arXiv Detail & Related papers (2021-10-11T13:46:28Z) - Time Series Analysis via Network Science: Concepts and Algorithms [62.997667081978825]
This review provides a comprehensive overview of existing mapping methods for transforming time series into networks.
We describe the main conceptual approaches, provide authoritative references and give insight into their advantages and limitations in a unified notation and language.
Although still very recent, this research area has much potential and with this survey we intend to pave the way for future research on the topic.
arXiv Detail & Related papers (2021-10-11T13:33:18Z) - Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language [112.32586622873731]
We address the problem of retrieving a specific moment from an untrimmed video by natural language.
We model the temporal context between video moments by a set of predefined two-dimensional maps under different temporal scales.
Based on the 2D temporal maps, we propose a Multi-Scale Temporal Adjacent Network (MS-2D-TAN), a single-shot framework for moment localization.
arXiv Detail & Related papers (2020-12-04T15:09:35Z)