Related papers: Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models

URL: http://arxiv.org/abs/2506.21826v1
Date: Fri, 27 Jun 2025 00:07:21 GMT
Title: Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models
Authors: Rafael Sterzinger, Marco Peer, Robert Sablatnig
Abstract summary: We propose a simple yet effective approach for few-shot segmentation of historical maps. We leverage the rich semantic embeddings of large vision foundation models combined with parameter-efficient fine-tuning. Our approach enables precise segmentation of diverse historical maps while drastically reducing the need for manual annotations.
Score: 1.024113475677323
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As rich sources of history, maps provide crucial insights into historical changes, yet their diverse visual representations and limited annotated data pose significant challenges for automated processing. We propose a simple yet effective approach for few-shot segmentation of historical maps, leveraging the rich semantic embeddings of large vision foundation models combined with parameter-efficient fine-tuning. Our method outperforms the state-of-the-art on the Siegfried benchmark dataset in vineyard and railway segmentation, achieving +5% and +13% relative improvements in mIoU in 10-shot scenarios and around +20% in the more challenging 5-shot setting. Additionally, it demonstrates strong performance on the ICDAR 2021 competition dataset, attaining a mean PQ of 67.3% for building block segmentation, despite not being optimized for this shape-sensitive metric, underscoring its generalizability. Notably, our approach maintains high performance even in extremely low-data regimes (10- & 5-shot), while requiring only 689k trainable parameters - just 0.21% of the total model size. Our approach enables precise segmentation of diverse historical maps while drastically reducing the need for manual annotations, advancing automated processing and analysis in the field. Our implementation is publicly available at: https://github.com/RafaelSterzinger/few-shot-map-segmentation.
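The figures quoted in the abstract (mIoU, relative improvement, and the share of trainable parameters) can be illustrated with a small Python sketch. This is not taken from the authors' implementation: the function names are hypothetical, masks are assumed to be nested lists of 0/1 values, mIoU is assumed to be a plain average of per-mask IoU scores, and an empty union is scored as 1.0 by convention. The example scores 0.60 and 0.63 are made up purely to show the arithmetic.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (prediction, target) pairs (assumed mIoU definition)."""
    scores = [iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.05 for +5%."""
    return (new - baseline) / baseline


def implied_total_params(trainable, fraction):
    """Total parameter count implied by a trainable count and its share."""
    return trainable / fraction


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(iou(pred, target))                 # 2 shared pixels out of 4 in the union
    print(relative_improvement(0.63, 0.60))  # made-up scores, roughly a +5% relative gain
    # 689k trainable parameters at 0.21% of the model (figures from the abstract)
    print(round(implied_total_params(689_000, 0.0021)))
```

Because the 0.21% share is a rounded figure, the implied total of roughly 328 million parameters is only an approximation.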

Related papers

Semantic Segmentation for Sequential Historical Maps by Learning from Only One Map [0.4915744683251151]
We propose an automated approach to digitization using deep-learning-based semantic segmentation. A key challenge in this process is the lack of ground-truth annotations required for training deep neural networks. We introduce a weakly-supervised age-tracing strategy for model fine-tuning.
arXiv Detail & Related papers (2025-01-03T14:55:22Z)
Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps [16.35356981558991]
We propose a novel approach that combines segmentation and association of geographic entities in historical maps using video instance segmentation (VIS). To mitigate this challenge, we explore self-supervised learning (SSL) techniques to enhance VIS performance on historical maps.
arXiv Detail & Related papers (2024-11-26T13:31:51Z)
MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps [6.414068793245697]
We introduce MapSAM, a parameter-efficient fine-tuning strategy that adapts SAM into a prompt-free and versatile solution for historical map segmentation tasks. Specifically, we employ Weight-Decomposed Low-Rank Adaptation (DoRA) to integrate domain-specific knowledge into the image encoder. We develop an automatic prompt generation process, eliminating the need for manual input.
arXiv Detail & Related papers (2024-11-11T13:18:45Z)
Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present the first analysis of state-of-the-art (SOA) semantic segmentation models when faced with geometric out-of-distribution (OOD) data. We propose an augmentation technique called "Organ Transplantation" to enhance generalizability. Our augmentation technique improves SOA model performance by up to 67% for RGB data and 90% for HSI data, reaching the level of in-distribution performance on real OOD test data.
arXiv Detail & Related papers (2024-08-27T19:13:15Z)
Reconsidering utility: unveiling the limitations of synthetic mobility data generation algorithms in real-life scenarios [49.1574468325115]
We evaluate the utility of five state-of-the-art synthesis approaches in terms of real-world applicability. We focus on so-called trip data that encode fine granular urban movements such as GPS-tracked taxi rides. One model fails to produce data within reasonable time and another generates too many jumps to meet the requirements for map matching.
arXiv Detail & Related papers (2024-07-03T16:08:05Z)
Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence [80.6840060272386]
This paper identifies the importance of being geometry-aware for semantic correspondence. We show that incorporating this information can markedly enhance semantic correspondence performance. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset.
arXiv Detail & Related papers (2023-11-28T18:45:13Z)
Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons". We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets. Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
arXiv Detail & Related papers (2023-03-21T15:34:50Z)
Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings [44.4879068879732]
This paper presents a complete pipeline for resolving ambiguities during the data association. Its core is a robust self-tuning data association that adapts the search area depending on the entropy of the measurements. We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe in Germany.
arXiv Detail & Related papers (2022-07-28T12:29:39Z)
PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map [58.53373202647576]
We propose PreTraM, a self-supervised pre-training scheme for trajectory forecasting. It consists of two parts: 1) Trajectory-Map Contrastive Learning, where we project trajectories and maps to a shared embedding space with cross-modal contrastive learning, and 2) Map Contrastive Learning, where we enhance map representation with contrastive learning on large quantities of HD-maps. On top of popular baselines such as AgentFormer and Trajectron++, PreTraM improves their FDE-10 by a relative 5.5% and 6.9%, respectively, on the challenging nuScenes dataset.
arXiv Detail & Related papers (2022-04-21T23:01:21Z)
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains. We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images. A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
A Comprehensive Comparison of End-to-End Approaches for Handwritten Digit String Recognition [21.522563264752577]
We evaluate different end-to-end approaches to solve the HDSR problem, particularly in two verticals: those based on object detection and those based on sequence-to-sequence representation. Our results show that the YOLO model compares favorably against segmentation-free models, with the advantage of having a shorter pipeline.
arXiv Detail & Related papers (2020-10-29T19:38:08Z)
Objectness-Aware Few-Shot Semantic Segmentation [31.13009111054977]
We show how to increase overall model capacity to achieve improved performance. We introduce objectness, which is class-agnostic and so not prone to overfitting. Given only one annotated example of an unseen category, experiments show that our method outperforms state-of-the-art methods with respect to mIoU.
arXiv Detail & Related papers (2020-04-06T19:12:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.