MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps
- URL: http://arxiv.org/abs/2411.06971v1
- Date: Mon, 11 Nov 2024 13:18:45 GMT
- Title: MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps
- Authors: Xue Xia, Daiwei Zhang, Wenxuan Song, Wei Huang, Lorenz Hurni
- Abstract summary: We introduce MapSAM, a parameter-efficient fine-tuning strategy that adapts SAM into a prompt-free and versatile solution for historical map segmentation tasks.
Specifically, we employ Weight-Decomposed Low-Rank Adaptation (DoRA) to integrate domain-specific knowledge into the image encoder.
We develop an automatic prompt generation process, eliminating the need for manual input.
- Score: 6.414068793245697
- Abstract: Automated feature detection in historical maps can significantly accelerate the reconstruction of the geospatial past. However, this process is often constrained by the time-consuming task of manually digitizing sufficient high-quality training data. The emergence of visual foundation models, such as the Segment Anything Model (SAM), offers a promising solution due to their remarkable generalization capabilities and rapid adaptation to new data distributions. Despite this, directly applying SAM in a zero-shot manner to historical map segmentation poses significant challenges, including poor recognition of certain geospatial features and a reliance on input prompts, which limits its ability to be fully automated. To address these challenges, we introduce MapSAM, a parameter-efficient fine-tuning strategy that adapts SAM into a prompt-free and versatile solution for various downstream historical map segmentation tasks. Specifically, we employ Weight-Decomposed Low-Rank Adaptation (DoRA) to integrate domain-specific knowledge into the image encoder. Additionally, we develop an automatic prompt generation process, eliminating the need for manual input. We further enhance the positional prompt in SAM, transforming it into a higher-level positional-semantic prompt, and modify the cross-attention mechanism in the mask decoder with masked attention for more effective feature aggregation. The proposed MapSAM framework demonstrates promising performance across two distinct historical map segmentation tasks: one focused on linear features and the other on areal features. Experimental results show that it adapts well to various features, even when fine-tuned with extremely limited data (e.g. 10 shots).
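The abstract's core adaptation mechanism, Weight-Decomposed Low-Rank Adaptation (DoRA), splits each pretrained weight into a learnable magnitude and a direction that receives a low-rank update. The sketch below is a minimal, illustrative PyTorch version of how such a layer could wrap a frozen projection inside SAM's image encoder; the class name, hyperparameters, and usage line are assumptions for illustration and are not taken from the MapSAM code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoRALinear(nn.Module):
    """Illustrative DoRA wrapper around a frozen pretrained nn.Linear
    (e.g. a q/v projection in a ViT block of SAM's image encoder).
    The weight is decomposed into a learnable magnitude vector and a
    direction updated through a low-rank (LoRA-style) term."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # keep pretrained weights frozen
            p.requires_grad = False

        out_dim, in_dim = base.weight.shape
        self.scaling = alpha / rank
        # Low-rank factors of the directional update: delta_W = B @ A
        self.lora_A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_dim, rank))
        # Magnitude vector, initialised with the column-wise norms of W0
        self.magnitude = nn.Parameter(
            base.weight.detach().norm(p=2, dim=0, keepdim=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Direction: pretrained weight plus the low-rank update, column-normalised
        w = self.base.weight + self.scaling * (self.lora_B @ self.lora_A)
        w = self.magnitude * (w / w.norm(p=2, dim=0, keepdim=True))
        return F.linear(x, w, self.base.bias)


# Hypothetical usage: swap one projection of a SAM encoder block so that
# only the DoRA parameters (magnitude + low-rank factors) are trained.
# block.attn.qkv = DoRALinear(block.attn.qkv, rank=4)
```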
Related papers
- Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability.
Existing methods that directly apply SAM through prompting often overlook the domain shift issue.
We propose a novel Self-Perception Tuning (SPT) method, aiming to enhance SAM's perception capability for anomaly segmentation.
arXiv Detail & Related papers (2024-11-26T08:33:25Z) - Joint-Optimized Unsupervised Adversarial Domain Adaptation in Remote Sensing Segmentation with Prompted Foundation Model [32.03242732902217]
This paper addresses the challenge of adapting a model trained on source domain data to target domain samples.
We propose a joint-optimized adversarial network incorporating the Segment Anything Model (SAM), termed SAM-JOANet.
arXiv Detail & Related papers (2024-11-08T02:15:20Z) - AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model [28.343378406337077]
We propose an automated prompting and mask calibration method called AM-SAM.
Our approach automatically generates prompts for an input image, eliminating the need for human involvement with a good performance in early training epochs.
Our experimental results demonstrate that AM-SAM achieves highly accurate segmentation, matching or exceeding the effectiveness of human-generated and default prompts.
arXiv Detail & Related papers (2024-10-13T03:47:20Z) - Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts [14.631774737903015]
Existing perception models achieve great success by learning from large amounts of labeled data, but they still struggle with open-world scenarios.
We present a new setting, i.e., open-ended object detection, which discovers unseen objects without any object categories as input.
We show that our method surpasses the previous open-ended method on the object detection task and can provide additional instance segmentation masks.
arXiv Detail & Related papers (2024-10-08T12:15:08Z) - Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z) - AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z) - DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model [15.803614800117781]
We propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks.
By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced.
Our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts.
arXiv Detail & Related papers (2024-05-03T11:16:27Z) - Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) it improves SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z) - RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z) - Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism [27.628260249895973]
The self-boosting attention mechanism (SAM) is a novel method for regularizing the network to focus on the key regions shared across samples and classes.
We develop a variant that uses SAM to create multiple attention maps, which pool convolutional feature maps in the style of bilinear pooling.
arXiv Detail & Related papers (2022-08-01T05:36:27Z) - CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)