Related papers: Promptable Foundation Models for SAR Remote Sensing: Adapting the Segment Anything Model for Snow Avalanche Segmentation

Promptable Foundation Models for SAR Remote Sensing: Adapting the Segment Anything Model for Snow Avalanche Segmentation

URL: http://arxiv.org/abs/2601.01213v1
Date: Sat, 03 Jan 2026 15:41:12 GMT
Title: Promptable Foundation Models for SAR Remote Sensing: Adapting the Segment Anything Model for Snow Avalanche Segmentation
Authors: Riccardo Gelato, Carlo Sgaravatti, Jakob Grahn, Giacomo Boracchi, Filippo Maria Bianchi,
Abstract summary: Training an effective detection model requires gathering a large dataset with high-quality annotations from domain experts.<n>We build on the Segment Anything Model (SAM), a segmentation foundation model trained on natural images, and tailor it to Sentinel-1 SAR data.<n>We tackle these challenges through a combination of adapters to mitigate the domain gap, multiple encoders to handle multi-channel SAR inputs, prompt-engineering strategies to improve avalanche localization accuracy, and a training algorithm that limits the training time of the encoder.
Score: 18.28485164485434
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Remote sensing solutions for avalanche segmentation and mapping are key to supporting risk forecasting and mitigation in mountain regions. Synthetic Aperture Radar (SAR) imagery from Sentinel-1 can be effectively used for this task, but training an effective detection model requires gathering a large dataset with high-quality annotations from domain experts, which is prohibitively time-consuming. In this work, we aim to facilitate and accelerate the annotation of SAR images for avalanche mapping. We build on the Segment Anything Model (SAM), a segmentation foundation model trained on natural images, and tailor it to Sentinel-1 SAR data. Adapting SAM to our use-case requires addressing several domain-specific challenges: (i) domain mismatch, since SAM was not trained on satellite/SAR imagery; (ii) input adaptation, because SAR products typically provide more than three channels, while SAM is constrained to RGB images; (iii) robustness to imprecise prompts that can affect target identification and degrade the segmentation quality, an issue exacerbated in small, low-contrast avalanches; and (iv) training efficiency, since standard fine-tuning is computationally demanding for SAM. We tackle these challenges through a combination of adapters to mitigate the domain gap, multiple encoders to handle multi-channel SAR inputs, prompt-engineering strategies to improve avalanche localization accuracy, and a training algorithm that limits the training time of the encoder, which is recognized as the major bottleneck. We integrate the resulting model into an annotation tool and show experimentally that it speeds up the annotation of SAR images.

Related papers

SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images [51.42466259821335]
We present a preliminary exploration of applying SAM 3 to the remote sensing OVSS task without any training.<n>First, we implement a mask fusion strategy that combines the outputs from SAM 3's semantic segmentation head and the Transformer decoder.<n>Second, we utilize the presence score from the presence head to filter out categories that do not exist in the scene.
arXiv Detail & Related papers (2025-12-09T15:42:28Z)
TASAM: Terrain-and-Aware Segment Anything Model for Temporal-Scale Remote Sensing Segmentation [20.89385225170904]
Segment Anything Model (SAM) has demonstrated impressive zero-shot segmentation capabilities across natural image domains.<n>We introduce TASAM, a terrain and temporally-aware extension of SAM designed specifically for high-resolution remote sensing image segmentation.
arXiv Detail & Related papers (2025-09-19T09:24:24Z)
Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images [51.74614065919118]
This paper introduces SegEarth-OV, the first framework for annotation-free open-vocabulary segmentation of RS images.<n>We propose SimFeatUp, a universal upsampler that robustly restores high-resolution spatial details from coarse features.<n>We also present a simple yet effective Global Bias Alleviation operation to subtract the inherent global context from patch features.
arXiv Detail & Related papers (2025-08-25T14:22:57Z)
ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial Attack [56.0056378072843]
We show that highly transferable adversarial images can be crafted solely via the upstream model.<n>We propose ForensicsSAM, a unified IFDL framework with built-in adversarial robustness.
arXiv Detail & Related papers (2025-08-10T16:03:44Z)
SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling [41.24071764578782]
Object detection in satellite-borne Synthetic Aperture Radar imagery holds immense potential in tasks such as urban monitoring and disaster response.<n>The detection of small objects in satellite-borne SAR images poses a particularly intricate problem, because of the technology's relatively low spatial resolution and inherent noise.<n>In this paper, we introduce TRANSAR, a novel self-supervised end-to-end vision transformer-based SAR object detection model.
arXiv Detail & Related papers (2025-04-17T19:44:05Z)
Joint-Optimized Unsupervised Adversarial Domain Adaptation in Remote Sensing Segmentation with Prompted Foundation Model [32.03242732902217]
This paper addresses the challenge of adapting a model trained on source domain data to target domain samples. We propose a joint-optimized adversarial network incorporating the "Segment Anything Model (SAM) (SAM-JOANet)"
arXiv Detail & Related papers (2024-11-08T02:15:20Z)
Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance (UOIS) is crucial for autonomous robots operating in unstructured environments. We propose UOIS-SAM, a data-efficient solution for the UOIS task. UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks. To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation [10.37240769959699]
Segment Anything Model (SAM) provides a universal pre-training model for image segmentation tasks. We propose RSAM-Seg, which stands for Remote Sensing SAM with Semantic, as a tailored modification of SAM for the remote sensing field. Adapter-Scale, a set of supplementary scaling modules, are proposed in the multi-head attention blocks of the encoder part of SAM. Experiments are conducted on four distinct remote sensing scenarios, encompassing cloud detection, field monitoring, building detection and road mapping tasks.
arXiv Detail & Related papers (2024-02-29T09:55:46Z)
ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation [6.229326337093342]
Segment Anything Model (SAM) excels in various segmentation scenarios relying on semantic information and generalization ability. The ClassWiseSAM-Adapter (CWSAM) is designed to adapt the high-performing SAM for landcover classification on space-borne Synthetic Aperture Radar (SAR) images. CWSAM showcases enhanced performance with fewer computing resources.
arXiv Detail & Related papers (2024-01-04T15:54:45Z)
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery. We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts. This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities. Our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.