Related papers: SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation

URL: http://arxiv.org/abs/2503.12404v2
Date: Sun, 21 Sep 2025 01:18:09 GMT
Title: SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation
Authors: Jianhao Yang, Wenshuo Yu, Yuanchao Lv, Jiance Sun, Bokang Sun, Mingyang Liu,
Abstract summary: We introduce a novel label enhancement and automatic annotation framework, termed SAM2-ELNet. Specifically, we employ the frozen Hiera backbone from the pretrained SAM2 as the encoder, while fine-tuning the adapter and decoder. We design a series of experiments targeting resource-limited remote sensing tasks and evaluate our method on two datasets.
Score: 2.5292915978887387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Remote sensing image segmentation is crucial for environmental monitoring, disaster assessment, and resource management, but its performance largely depends on the quality of the dataset. Although several high-quality datasets are broadly accessible, data scarcity remains for specialized tasks like marine oil spill segmentation. Such tasks still rely on manual annotation, which is both time-consuming and influenced by subjective human factors. The segment anything model 2 (SAM2) has strong potential as an automatic annotation framework but struggles to perform effectively on heterogeneous, low-contrast remote sensing imagery. To address these challenges, we introduce a novel label enhancement and automatic annotation framework, termed SAM2-ELNet (Enhancement and Labeling Network). Specifically, we employ the frozen Hiera backbone from the pretrained SAM2 as the encoder, while fine-tuning the adapter and decoder for different remote sensing tasks. In addition, the proposed framework includes a label quality evaluator for filtering, ensuring the reliability of the generated labels. We design a series of experiments targeting resource-limited remote sensing tasks and evaluate our method on two datasets: the Deep-SAR Oil Spill (SOS) dataset with Synthetic Aperture Radar (SAR) imagery, and the CHN6-CUG Road dataset with Very High Resolution (VHR) optical imagery. The proposed framework can enhance coarse annotations and generate reliable training data under resource-limited conditions. Fine-tuned on only 30% of the training data, it generates automatically labeled data. A model trained solely on these achieves slightly lower performance than using the full original annotations, while greatly reducing labeling costs and offering a practical solution for large-scale remote sensing interpretation.

Related papers

SAM2Auto: Auto Annotation Using FLASH [13.638155035372835]
Vision-Language Models (VLMs) lag behind Large Language Models due to the scarcity of annotated datasets. We introduce SAM2Auto, the first fully automated annotation pipeline for video datasets requiring no human intervention or dataset-specific training. Our system employs statistical approaches to minimize detection errors while ensuring consistent object tracking throughout entire video sequences.
arXiv Detail & Related papers (2025-06-09T15:15:15Z)
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation [66.66243874361103]
Dataset generation faces two key challenges: 1) aligning generated samples with the target domain and 2) producing informative samples beyond the training data. We propose Concept-Aware LoRA, a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts for domain alignment. We demonstrate its effectiveness in generating datasets for urban-scene segmentation, outperforming baseline and state-of-the-art methods in in-domain settings.
arXiv Detail & Related papers (2025-03-28T06:23:29Z)
ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object [14.931975623642169]
Small object sizes, ambiguous features, and limited generalization make it difficult for current methods to achieve this goal. We propose ROS-SAM, a method designed to achieve high-quality interactive segmentation while preserving generalization across diverse remote sensing data. ROS-SAM is built upon three key innovations: 1) LoRA-based fine-tuning, which enables efficient domain adaptation while maintaining SAM's generalization ability, 2) enhancement of deep network layers to improve the discriminability of extracted features, thereby reducing misclassifications, and 3) integration of global context with local boundary details in the mask decoder to generate high-quality segmentation masks.
arXiv Detail & Related papers (2025-03-15T06:10:09Z)
TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework. We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data. Our approach improves pseudo-label quality in two distinct manners.
arXiv Detail & Related papers (2024-09-17T05:35:00Z)
ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model [32.91528641298171]
We introduce an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Remote Sensing). We leverage the Segment Anything Model (SAM) to predict precise pseudo-labels for RS images without necessitating prior annotations or additional prompts. Our approach enhances the performance of downstream tasks across various benchmarks, including iSAID and ISPRS Potsdam.
arXiv Detail & Related papers (2024-06-16T09:02:01Z)
Task Specific Pretraining with Noisy Labels for Remote Sensing Image Segmentation [18.598405597933752]
Self-supervision provides remote sensing a tool to reduce the amount of exact, human-crafted geospatial annotations. In this work, we propose to exploit noisy semantic segmentation maps for model pretraining. The results from two datasets indicate the effectiveness of task-specific supervised pretraining with noisy labels.
arXiv Detail & Related papers (2024-02-25T18:01:42Z)
Debiased Learning for Remote Sensing Data [29.794246747637104]
We propose a highly effective semi-supervised approach tailored specifically to remote sensing data. First, we adapt the FixMatch framework to remote sensing data by designing robust strong and weak augmentations suitable for this domain. Second, we develop an effective semi-supervised learning method by removing bias in imbalanced training data resulting from both actual labels and pseudo-labels predicted by the model.
arXiv Detail & Related papers (2023-12-24T03:33:30Z)
Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction from LiDAR Data with Limited Annotations [1.3243401820948064]
Building footprint maps offer promise of precise footprint extraction without extensive post-processing. Deep learning methods face challenges in generalization and label efficiency. We propose terrain-aware self-supervised learning tailored to remote sensing.
arXiv Detail & Related papers (2023-11-02T12:34:23Z)
Robust Feature Learning Against Noisy Labels [0.2082426271304908]
Mislabeled samples can significantly degrade the generalization of models. Progressive self-bootstrapping is introduced to minimize the negative impact of supervision from noisy labels. Experimental results show that our proposed method can efficiently and effectively enhance model robustness under severely noisy labels.
arXiv Detail & Related papers (2023-07-10T02:55:35Z)
A generic self-supervised learning (SSL) framework for representation learning from spectra-spatial feature of unlabeled remote sensing imagery [4.397725469518669]
Self-supervised learning (SSL) enables the models to learn a representation from orders of magnitude more unlabelled data. This work has designed a novel SSL framework that is capable of learning representation from both spectra-spatial information of unlabelled data.
arXiv Detail & Related papers (2023-06-27T23:50:43Z)
Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmatic weak supervision is a growing paradigm within machine learning. We propose Losses over Labels (LoL) as it creates losses directly from heuristics without going through the intermediate step of a label. We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds. Our method co-designs an efficient labeling process with semi/weakly supervised learning. Our proposed method is even highly competitive compared to the fully supervised counterpart with 100% labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels [96.56299163691979]
This paper focuses on a new weakly-supervised salient object detection (SOD) task under hybrid labels. To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies. Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods.
arXiv Detail & Related papers (2022-09-07T06:45:39Z)
Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally has higher pixel-labeling quality without the effort of manual annotations. We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
PointMatch: A Consistency Training Framework for Weakly Supervised Semantic Segmentation of 3D Point Clouds [117.77841399002666]
We propose a novel framework, PointMatch, that stands on both data and label, by applying consistency regularization to sufficiently probe information from data itself. The proposed PointMatch achieves the state-of-the-art performance under various weakly-supervised schemes on both ScanNet-v2 and S3DIS datasets.
arXiv Detail & Related papers (2022-02-22T07:26:31Z)
Evaluating Self and Semi-Supervised Methods for Remote Sensing Segmentation Tasks [4.7590051176368915]
We evaluate recent self and semi-supervised ML techniques that leverage unlabeled data for improving downstream task performance. These methods are especially valuable for remote sensing tasks since there is easy access to unlabeled imagery and getting ground truth labels can often be expensive.
arXiv Detail & Related papers (2021-11-19T07:41:14Z)
Active Learning for Improved Semi-Supervised Semantic Segmentation in Satellite Images [1.0152838128195467]
Semi-supervised techniques generate pseudo-labels from a small set of labeled examples. We propose to use an active learning-based sampling strategy to select a highly representative set of labeled training data. We report a 27% improvement in mIoU with as little as 2% labeled data using active learning sampling strategies.
arXiv Detail & Related papers (2021-10-15T00:29:31Z)
Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels. We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps. Our method not only outperforms existing weakly-supervised/unsupervised methods, but also is on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)
EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system. It can be trained in one shot on both fully and weakly-annotated data. It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.