Related papers: Continual Learning for Segment Anything Model Adaptation

URL: http://arxiv.org/abs/2412.06418v2
Date: Tue, 17 Dec 2024 13:49:59 GMT
Title: Continual Learning for Segment Anything Model Adaptation
Authors: Jinglong Yang, Yichen Wu, Jun Cen, Wenjian Huang, Hong Wang, Jianguo Zhang
Abstract summary: We propose a novel Continual SAM adaptation (CoSAM) benchmark with 8 different task domains. We then propose a novel simple-yet-effective Mixture of Domain Adapters (MoDA) algorithm to help the SAM encoder extract well-separated features for different task domains. Our MoDA maintains highly competitive results in the natural image domain, approaching the zero-shot performance of the original SAM.
Score: 14.00191851894315
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although the current different types of SAM adaptation methods have achieved promising performance for various downstream tasks, such as prompt-based ones and adapter-based ones, most of them belong to the one-step adaptation paradigm. In real-world scenarios, we are generally confronted with the dynamic scenario where the data comes in a streaming manner. Driven by the practical need, in this paper, we first propose a novel Continual SAM adaptation (CoSAM) benchmark with 8 different task domains and carefully analyze the limitations of the existing SAM one-step adaptation methods in the continual segmentation scenario. Then we propose a novel simple-yet-effective Mixture of Domain Adapters (MoDA) algorithm which utilizes the Global Feature Tokens (GFT) and Global Assistant Tokens (GAT) modules to help the SAM encoder extract well-separated features for different task domains, and then provide the accurate task-specific information for continual learning. Extensive experiments demonstrate that our proposed MoDA obviously surpasses the existing classic continual learning methods, as well as prompt-based and adapter-based approaches for continual segmentation. Moreover, after sequential learning on the CoSAM benchmark with diverse data distributions, our MoDA maintains highly competitive results in the natural image domain, approaching the zero-shot performance of the original SAM, demonstrating its superior capability in knowledge preservation. Notably, the proposed MoDA can be seamlessly integrated into various one-step adaptation methods of SAM, which can consistently bring obvious performance gains. Code is available at https://github.com/yangjl1215/CoSAM

Related papers

InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective [9.466559751950639]
The Segment Anything Model (SAM) exhibits impressive zero-shot capabilities in general tasks but struggles in specialized domains. We propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. Experiments across diverse benchmarks validate InfoSAM's effectiveness in improving SAM family's performance on real-world tasks.
arXiv Detail & Related papers (2025-05-28T03:09:22Z)
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement [40.37217744643069]
We propose a universal and efficient approach by adapting SAM to the mask refinement task. Specifically, we introduce a multi-prompt excavation strategy to mine diverse input prompts for SAM. We extend our method to SAMRefiner++ by introducing an additional IoU adaption step to further boost the performance of the generic SAMRefiner on the target dataset.
arXiv Detail & Related papers (2025-02-10T18:33:15Z)
Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts [17.6980007370549]
We make the first attempt to adapt Segment Anything Model (SAM) for multi-modal semantic segmentation. By training only the MoE-LoRA layers while keeping SAM's weights frozen, SAM's strong generalization and segmentation capabilities can be preserved for downstream tasks. Specifically, to address cross-modal inconsistencies, we propose a novel MoE routing strategy that adaptively generates weighted features across modalities.
arXiv Detail & Related papers (2024-12-05T14:54:31Z)
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability. Existing methods that directly apply SAM through prompting often overlook the domain shift issue. We propose a novel Self-Perception Tuning (SPT) method, aiming to enhance SAM's perception capability for anomaly segmentation.
arXiv Detail & Related papers (2024-11-26T08:33:25Z)
On Efficient Variants of Segment Anything Model: A Survey [63.127753705046]
The Segment Anything Model (SAM) is a foundational model for image segmentation tasks, known for its strong generalization across diverse applications. To address this, a variety of SAM variants have been proposed to enhance efficiency while keeping accuracy. This survey provides the first comprehensive review of these efficient SAM variants.
arXiv Detail & Related papers (2024-10-07T11:59:54Z)
TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [44.134340976905655]
This work proposes a task-adaptive prompt framework based on the Segment Anything Model (SAM). It uses a unique generative approach to prompts alongside a comprehensive model structure and specialized prototype computation. After task-specific and weighted guidance, the abundant feature information of SAM can be better learned for cross-domain few-shot segmentation.
arXiv Detail & Related papers (2024-09-09T07:43:58Z)
UniTTA: Unified Benchmark and Versatile Framework Towards Realistic Test-Time Adaptation [66.05528698010697]
Test-Time Adaptation aims to adapt pre-trained models to the target domain during testing. Researchers have identified various challenging scenarios and developed diverse methods to address these challenges. We propose a Unified Test-Time Adaptation benchmark, which is comprehensive and widely applicable.
arXiv Detail & Related papers (2024-07-29T15:04:53Z)
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence [60.37934652213881]
Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain. This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation. We present learn from the learnt (LFTL), a novel paradigm for SFADA to leverage the learnt knowledge from the source pretrained model and actively iterated models without extra overhead.
arXiv Detail & Related papers (2024-07-26T17:51:58Z)
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts. We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
ASAM: Boosting Segment Anything Model with Adversarial Tuning [9.566046692165884]
This paper introduces ASAM, a novel methodology that amplifies a foundation model's performance through adversarial tuning. We harness the potential of natural adversarial examples, inspired by their successful implementation in natural language processing. Our approach maintains the photorealism of adversarial examples and ensures alignment with original mask annotations.
arXiv Detail & Related papers (2024-05-01T00:13:05Z)
GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation [22.344399402787644]
This paper tackles a novel yet challenging problem: how to transfer knowledge from the emerging Segment Anything Model (SAM). We propose a framework, called GoodSAM, that introduces a teacher assistant (TA) to provide semantic information, integrated with SAM to generate ensemble logits. Experiments on two benchmarks show that our GoodSAM achieves a remarkable +3.75% mIoU improvement over the state-of-the-art (SOTA) domain adaptation methods.
arXiv Detail & Related papers (2024-03-25T02:30:32Z)
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything [117.02741621686677]
This work explores a novel real-time segmentation setting called real-time multi-purpose segmentation. It contains three fundamental sub-tasks: interactive segmentation, panoptic segmentation, and video instance segmentation. We present a novel dynamic convolution-based method, Real-Time Multi-Purpose SAM (RMP-SAM). It contains an efficient encoder and an efficient decoupled adapter to perform prompt-driven decoding.
arXiv Detail & Related papers (2024-01-18T18:59:30Z)
Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.24734826209367]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model. Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics. We present Sambor to seamlessly integrate SAM with the open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z)
Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts. This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities. Our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z)
Divide and Adapt: Active Domain Adaptation via Customized Learning [56.79144758380419]
We present Divide-and-Adapt (DiaNA), a new ADA framework that partitions the target instances into four categories with stratified transferable properties. With a novel data subdivision protocol based on uncertainty and domainness, DiaNA can accurately recognize the most gainful samples. Thanks to the "divide-and-adapt" spirit, DiaNA can handle data with large variations of domain gap.
arXiv Detail & Related papers (2023-07-21T14:37:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.