InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective
- URL: http://arxiv.org/abs/2505.21920v2
- Date: Tue, 03 Jun 2025 06:01:35 GMT
- Title: InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective
- Authors: Yuanhong Zhang, Muyao Yuan, Weizhan Zhang, Tieliang Gong, Wen Wen, Jiangyong Ying, Weijie Shi,
- Abstract summary: The Segment Anything Model (SAM) exhibits impressive zero-shot capabilities in general tasks but struggles in specialized domains. We propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. Experiments across diverse benchmarks validate InfoSAM's effectiveness in improving the SAM family's performance on real-world tasks.
- Score: 9.466559751950639
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Segment Anything Model (SAM), a vision foundation model, exhibits impressive zero-shot capabilities in general tasks but struggles in specialized domains. Parameter-efficient fine-tuning (PEFT) is a promising approach to unleash the potential of SAM in novel scenarios. However, existing PEFT methods for SAM neglect the domain-invariant relations encoded in the pre-trained model. To bridge this gap, we propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. Specifically, we formulate the knowledge transfer process as two novel mutual information-based objectives: (i) to compress the domain-invariant relation extracted from pre-trained SAM, excluding pseudo-invariant information as much as possible, and (ii) to maximize mutual information between the relational knowledge learned by the teacher (pre-trained SAM) and the student (fine-tuned model). The proposed InfoSAM establishes a robust distillation framework for PEFT of SAM. Extensive experiments across diverse benchmarks validate InfoSAM's effectiveness in improving the SAM family's performance on real-world tasks, demonstrating its adaptability and superiority in handling specialized scenarios.
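The two objectives above lend themselves to standard mutual-information surrogates. Below is a minimal PyTorch sketch of such a relation-distillation loss, assuming an InfoNCE lower bound for the teacher-student MI term and a simple energy penalty as the compression term; the relation module, names, and surrogates are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of an InfoSAM-style relation-distillation loss.
# All module names and MI surrogates below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelationModule(nn.Module):
    """Projects encoder/decoder features into a shared relation embedding (hypothetical design)."""

    def __init__(self, feat_dim: int, rel_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(feat_dim, rel_dim), nn.ReLU(), nn.Linear(rel_dim, rel_dim)
        )

    def forward(self, img_feat: torch.Tensor, mask_feat: torch.Tensor) -> torch.Tensor:
        # Model the relation as an interaction between image-encoder and mask-decoder features.
        return self.proj(img_feat * mask_feat)


def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE loss: minimizing it maximizes a lower bound on I(anchor; positive)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature               # (B, B) similarities, in-batch negatives
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)


def relation_distill_loss(teacher_rel: torch.Tensor, student_rel: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    """(ii) maximize teacher-student relational MI; (i) compress the teacher relation."""
    distill = info_nce(student_rel, teacher_rel)
    # Information-bottleneck-style surrogate for the compression objective:
    # penalize relation energy so pseudo-invariant detail is squeezed out (assumed surrogate).
    compress = 0.5 * teacher_rel.pow(2).mean()
    return distill + beta * compress


if __name__ == "__main__":
    batch, feat_dim, rel_dim = 8, 256, 128
    teacher_head = RelationModule(feat_dim, rel_dim)   # relation head over frozen, pre-trained SAM features
    student_head = RelationModule(feat_dim, rel_dim)   # relation head over the PEFT-tuned student's features
    img_f, mask_f = torch.randn(batch, feat_dim), torch.randn(batch, feat_dim)
    loss = relation_distill_loss(teacher_head(img_f, mask_f), student_head(img_f, mask_f))
    loss.backward()
```

In practice the teacher features would come from the frozen SAM and the student features from the PEFT-adapted model, with this loss added alongside the usual segmentation objective.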
Related papers
- Continual Learning for Segment Anything Model Adaptation [14.00191851894315]
We propose a novel Continual SAM adaptation (CoSAM) benchmark with 8 different task domains. We then propose a simple-yet-effective Mixture of Domain Adapters (MoDA) algorithm that helps the SAM encoder extract well-separated features for different task domains (a hypothetical sketch of such a layer follows this list). Our MoDA maintains highly competitive results in the natural image domain, approaching the zero-shot performance of the original SAM.
arXiv Detail & Related papers (2024-12-09T11:51:28Z)
- Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
We propose a novel Self-Perception Tuning (SPT) method for anomaly segmentation. The SPT method incorporates a self-drafting tuning strategy, which generates an initial coarse draft of the anomaly mask, followed by a refinement process.
arXiv Detail & Related papers (2024-11-26T08:33:25Z)
- On Efficient Variants of Segment Anything Model: A Survey [63.127753705046]
The Segment Anything Model (SAM) is a foundational model for image segmentation tasks, known for its strong generalization across diverse applications. To address its heavy computational cost, a variety of SAM variants have been proposed to enhance efficiency while maintaining accuracy. This survey provides the first comprehensive review of these efficient SAM variants.
arXiv Detail & Related papers (2024-10-07T11:59:54Z)
- SAM-SP: Self-Prompting Makes SAM Great Again [11.109389094334894]
Segment Anything Model (SAM) has demonstrated impressive capabilities in zero-shot segmentation tasks.
SAM encounters noticeable performance degradation when applied to specific domains, such as medical images.
We introduce a novel self-prompting based fine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM model.
arXiv Detail & Related papers (2024-08-22T13:03:05Z)
- AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
- ASAM: Boosting Segment Anything Model with Adversarial Tuning [9.566046692165884]
This paper introduces ASAM, a novel methodology that amplifies a foundation model's performance through adversarial tuning.
We harness the potential of natural adversarial examples, inspired by their successful implementation in natural language processing.
Our approach maintains the photorealism of adversarial examples and ensures alignment with original mask annotations.
arXiv Detail & Related papers (2024-05-01T00:13:05Z)
- GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation [22.344399402787644]
This paper tackles a novel yet challenging problem: how to transfer knowledge from the emerging Segment Anything Model (SAM) to a compact model for distortion-aware panoramic semantic segmentation.
We propose a framework, called GoodSAM, that introduces a teacher assistant (TA) to provide semantic information, integrated with SAM to generate ensemble logits.
Experiments on two benchmarks show that our GoodSAM achieves a remarkable +3.75% mIoU improvement over the state-of-the-art (SOTA) domain adaptation methods.
arXiv Detail & Related papers (2024-03-25T02:30:32Z)
- Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively [69.97238935096094]
The Open-Vocabulary SAM is a SAM-inspired model designed for simultaneous interactive segmentation and recognition.
Our method can segment and recognize approximately 22,000 classes.
arXiv Detail & Related papers (2024-01-05T18:59:22Z)
- Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.24734826209367]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model. Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics. We present Sambor to seamlessly integrate SAM with an open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z)
- Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) improving SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z)
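For the CoSAM entry above, a mixture-of-domain-adapters layer can be sketched as follows. The bottleneck-adapter design and the soft gating over domains are assumptions for illustration, not the paper's exact routing or placement inside the SAM encoder.

```python
# Hedged sketch of a mixture-of-domain-adapters layer in the spirit of CoSAM's MoDA.
# Adapter shape, gating, and placement are assumptions for illustration.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """A small residual adapter intended to sit after a frozen encoder block."""

    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down, self.up = nn.Linear(dim, bottleneck), nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))


class MixtureOfDomainAdapters(nn.Module):
    """Keeps one adapter per task domain and mixes them with a learned gate,
    so features for different domains stay well separated (hypothetical gating)."""

    def __init__(self, dim: int, num_domains: int):
        super().__init__()
        self.adapters = nn.ModuleList(BottleneckAdapter(dim) for _ in range(num_domains))
        self.gate = nn.Linear(dim, num_domains)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) encoder features; gate on the mean token.
        weights = torch.softmax(self.gate(x.mean(dim=1)), dim=-1)   # (batch, num_domains)
        outs = torch.stack([a(x) for a in self.adapters], dim=1)    # (batch, num_domains, tokens, dim)
        return (weights[:, :, None, None] * outs).sum(dim=1)


if __name__ == "__main__":
    layer = MixtureOfDomainAdapters(dim=256, num_domains=8)
    feats = torch.randn(2, 64, 256)
    print(layer(feats).shape)  # torch.Size([2, 64, 256])
```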
This list is automatically generated from the titles and abstracts of the papers on this site.