TinySAM: Pushing the Envelope for Efficient Segment Anything Model
- URL: http://arxiv.org/abs/2312.13789v2
- Date: Sat, 9 Mar 2024 08:31:47 GMT
- Title: TinySAM: Pushing the Envelope for Efficient Segment Anything Model
- Authors: Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li,
Yunhe Wang, Xinghao Chen
- Abstract summary: We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining strong zero-shot performance.
We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategies to distill a lightweight student model.
We also adapt post-training quantization to the promptable segmentation task to further reduce the computational cost.
- Score: 76.21007576954035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the segment anything model (SAM) has shown powerful
segmentation capability and has drawn great attention in the computer vision
field. Numerous follow-up works have developed various applications based on
the pretrained SAM and achieved impressive performance on downstream vision
tasks. However, SAM has a heavy architecture and requires massive
computational capacity, which hinders its further application on
computation-constrained edge devices. To this end, in this paper we propose a
framework to obtain a tiny segment anything model (TinySAM) while maintaining
strong zero-shot performance. We first propose a full-stage knowledge
distillation method with hard prompt sampling and hard mask weighting
strategies to distill a lightweight student model. We also adapt post-training
quantization to the promptable segmentation task and further reduce the
computational cost. Moreover, a hierarchical segmenting-everything strategy is
proposed to accelerate everything-mode inference by $2\times$ with almost no
performance degradation. With all these proposed methods, our TinySAM achieves
an orders-of-magnitude computational reduction and pushes the envelope for the
efficient segment anything task. Extensive experiments on various zero-shot
transfer tasks demonstrate the significant performance advantage of our
TinySAM over counterpart methods. Pre-trained models and code are available
at https://github.com/xinghaochen/TinySAM and
https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
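As a rough illustration of the distillation component, the sketch below weights per-mask distillation losses by how poorly the student currently fits each mask, which is one plausible reading of "hard mask weighting". The `student`/`teacher` interfaces and the weighting heuristic are our own assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, image, prompts):
    """One knowledge-distillation step with a hard-mask-weighting heuristic.

    `student` and `teacher` are SAM-like callables returning mask logits of
    shape (B, H, W) for (image, prompts). Illustrative sketch only.
    """
    with torch.no_grad():
        t_logits = teacher(image, prompts)
    s_logits = student(image, prompts)

    # Per-mask MSE between student and teacher mask logits.
    per_mask = F.mse_loss(s_logits, t_logits, reduction="none").mean(dim=(1, 2))

    # Hard mask weighting: up-weight the masks the student fits worst.
    weights = (per_mask / (per_mask.mean() + 1e-8)).detach()
    return (weights * per_mask).mean()
```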
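The abstract only names post-training quantization; a generic min-max calibration scheme (not necessarily what TinySAM uses) looks like the following, with fake quantization so accuracy can be simulated in floating point:

```python
import torch

def calibrate_scale(activations, num_bits=8):
    """Symmetric min-max scale from a list of calibration tensors.
    Generic PTQ calibration, not TinySAM's exact scheme."""
    max_abs = max(t.abs().max().item() for t in activations)
    qmax = 2 ** (num_bits - 1) - 1
    return max(max_abs, 1e-8) / qmax

def fake_quantize(t, scale, num_bits=8):
    """Round to the integer grid and map back to float ("fake" quantization)."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(torch.round(t / scale), -qmax - 1, qmax)
    return q * scale
```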
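The hierarchical everything-mode strategy is likewise only named; a plausible reading is a coarse-to-fine point grid, where dense points are evaluated only outside regions already covered by confident masks. `segment_points` and the grid sizes below are hypothetical.

```python
import numpy as np

def hierarchical_everything(segment_points, image, coarse=8, fine=32, conf_thr=0.9):
    """Two-stage everything-mode inference: coarse point grid first, then
    fine-grid points only where no confident mask was found yet.

    `segment_points(image, points)` is assumed to return two lists:
    boolean masks of shape (H, W) and their confidence scores.
    """
    h, w = image.shape[:2]

    def grid(n):
        ys = np.linspace(0, h - 1, n).astype(int)
        xs = np.linspace(0, w - 1, n).astype(int)
        return [(x, y) for y in ys for x in xs]

    masks, scores = segment_points(image, grid(coarse))
    covered = np.zeros((h, w), dtype=bool)
    for m, s in zip(masks, scores):
        if s >= conf_thr:
            covered |= m

    # Skip fine-grid points already inside a confident coarse mask.
    remaining = [(x, y) for (x, y) in grid(fine) if not covered[y, x]]
    if remaining:
        fine_masks, fine_scores = segment_points(image, remaining)
        masks, scores = masks + fine_masks, scores + fine_scores
    return masks, scores
```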
Related papers
- Bilateral Sharpness-Aware Minimization for Flatter Minima [61.17349662062522]
Sharpness-Aware Minimization (SAM) enhances generalization by reducing Max-Sharpness (MaxS), the gap between the maximum loss over a neighborhood of the current weights and the training loss.
In this paper, we propose to also utilize the difference between the training loss and the minimum loss over that neighborhood, which we denote as Min-Sharpness (MinS).
By merging MaxS and MinS, we obtain a better flatness indicator (FI) that points toward a flatter direction during optimization. We combine this FI with SAM into the proposed Bilateral SAM (BSAM), which finds flatter minima than SAM (these quantities are written out in the sketches after this list).
arXiv Detail & Related papers (2024-09-20T03:01:13Z)
- Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z)
- PTQ4SAM: Post-Training Quantization for Segment Anything [28.893095276574893]
Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks.
However, as a large-scale model, the immense memory and computation costs hinder its practical deployment.
We propose PTQ4SAM, a post-training quantization framework for the Segment Anything Model.
arXiv Detail & Related papers (2024-05-06T03:39:50Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method enables extracting richer marine information, from global contextual cues down to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin on both single- and multi-object benchmarks (the flow-as-prompt configuration is sketched after this list).
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
- WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images.
To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters.
Our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task, respectively.
arXiv Detail & Related papers (2024-03-14T10:30:43Z)
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything [36.553867358541154]
Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision applications.
We propose EfficientSAMs, lightweight SAM models that exhibit decent performance with largely reduced complexity.
Our idea is based on leveraging masked image pretraining (SAMI), which learns to reconstruct features from the SAM image encoder for effective visual representation learning (see the sketch after this list).
arXiv Detail & Related papers (2023-12-01T18:31:00Z)
- Fast Segment Anything [46.130784421779865]
The recently proposed segment anything model (SAM) has had a significant influence on many computer vision tasks.
However, huge computation costs prevent it from wider application in industry scenarios.
We propose a faster alternative method for this fundamental task with comparable performance.
arXiv Detail & Related papers (2023-06-21T10:08:29Z)
- Towards Efficient and Scalable Sharpness-Aware Minimization [81.22779501753695]
We propose a novel algorithm, LookSAM, that only periodically computes the inner gradient ascent step.
LookSAM achieves accuracy gains similar to SAM's while being tremendously faster.
We are also the first to successfully scale up the batch size when training Vision Transformers (ViTs); a simplified LookSAM step is sketched after this list.
arXiv Detail & Related papers (2022-03-05T11:53:37Z)
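Illustrative sketches

For the Bilateral SAM entry above, the quantities can be written out as follows; the notation is reconstructed from the summary, so the paper's exact definitions may differ. With training loss $L$, weights $w$, and neighborhood radius $\rho$:

```latex
% Max-sharpness, min-sharpness, and the bilateral flatness indicator.
% Notation reconstructed from the summary; the paper's may differ.
\[
\mathrm{MaxS}(w) = \max_{\|\epsilon\|\le\rho} L(w+\epsilon) - L(w), \qquad
\mathrm{MinS}(w) = L(w) - \min_{\|\epsilon\|\le\rho} L(w+\epsilon)
\]
\[
\mathrm{FI}(w) = \mathrm{MaxS}(w) + \mathrm{MinS}(w)
             = \max_{\|\epsilon\|\le\rho} L(w+\epsilon)
             - \min_{\|\epsilon\|\le\rho} L(w+\epsilon)
\]
```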
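For the moving object segmentation entry, the second configuration (RGB input, flow-derived prompt) might be sketched as below; `sam_predict` and the single-peak heuristic are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

def segment_moving_object(sam_predict, rgb, flow):
    """Prompt SAM with the point of strongest motion.

    `flow` is an (H, W, 2) optical-flow field; `sam_predict(image, points,
    labels)` is assumed to return a boolean mask for the prompted object.
    """
    mag = np.linalg.norm(flow, axis=-1)              # per-pixel motion magnitude
    y, x = np.unravel_index(np.argmax(mag), mag.shape)
    return sam_predict(rgb, points=[(x, y)], labels=[1])  # 1 = positive point
```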
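For the EfficientSAM entry, SAMI-style pretraining can be approximated as masked feature reconstruction against a frozen SAM encoder. Both encoder interfaces below are assumptions; the sketch conveys the objective only.

```python
import torch
import torch.nn.functional as F

def sami_loss(light_encoder, sam_encoder, images, mask_ratio=0.75):
    """Masked-image-pretraining objective: the lightweight encoder sees a
    keep-mask over patches and must reconstruct the frozen SAM encoder's
    features at the masked positions. Interfaces are assumed.
    """
    with torch.no_grad():
        target = sam_encoder(images)          # (B, N, D) patch features
    B, N, _ = target.shape
    keep = torch.rand(B, N, device=images.device) > mask_ratio  # True = visible
    pred = light_encoder(images, keep)        # predicts all N patch features
    return F.mse_loss(pred[~keep], target[~keep])  # loss on masked patches only
```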
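For the LookSAM entry, a simplified step that recomputes the SAM perturbation only every k iterations (and reuses it otherwise) might look like this; the real algorithm additionally reuses a decomposed gradient component, which is omitted here.

```python
import torch

def looksam_step(model, loss_fn, batch, opt, state, k=5, rho=0.05):
    """One simplified LookSAM step. `state` starts as {"step": 0}, so the
    perturbation is computed on the first call and reused for k-1 steps."""
    x, y = batch
    if state["step"] % k == 0:
        # Full SAM inner step: ascent direction of the loss at current weights.
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        state["eps"] = [rho * g / (norm + 1e-12) for g in grads]

    with torch.no_grad():
        for p, e in zip(model.parameters(), state["eps"]):
            p.add_(e)                          # move to the perturbed point
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), state["eps"]):
            p.sub_(e)                          # restore original weights
    opt.step()
    state["step"] += 1
    return loss.item()
```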