TinySAM: Pushing the Envelope for Efficient Segment Anything Model
- URL: http://arxiv.org/abs/2312.13789v2
- Date: Sat, 9 Mar 2024 08:31:47 GMT
- Title: TinySAM: Pushing the Envelope for Efficient Segment Anything Model
- Authors: Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li,
Yunhe Wang, Xinghao Chen
- Abstract summary: We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining strong zero-shot performance.
We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategies to distill a lightweight student model.
We also adapt post-training quantization to the promptable segmentation task to further reduce the computational cost.
- Score: 76.21007576954035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the segment anything model (SAM) has shown powerful
segmentation capability and has drawn great attention in the computer vision
field. Numerous follow-up works have developed various applications based on
the pretrained SAM and achieved impressive performance on downstream vision
tasks. However, SAM has a heavy architecture and requires massive
computational capacity, which hinders its further application on
computation-constrained edge devices. To this end, in this paper we propose a
framework to obtain a tiny segment anything model (TinySAM) while maintaining
strong zero-shot performance. We first propose a full-stage knowledge
distillation method with hard prompt sampling and hard mask weighting
strategies to distill a lightweight student model. We also adapt post-training
quantization to the promptable segmentation task and further reduce the
computational cost. Moreover, a hierarchical segmenting-everything strategy is
proposed to accelerate everything-mode inference by $2\times$ with almost no
performance degradation. With all these proposed methods, our TinySAM achieves
an orders-of-magnitude computational reduction and pushes the envelope for the
efficient segment anything task. Extensive experiments on various zero-shot
transfer tasks demonstrate the significant performance advantage of our
TinySAM over counterpart methods. Pre-trained models and code are available
at https://github.com/xinghaochen/TinySAM and
https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
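As a rough illustration of the distillation component, the sketch below weights per-mask distillation losses by how poorly the student currently fits each mask, which is one plausible reading of "hard mask weighting". The `student`/`teacher` interfaces and the weighting heuristic are our own assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, image, prompts):
    """One knowledge-distillation step with a hard-mask-weighting heuristic.

    `student` and `teacher` are SAM-like callables returning mask logits of
    shape (B, H, W) for (image, prompts). Illustrative sketch only.
    """
    with torch.no_grad():
        t_logits = teacher(image, prompts)
    s_logits = student(image, prompts)

    # Per-mask MSE between student and teacher mask logits.
    per_mask = F.mse_loss(s_logits, t_logits, reduction="none").mean(dim=(1, 2))

    # Hard mask weighting: up-weight the masks the student fits worst.
    weights = (per_mask / (per_mask.mean() + 1e-8)).detach()
    return (weights * per_mask).mean()
```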
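The abstract only names post-training quantization; a generic min-max calibration scheme (not necessarily what TinySAM uses) looks like the following, with fake quantization so accuracy can be simulated in floating point:

```python
import torch

def calibrate_scale(activations, num_bits=8):
    """Symmetric min-max scale from a list of calibration tensors.
    Generic PTQ calibration, not TinySAM's exact scheme."""
    max_abs = max(t.abs().max().item() for t in activations)
    qmax = 2 ** (num_bits - 1) - 1
    return max(max_abs, 1e-8) / qmax

def fake_quantize(t, scale, num_bits=8):
    """Round to the integer grid and map back to float ("fake" quantization)."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(torch.round(t / scale), -qmax - 1, qmax)
    return q * scale
```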
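The hierarchical everything-mode strategy is likewise only named; a plausible reading is a coarse-to-fine point grid, where dense points are evaluated only outside regions already covered by confident masks. `segment_points` and the grid sizes below are hypothetical.

```python
import numpy as np

def hierarchical_everything(segment_points, image, coarse=8, fine=32, conf_thr=0.9):
    """Two-stage everything-mode inference: coarse point grid first, then
    fine-grid points only where no confident mask was found yet.

    `segment_points(image, points)` is assumed to return two lists:
    boolean masks of shape (H, W) and their confidence scores.
    """
    h, w = image.shape[:2]

    def grid(n):
        ys = np.linspace(0, h - 1, n).astype(int)
        xs = np.linspace(0, w - 1, n).astype(int)
        return [(x, y) for y in ys for x in xs]

    masks, scores = segment_points(image, grid(coarse))
    covered = np.zeros((h, w), dtype=bool)
    for m, s in zip(masks, scores):
        if s >= conf_thr:
            covered |= m

    # Skip fine-grid points already inside a confident coarse mask.
    remaining = [(x, y) for (x, y) in grid(fine) if not covered[y, x]]
    if remaining:
        fine_masks, fine_scores = segment_points(image, remaining)
        masks, scores = masks + fine_masks, scores + fine_scores
    return masks, scores
```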
Related papers
- Bilateral Sharpness-Aware Minimization for Flatter Minima [61.17349662062522]
Sharpness-Aware Minimization (SAM) enhances generalization by reducing Max-Sharpness (MaxS), the gap between the maximum loss over a neighborhood of the current weights and the training loss.
In this paper, we propose to also utilize the difference between the training loss and the minimum loss over that neighborhood, which we denote as Min-Sharpness (MinS).
By merging MaxS and MinS, we obtain a better flatness indicator (FI) that points toward a flatter direction during optimization. We combine this FI with SAM into the proposed Bilateral SAM (BSAM), which finds flatter minima than SAM (these quantities are written out in the sketches after this list).
arXiv Detail & Related papers (2024-09-20T03:01:13Z)
- Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z)
- PTQ4SAM: Post-Training Quantization for Segment Anything [28.893095276574893]
Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks.
However, as a large-scale model, the immense memory and computation costs hinder its practical deployment.
We propose PTQ4SAM, a post-training quantization framework for the Segment Anything Model.
arXiv Detail & Related papers (2024-05-06T03:39:50Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method enables extracting richer marine information, from global contextual cues down to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin on both single- and multi-object benchmarks (the flow-as-prompt configuration is sketched after this list).
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
- WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images.
To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters.
Our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task, respectively.
arXiv Detail & Related papers (2024-03-14T10:30:43Z)
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything [36.553867358541154]
Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision applications.
We propose EfficientSAMs, lightweight SAM models that exhibit decent performance with largely reduced complexity.
Our idea is based on leveraging masked image pretraining (SAMI), which learns to reconstruct features from the SAM image encoder for effective visual representation learning (see the sketch after this list).
arXiv Detail & Related papers (2023-12-01T18:31:00Z)
- Fast Segment Anything [46.130784421779865]
The recently proposed segment anything model (SAM) has had a significant influence on many computer vision tasks.
However, huge computation costs prevent it from wider application in industry scenarios.
We propose a faster alternative method for this fundamental task with comparable performance.
arXiv Detail & Related papers (2023-06-21T10:08:29Z)
- Towards Efficient and Scalable Sharpness-Aware Minimization [81.22779501753695]
We propose a novel algorithm, LookSAM, that only periodically computes the inner gradient ascent step.
LookSAM achieves accuracy gains similar to SAM's while being tremendously faster.
We are also the first to successfully scale up the batch size when training Vision Transformers (ViTs); a simplified LookSAM step is sketched after this list.
arXiv Detail & Related papers (2022-03-05T11:53:37Z)
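Illustrative sketches

For the Bilateral SAM entry above, the quantities can be written out as follows; the notation is reconstructed from the summary, so the paper's exact definitions may differ. With training loss $L$, weights $w$, and neighborhood radius $\rho$:

```latex
% Max-sharpness, min-sharpness, and the bilateral flatness indicator.
% Notation reconstructed from the summary; the paper's may differ.
\[
\mathrm{MaxS}(w) = \max_{\|\epsilon\|\le\rho} L(w+\epsilon) - L(w), \qquad
\mathrm{MinS}(w) = L(w) - \min_{\|\epsilon\|\le\rho} L(w+\epsilon)
\]
\[
\mathrm{FI}(w) = \mathrm{MaxS}(w) + \mathrm{MinS}(w)
             = \max_{\|\epsilon\|\le\rho} L(w+\epsilon)
             - \min_{\|\epsilon\|\le\rho} L(w+\epsilon)
\]
```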
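For the moving object segmentation entry, the second configuration (RGB input, flow-derived prompt) might be sketched as below; `sam_predict` and the single-peak heuristic are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

def segment_moving_object(sam_predict, rgb, flow):
    """Prompt SAM with the point of strongest motion.

    `flow` is an (H, W, 2) optical-flow field; `sam_predict(image, points,
    labels)` is assumed to return a boolean mask for the prompted object.
    """
    mag = np.linalg.norm(flow, axis=-1)              # per-pixel motion magnitude
    y, x = np.unravel_index(np.argmax(mag), mag.shape)
    return sam_predict(rgb, points=[(x, y)], labels=[1])  # 1 = positive point
```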
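For the EfficientSAM entry, SAMI-style pretraining can be approximated as masked feature reconstruction against a frozen SAM encoder. Both encoder interfaces below are assumptions; the sketch conveys the objective only.

```python
import torch
import torch.nn.functional as F

def sami_loss(light_encoder, sam_encoder, images, mask_ratio=0.75):
    """Masked-image-pretraining objective: the lightweight encoder sees a
    keep-mask over patches and must reconstruct the frozen SAM encoder's
    features at the masked positions. Interfaces are assumed.
    """
    with torch.no_grad():
        target = sam_encoder(images)          # (B, N, D) patch features
    B, N, _ = target.shape
    keep = torch.rand(B, N, device=images.device) > mask_ratio  # True = visible
    pred = light_encoder(images, keep)        # predicts all N patch features
    return F.mse_loss(pred[~keep], target[~keep])  # loss on masked patches only
```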
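For the LookSAM entry, a simplified step that recomputes the SAM perturbation only every k iterations (and reuses it otherwise) might look like this; the real algorithm additionally reuses a decomposed gradient component, which is omitted here.

```python
import torch

def looksam_step(model, loss_fn, batch, opt, state, k=5, rho=0.05):
    """One simplified LookSAM step. `state` starts as {"step": 0}, so the
    perturbation is computed on the first call and reused for k-1 steps."""
    x, y = batch
    if state["step"] % k == 0:
        # Full SAM inner step: ascent direction of the loss at current weights.
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        state["eps"] = [rho * g / (norm + 1e-12) for g in grads]

    with torch.no_grad():
        for p, e in zip(model.parameters(), state["eps"]):
            p.add_(e)                          # move to the perturbed point
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), state["eps"]):
            p.sub_(e)                          # restore original weights
    opt.step()
    state["step"] += 1
    return loss.item()
```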