Fast Segment Anything
- URL: http://arxiv.org/abs/2306.12156v1
- Date: Wed, 21 Jun 2023 10:08:29 GMT
- Title: Fast Segment Anything
- Authors: Xu Zhao, Wenchao Ding, Yongqi An, Yinglong Du, Tao Yu, Min Li, Ming
Tang, Jinqiao Wang
- Abstract summary: The recently proposed Segment Anything Model (SAM) has had a significant influence on many computer vision tasks.
Its huge computation cost prevents wider application in industry scenarios.
We propose a faster alternative method for this fundamental task with comparable performance.
- Score: 46.130784421779865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recently proposed Segment Anything Model (SAM) has had a
significant influence on many computer vision tasks. It is becoming a
foundational step for many high-level tasks, such as image segmentation, image
captioning, and image editing. However, its huge computation cost prevents
wider application in industry scenarios. The computation mainly comes from the
Transformer architecture operating on high-resolution inputs. In this paper, we
propose a faster alternative method for this fundamental task with comparable
performance. By reformulating the task as segments generation and prompting, we
find that a regular CNN detector with an instance segmentation branch can also
accomplish this task well. Specifically, we convert this task to the
well-studied instance segmentation task and directly train the existing
instance segmentation method using only 1/50 of the SA-1B dataset published by
the SAM authors. With our method, we achieve performance comparable to SAM at
50 times higher run-time speed. We provide sufficient experimental results to
demonstrate its effectiveness. The code and demos will be released at
https://github.com/CASIA-IVA-Lab/FastSAM.
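FastSAM's reformulation splits the problem into two stages: all-instance segmentation by a CNN detector, followed by prompt-guided selection of the mask(s) of interest. Below is a minimal sketch of the selection stage only, operating on precomputed masks; the function names and the smallest-containing-mask and best-IoU heuristics are illustrative assumptions, not the paper's exact rules.

```python
import numpy as np

def select_by_point(masks: np.ndarray, point_xy: tuple) -> np.ndarray:
    """Pick the smallest all-instance mask that contains the point prompt.

    masks: (N, H, W) boolean array from the all-instance segmentation stage.
    point_xy: (x, y) pixel coordinate of the point prompt.
    """
    x, y = point_xy
    hits = [i for i in range(masks.shape[0]) if masks[i, y, x]]
    if not hits:
        raise ValueError("no mask contains the prompt point")
    # Preferring the smallest containing mask is an illustrative heuristic.
    return masks[min(hits, key=lambda i: int(masks[i].sum()))]

def select_by_box(masks: np.ndarray, box_xyxy: tuple) -> np.ndarray:
    """Pick the mask with the highest IoU against a box prompt."""
    x0, y0, x1, y1 = box_xyxy
    box_mask = np.zeros(masks.shape[1:], dtype=bool)
    box_mask[y0:y1, x0:x1] = True
    ious = [(m & box_mask).sum() / max((m | box_mask).sum(), 1) for m in masks]
    return masks[int(np.argmax(ious))]
```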
Related papers
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow, harnessing the segmentation power of SAM together with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin on both single- and multi-object benchmarks (see the sketch after this list).
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
- Deep Instruction Tuning for Segment Anything Model [68.7934961590075]
Segment Anything Model (SAM) has become a research hotspot in the fields of multimedia and computer vision.
SAM can support different types of segmentation prompts, but it performs much worse on text-instructed tasks.
We propose two simple yet effective deep instruction tuning (DIT) methods for SAM, one end-to-end and the other layer-wise.
arXiv Detail & Related papers (2024-03-31T11:37:43Z)
- OMG-Seg: Is One Model Good Enough For All Segmentation? [83.17068644513144]
OMG-Seg is a transformer-based encoder-decoder architecture with task-specific queries and outputs.
We show that OMG-Seg can support over ten distinct segmentation tasks and yet significantly reduce computational and parameter overhead.
arXiv Detail & Related papers (2024-01-18T18:59:34Z)
- RAP-SAM: Towards Real-Time All-Purpose Segment Anything [120.17175256421622]
The Segment Anything Model (SAM) is a remarkable model that can achieve generalized segmentation.
Current real-time segmentation methods mainly target a single purpose, such as semantic segmentation of driving scenes.
This work explores a new real-time segmentation setting, named all-purpose segmentation in real-time, to transfer vision foundation models (VFMs) to real-time deployment.
arXiv Detail & Related papers (2024-01-18T18:59:30Z)
- TinySAM: Pushing the Envelope for Efficient Segment Anything Model [76.21007576954035]
We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining the strong zero-shot performance.
We first propose a full-stage knowledge distillation method with a hard prompt sampling and hard mask weighting strategy to distill a lightweight student model.
We also adapt post-training quantization to the promptable segmentation task to further reduce the computational cost.
arXiv Detail & Related papers (2023-12-21T12:26:11Z)
- RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model [29.42043345787285]
We propose a method to learn the generation of appropriate prompts for the Segment Anything Model (SAM).
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
arXiv Detail & Related papers (2023-06-28T14:51:34Z)
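As a rough illustration of the flow-as-prompt variant from the moving-object paper above, the sketch below derives a box prompt from dense optical flow and feeds it to the official segment_anything predictor. The Farneback flow, the motion-magnitude threshold, and the single-box reduction are simplifying assumptions for illustration, not the authors' pipeline.

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def segment_moving_object(prev_bgr, curr_bgr, predictor, mag_thresh=2.0):
    """Segment the dominant moving region using a flow-derived box prompt."""
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)
    # Dense optical flow between consecutive frames (Farneback).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    ys, xs = np.nonzero(magnitude > mag_thresh)  # pixels judged to be moving
    if xs.size == 0:
        return None  # no motion detected
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])
    predictor.set_image(cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2RGB))
    masks, _, _ = predictor.predict(box=box, multimask_output=False)
    return masks[0]

# Typical setup (checkpoint path is an assumption):
# sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
# predictor = SamPredictor(sam)
# mask = segment_moving_object(frame_t, frame_t_plus_1, predictor)
```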
This list is automatically generated from the titles and abstracts of the papers on this site.