Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
- URL: http://arxiv.org/abs/2407.11464v2
- Date: Fri, 19 Jul 2024 01:00:41 GMT
- Title: Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
- Authors: Zhi Cai, Yingjie Gao, Yaoyan Zheng, Nan Zhou, Di Huang
- Abstract summary: Crowd-SAM is a framework designed to enhance SAM's performance in crowded and occluded scenes.
We introduce an efficient prompt sampler (EPS) and a part-whole discrimination network (PWD-Net) to enhance mask selection and accuracy in crowded scenes.
Crowd-SAM rivals state-of-the-art (SOTA) fully-supervised object detection methods on several benchmarks including CrowdHuman and CityPersons.
- Score: 18.244508068200236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In computer vision, object detection is an important task that finds application in many scenarios. However, obtaining extensive labels can be challenging, especially in crowded scenes. Recently, the Segment Anything Model (SAM) has been proposed as a powerful zero-shot segmenter, offering a novel approach to instance segmentation tasks. However, the accuracy and efficiency of SAM and its variants are often compromised when handling objects in crowded and occluded scenes. In this paper, we introduce Crowd-SAM, a SAM-based framework designed to enhance SAM's performance in crowded and occluded scenes at the cost of a few learnable parameters and minimal labeled images. We introduce an efficient prompt sampler (EPS) and a part-whole discrimination network (PWD-Net), enhancing mask selection and accuracy in crowded scenes. Despite its simplicity, Crowd-SAM rivals state-of-the-art (SOTA) fully-supervised object detection methods on several benchmarks including CrowdHuman and CityPersons. Our code is available at https://github.com/FelixCaae/CrowdSAM.
Related papers
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z) - Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
Segment Anything Model (SAM) has been proposed as a visual fundamental model, which gives strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z) - MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method extracts richer marine information, ranging from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z) - SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising [37.216493829454706]
We explore the potential of applying the Segment Anything Model to track and segment objects in videos.
Specifically, we iteratively propagate the bounding box of each object's mask in the preceding frame as the prompt for the next frame.
To enhance SAM's denoising capability against position and size variations, we propose a multi-prompt strategy.
arXiv Detail & Related papers (2024-03-07T03:52:59Z) - PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation [19.65118388712439]
We introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM).
By exclusively training the prompt adapter, PA-SAM extracts detailed information from images and optimizes the mask decoder feature at both sparse and dense prompt levels.
Experimental results demonstrate that our PA-SAM outperforms other SAM-based methods in high-quality, zero-shot, and open-set segmentation.
arXiv Detail & Related papers (2024-01-23T19:20:22Z) - TinySAM: Pushing the Envelope for Efficient Segment Anything Model [76.21007576954035]
We propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining the strong zero-shot performance.
We first propose a full-stage knowledge distillation method with hard prompt sampling and hard mask weighting strategy to distill a lightweight student model.
We also adapt the post-training quantization to the promptable segmentation task and further reduce the computational cost.
arXiv Detail & Related papers (2023-12-21T12:26:11Z) - Stable Segment Anything Model [79.9005670886038]
The Segment Anything Model (SAM) achieves remarkable promptable segmentation given high-quality prompts.
This paper presents the first comprehensive analysis on SAM's segmentation stability across a diverse spectrum of prompt qualities.
Our solution, termed Stable-SAM, offers several advantages: 1) improving SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality.
arXiv Detail & Related papers (2023-11-27T12:51:42Z) - On the Robustness of Segment Anything [46.669794757467166]
We aim to study the testing-time robustness of SAM under adversarial scenarios and common corruptions.
We find that SAM exhibits remarkable robustness against various corruptions, except for blur-related corruption.
arXiv Detail & Related papers (2023-05-25T16:28:30Z) - Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free Personalization approach for Segment Anything Model (SAM), termed PerSAM.
Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior.
PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z) - Can SAM Count Anything? An Empirical Study on SAM Counting [35.42720382193184]
We explore the use of the Segment Anything model (SAM) for the challenging task of few-shot object counting.
We find that SAM's performance is unsatisfactory without further fine-tuning, particularly for small and crowded objects.
arXiv Detail & Related papers (2023-04-21T08:59:48Z) - SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More [13.047310918166762]
We propose SAM-Adapter, which incorporates domain-specific information or visual prompts into the segmentation network by using simple yet effective adapters.
We can even outperform task-specific network models and achieve state-of-the-art performance in the task we tested: camouflaged object detection.
arXiv Detail & Related papers (2023-04-18T17:38:54Z)
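The SAM-PD entry above describes propagating the bounding box of each object's mask in the preceding frame as the prompt for the next frame. The following is a minimal sketch of that loop only, not the authors' implementation: the names `mask_to_box`, `propagate`, and the `segment` callable are hypothetical, masks are plain nested lists of 0/1, and the multi-prompt denoising strategy mentioned in that entry is not modeled.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Mask = List[List[int]]            # binary mask: rows of 0/1 values
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive pixel indices


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tight bounding box of the nonzero pixels, or None for an empty mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs), min(ys), max(xs), max(ys))


def propagate(
    frames: Sequence[Any],
    first_box: Box,
    segment: Callable[[Any, Box], Mask],
) -> List[Mask]:
    """Segment each frame, prompting with the box of the previous frame's mask.

    `segment` is a hypothetical stand-in for a promptable segmenter such as SAM:
    it takes a frame and a box prompt and returns a binary mask.
    """
    masks: List[Mask] = []
    box = first_box
    for frame in frames:
        mask = segment(frame, box)
        masks.append(mask)
        new_box = mask_to_box(mask)
        if new_box is not None:
            # Assumption of this sketch: keep the old box if the mask came back empty.
            box = new_box
    return masks
```

In practice `segment` would wrap a real model call; here it is left abstract so that the propagation logic can be read and tested on its own.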