COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection
- URL: http://arxiv.org/abs/2411.18858v1
- Date: Thu, 28 Nov 2024 01:58:28 GMT
- Title: COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection
- Authors: Xiaoqin Zhang, Zhenni Yu, Li Zhao, Deng-Ping Fan, Guobao Xiao
- Abstract summary: We propose a novel multiprompt network called COMPrompter for camouflaged object detection (COD). Our network aims to extend the single prompt strategy in SAM to a multiprompt strategy. We employ the discrete wavelet transform to extract high-frequency features from image embeddings.
- Score: 42.23374375190698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We rethink the segment anything model (SAM) and propose a novel multiprompt network called COMPrompter for camouflaged object detection (COD). SAM's zero-shot generalization ability, which goes beyond that of other models, makes it an ideal framework for COD. Our network aims to extend the single prompt strategy in SAM to a multiprompt strategy. To achieve this, we propose an edge gradient extraction module, which generates a mask containing gradient information along the boundaries of camouflaged objects. This gradient mask is then used as a novel boundary prompt, enhancing the segmentation process. Thereafter, we design a box-boundary mutual guidance module, which fosters more precise and comprehensive feature extraction via mutual guidance between the boundary prompt and the box prompt. This collaboration strengthens the model's ability to accurately detect camouflaged objects. Moreover, we employ the discrete wavelet transform to extract high-frequency features from image embeddings; these high-frequency features serve as a supplementary component of the multiprompt system. Finally, COMPrompter guides the network to enhanced segmentation results, advancing the development of SAM for COD. Experimental results across COD benchmarks demonstrate that COMPrompter achieves cutting-edge performance, surpassing the current leading model by an average of 2.2% across positive metrics on COD10K. In polyp segmentation, a specific application of COD, experimental results likewise show that our model is superior to top-tier methods. The code will be made available at https://github.com/guobaoxiao/COMPrompter.
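Two of the operations the abstract names are concrete enough to sketch: deriving a gradient mask from object boundaries, and extracting high-frequency features with a discrete wavelet transform. Below is a minimal PyTorch illustration of both ideas, assuming a Sobel operator for the gradient mask and a single-level Haar DWT; the function names, tensor shapes, and filter choices are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch (not the authors' code) of the two prompt-side signals
# described in the abstract. Shapes and filter choices are assumptions.
import torch
import torch.nn.functional as F

def sobel_gradient_mask(mask: torch.Tensor) -> torch.Tensor:
    """Gradient magnitude of a (B, 1, H, W) mask; responses concentrate
    on object boundaries and could serve as a boundary prompt."""
    gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    kernels = torch.stack([gx, gx.t()]).unsqueeze(1).to(mask)  # (2, 1, 3, 3)
    grad = F.conv2d(mask, kernels, padding=1)                  # (B, 2, H, W)
    return grad.norm(dim=1, keepdim=True)                      # (B, 1, H, W)

def haar_dwt_highfreq(x: torch.Tensor) -> torch.Tensor:
    """Single-level 2D Haar DWT of a (B, C, H, W) feature map; returns the
    concatenated LH/HL/HH (high-frequency) sub-bands, discarding LL."""
    _, C, _, _ = x.shape
    lo = torch.tensor([1.0, 1.0]) / 2.0 ** 0.5   # Haar low-pass filter
    hi = torch.tensor([1.0, -1.0]) / 2.0 ** 0.5  # Haar high-pass filter
    bands = torch.stack([torch.outer(lo, hi),    # LH: horizontal detail
                         torch.outer(hi, lo),    # HL: vertical detail
                         torch.outer(hi, hi)])   # HH: diagonal detail
    weight = bands.unsqueeze(1).repeat(C, 1, 1, 1).to(x)  # (3C, 1, 2, 2)
    # Depthwise convolution with stride 2 implements the decimated DWT.
    return F.conv2d(x, weight, stride=2, groups=C)         # (B, 3C, H/2, W/2)

# Example with tensors shaped like SAM's 256-channel image embedding and a
# one-channel coarse mask (both shapes are assumptions for illustration).
emb = torch.randn(1, 256, 64, 64)
coarse = torch.rand(1, 1, 1024, 1024)
boundary_prompt = sobel_gradient_mask(coarse)  # (1, 1, 1024, 1024)
high_freq = haar_dwt_highfreq(emb)             # (1, 768, 32, 32)
```

In COMPrompter these signals would feed the prompt encoder alongside the box prompt; the sketch only shows how the raw boundary and high-frequency signals could be computed.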
Related papers
- DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency [91.30252180093333]
We propose the Dual Consistency SAM (DC-SAM) method based on prompt tuning to adapt SAM and SAM2 for in-context segmentation.
Our key insight is to enhance the features of SAM's prompt encoder in segmentation by providing high-quality visual prompts.
Although the proposed DC-SAM is primarily designed for images, it can be seamlessly extended to the video domain with the support of SAM2.
arXiv Detail & Related papers (2025-04-16T13:41:59Z) - CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection [14.219232629274186]
The application of SAM2 for automated segmentation in real-world scenarios faces challenges in camouflage perception and reliable prompt generation.
We propose CamoSAM2, a motion-appearance prompt inducer (MAPI) and refinement framework to automatically generate and refine prompts for SAM2.
Our proposed model, CamoSAM2, significantly outperforms existing state-of-the-art methods, achieving increases of 8.0% and 10.1% in the mIoU metric.
arXiv Detail & Related papers (2025-04-01T02:45:17Z) - SSFam: Scribble Supervised Salient Object Detection Family [13.369217449092524]
Scribble supervised salient object detection (SSSOD) learns to segment salient objects from their surroundings under the supervision of sparse scribble labels.
For better segmentation, depth and thermal infrared modalities serve as supplements to RGB images in complex scenes.
Our model demonstrates remarkable performance across combinations of different modalities and sets a new state of the art among scribble supervised methods.
arXiv Detail & Related papers (2024-09-07T13:07:59Z) - Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection [22.027032083786242]
DSAM exploits the zero-shot capability of SAM to realize precise segmentation in the RGB-D domain.
The Finer Module explores the possibility of accurately segmenting highly camouflaged targets from a depth perspective.
arXiv Detail & Related papers (2024-07-17T06:31:29Z) - Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM [62.85895749882285]
Marine Animal Segmentation (MAS) involves segmenting animals within marine environments.
We propose a novel feature learning framework, named Dual-SAM, for high-performance MAS.
Our proposed method achieves state-of-the-art performance on five widely-used MAS datasets.
arXiv Detail & Related papers (2024-04-07T15:34:40Z) - Unveiling Camouflage: A Learnable Fourier-based Augmentation for Camouflaged Object Detection and Instance Segmentation [27.41886911999097]
We propose a learnable augmentation method for camouflaged object detection (COD) and camouflaged instance segmentation (CIS).
Our proposed augmentation method boosts the performance of camouflaged object detectors and camouflaged instance segmenters by large margins.
arXiv Detail & Related papers (2023-08-29T22:43:46Z) - RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z) - Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching [63.88319217738223]
We present Matcher, a novel perception paradigm that utilizes off-the-shelf vision foundation models to address various perception tasks.
Matcher demonstrates impressive generalization performance across various segmentation tasks, all without training.
Our results further showcase the open-world generality and flexibility of Matcher when applied to images in the wild.
arXiv Detail & Related papers (2023-05-22T17:59:43Z) - Feature Aggregation and Propagation Network for Camouflaged Object Detection [42.33180748293329]
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to intrinsic similarities between foreground objects and background surroundings.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
arXiv Detail & Related papers (2022-12-02T05:54:28Z) - BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning [88.82371069668147]
BatchFormerV2 is a more general batch Transformer module, which enables exploring sample relationships for dense representation learning.
BatchFormerV2 consistently improves current DETR-based detection methods by over 1.3%.
arXiv Detail & Related papers (2022-04-04T05:53:42Z) - Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)