Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection
- URL: http://arxiv.org/abs/2407.12339v1
- Date: Wed, 17 Jul 2024 06:31:29 GMT
- Title: Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection
- Authors: Zhenni Yu, Xiaoqin Zhang, Li Zhao, Yi Bin, Guobao Xiao
- Abstract summary: DSAM exploits the zero-shot capability of SAM to realize precise segmentation in the RGB-D domain.
The Finer Module explores the possibility of accurately segmenting highly camouflaged targets from a depth perspective.
- Score: 22.027032083786242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a new Segment Anything Model with Depth Perception (DSAM) for Camouflaged Object Detection (COD). DSAM exploits the zero-shot capability of SAM to realize precise segmentation in the RGB-D domain. It consists of the Prompt-Deeper Module and the Finer Module. The Prompt-Deeper Module utilizes knowledge distillation and the Bias Correction Module to achieve interaction between RGB features and depth features, in particular using depth features to correct erroneous parts of the RGB features. The interacted features are then combined with the box prompt in SAM to create a prompt with depth perception. The Finer Module explores the possibility of accurately segmenting highly camouflaged targets from a depth perspective. It uncovers depth cues in areas missed by SAM through mask reversion, self-filtering, and self-attention operations, compensating for SAM's shortcomings in the COD domain. DSAM represents the first step toward a SAM-based RGB-D COD model. It maximizes the utilization of depth features while synergizing with RGB features to achieve multimodal complementarity, thereby overcoming the segmentation limitations of SAM and improving its accuracy in COD. Experimental results demonstrate that DSAM achieves excellent segmentation performance and reaches the state of the art (SOTA) on COD benchmarks while consuming fewer training resources. The code will be available at https://github.com/guobaoxiao/DSAM.
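To make the two-module design above more concrete, the following PyTorch sketch shows one way a depth-based bias correction and a mask-reversion refinement step could be wired together. This is only an illustration of the operations named in the abstract, not the authors' implementation (see the linked repository for that); all module names, shapes, and design choices here are assumptions.

```python
import torch
import torch.nn as nn

class BiasCorrection(nn.Module):
    """Hypothetical bias correction: use depth features to gate
    (and thereby correct) unreliable RGB features."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(dim * 2, dim, 1), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        g = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))
        # Where the gate is low, depth evidence overrides the RGB feature.
        return g * rgb_feat + (1 - g) * depth_feat

class FinerRefinement(nn.Module):
    """Hypothetical refinement step: revert the coarse mask to focus on
    regions the coarse prediction missed, then let self-attention mine
    depth cues within those regions."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, coarse_mask, depth_feat):
        b, c, h, w = depth_feat.shape
        reverted = 1.0 - torch.sigmoid(coarse_mask)     # mask reversion
        focused = depth_feat * reverted                 # self-filtering
        tokens = focused.flatten(2).transpose(1, 2)     # (B, HW, C)
        refined, _ = self.attn(tokens, tokens, tokens)  # self-attention
        return refined.transpose(1, 2).view(b, c, h, w)
```

In the paper, the corrected features are further combined with SAM's box prompt to form a prompt with depth perception, and the refinement output compensates for regions SAM misses; SAM itself is omitted from this sketch.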
Related papers
- RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory [34.406308400305385]
RGB-Depth (RGB-D) Video Object Segmentation (VOS) aims to integrate the fine-grained texture information of RGB with the geometric cues of the depth modality.
In this paper, we propose a novel RGB-D VOS method built on a multi-store feature memory for robust segmentation.
We show that the proposed method achieves state-of-the-art performance on the latest RGB-D VOS benchmark.
arXiv Detail & Related papers (2025-04-23T07:31:37Z)
- COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection [42.23374375190698]
We propose a novel multiprompt network called COMPrompter for camouflaged object detection (COD).
Our network aims to enhance the single prompt strategy in SAM to a multiprompt strategy.
We employ the discrete wavelet transform to extract high-frequency features from image embeddings; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2024-11-28T01:58:28Z)
- SSFam: Scribble Supervised Salient Object Detection Family [13.369217449092524]
Scribble supervised salient object detection (SSSOD) learns to segment attractive objects from their surroundings under the supervision of sparse scribble labels.
For better segmentation, the depth and thermal infrared modalities serve as supplements to RGB images in complex scenes.
Our model demonstrates remarkable performance across combinations of different modalities and sets a new state of the art among scribble supervised methods.
arXiv Detail & Related papers (2024-09-07T13:07:59Z)
- Segment Anything with Multiple Modalities [61.74214237816402]
We develop MM-SAM, which supports cross-modal and multi-modal processing for robust and enhanced segmentation with different sensor suites.
MM-SAM features two key designs, namely, unsupervised cross-modal transfer and weakly-supervised multi-modal fusion.
It addresses three main challenges: 1) adaptation toward diverse non-RGB sensors for single-modal processing, 2) synergistic processing of multi-modal data via sensor fusion, and 3) mask-free training for different downstream tasks.
arXiv Detail & Related papers (2024-08-17T03:45:40Z)
- Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection [58.241593208031816]
The Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities.
We propose a Multi-scale and Detail-enhanced SAM (MDSAM) for Salient Object Detection (SOD).
Experimental results demonstrate the superior performance of our model on multiple SOD datasets.
arXiv Detail & Related papers (2024-08-08T09:09:37Z)
- IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection [55.554484379021524]
The Infrared Small Target Detection (IRSTD) task falls short of satisfactory performance due to a notable domain gap between natural and infrared images.
We propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representations of infrared small objects.
arXiv Detail & Related papers (2024-07-10T10:17:57Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method extracts richer marine information, ranging from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM [62.85895749882285]
Marine Animal Segmentation (MAS) involves segmenting animals within marine environments.
We propose a novel feature learning framework, named Dual-SAM for high-performance MAS.
Our proposed method achieves state-of-the-art performance on five widely-used MAS datasets.
arXiv Detail & Related papers (2024-04-07T15:34:40Z)
- Can SAM Segment Anything? When SAM Meets Camouflaged Object Detection [8.476593072868056]
SAM is a segmentation model recently released by Meta AI Research.
We ask whether SAM can address the camouflaged object detection (COD) task and evaluate its performance on the COD benchmark.
We also compare SAM's performance with 22 state-of-the-art COD methods.
arXiv Detail & Related papers (2023-04-10T17:05:58Z)
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection [67.33924278729903]
In this work, we propose a Dual Swin-Transformer based Mutual Interactive Network.
We adopt the Swin-Transformer as the feature extractor for both the RGB and depth modalities to model long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet.
arXiv Detail & Related papers (2022-06-07T08:35:41Z)
- Depth-Guided Camouflaged Object Detection [31.99397550848777]
Research in biology suggests that depth can provide useful object localization cues for camouflaged object discovery.
However, depth information has not yet been exploited for camouflaged object detection.
We present a depth-guided camouflaged object detection network with pre-computed depth maps from existing monocular depth estimation methods; a minimal sketch of such a pre-computation step appears after this list.
arXiv Detail & Related papers (2021-06-24T17:51:31Z)
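The Depth-Guided Camouflaged Object Detection entry above builds on depth maps pre-computed by existing monocular depth estimators. Here is a minimal sketch of that pre-computation step, assuming MiDaS as the off-the-shelf estimator (the paper's actual choice of model and pre-processing may differ):

```python
import cv2
import torch

# Off-the-shelf monocular depth estimator loaded via torch.hub (MiDaS).
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

def precompute_depth(image_path: str) -> torch.Tensor:
    """Return a pseudo depth map with the same H x W as the input image."""
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        pred = midas(transform(img))  # (1, h', w') relative depth
        pred = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=img.shape[:2],
            mode="bicubic", align_corners=False).squeeze(1)
    return pred

# The resulting map can be stacked with RGB as a 4-channel input, or fed
# to a separate depth branch of a camouflaged object detection network.
```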
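Similarly, the COMPrompter entry mentions extracting high-frequency features from image embeddings with the discrete wavelet transform. A rough sketch of that idea using PyWavelets and a Haar wavelet (the paper's actual wavelet choice and fusion scheme are not specified here) keeps only the detail sub-bands of a single-level 2-D DWT:

```python
import numpy as np
import pywt

def high_freq_features(embedding: np.ndarray) -> np.ndarray:
    """Keep the high-frequency (detail) sub-bands of a single-level
    2-D DWT applied channel-wise to a (C, H, W) feature map."""
    _, (lh, hl, hh) = pywt.dwt2(embedding, "haar", axes=(-2, -1))
    # Discard the low-frequency approximation and stack the three
    # detail sub-bands along the channel axis: (3C, H/2, W/2).
    return np.concatenate([lh, hl, hh], axis=0)

# Example: a 256-channel, 64x64 embedding yields a (768, 32, 32) map.
feats = high_freq_features(np.random.rand(256, 64, 64).astype(np.float32))
```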
This list is automatically generated from the titles and abstracts of the papers in this site.