Related papers: When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation

When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation

URL: http://arxiv.org/abs/2409.18653v1
Date: Fri, 27 Sep 2024 11:35:50 GMT
Title: When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation
Authors: Yuli Zhou, Guolei Sun, Yawei Li, Luca Benini, Ender Konukoglu,
Abstract summary: This study investigates the application and performance of the Segment Anything Model 2 (SAM2) in the challenging task of video camouflaged object segmentation (VCOS) VCOS involves detecting objects that blend seamlessly in the surroundings for videos, due to similar colors and textures, poor light conditions, etc.
Score: 36.174458990817165
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This study investigates the application and performance of the Segment Anything Model 2 (SAM2) in the challenging task of video camouflaged object segmentation (VCOS). VCOS involves detecting objects that blend seamlessly in the surroundings for videos, due to similar colors and textures, poor light conditions, etc. Compared to the objects in normal scenes, camouflaged objects are much more difficult to detect. SAM2, a video foundation model, has shown potential in various tasks. But its effectiveness in dynamic camouflaged scenarios remains under-explored. This study presents a comprehensive study on SAM2's ability in VCOS. First, we assess SAM2's performance on camouflaged video datasets using different models and prompts (click, box, and mask). Second, we explore the integration of SAM2 with existing multimodal large language models (MLLMs) and VCOS methods. Third, we specifically adapt SAM2 by fine-tuning it on the video camouflaged dataset. Our comprehensive experiments demonstrate that SAM2 has excellent zero-shot ability of detecting camouflaged objects in videos. We also show that this ability could be further improved by specifically adjusting SAM2's parameters for VCOS. The code will be available at https://github.com/zhoustan/SAM2-VCOS

Related papers

CamSAM2: Segment Anything Accurately in Camouflaged Videos [37.0152845263844]
We propose Camouflaged SAM2 (CamSAM2) to handle camouflaged scenes without modifying SAM2's parameters. To make full use of fine-grained and high-resolution features from the current frame and previous frames, we propose implicit object-aware fusion (IOF) and explicit object-aware fusion (EOF) modules. While CamSAM2 only adds negligible learnable parameters to SAM2, it substantially outperforms SAM2 on three VCOS datasets.
arXiv Detail & Related papers (2025-03-25T14:58:52Z)
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos [110.3379755761583]
Sa2VA is a unified model for grounded understanding of both images and videos. It supports a wide range of image and video tasks, including referring segmentation and conversation. We show that Sa2VA achieves state-of-the-art across multiple tasks, particularly in referring video object segmentation.
arXiv Detail & Related papers (2025-01-07T18:58:54Z)
When SAM2 Meets Video Shadow and Mirror Detection [3.3993877661368757]
We evaluate the effectiveness of the Segment Anything Model 2 (SAM2) on three distinct video segmentation tasks. Specifically, we use ground truth point or mask prompts to initialize the first frame and then predict corresponding masks for subsequent frames. Experimental results show that SAM2's performance on these tasks is suboptimal, especially when point prompts are used.
arXiv Detail & Related papers (2024-12-26T17:35:20Z)
Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2 [41.627959017482155]
We propose the first large-scale underwater camouflaged object tracking dataset, namely UW-COT. This paper presents an experimental evaluation of several advanced visual object tracking methods and the latest advancements in image and video segmentation.
arXiv Detail & Related papers (2024-09-25T13:10:03Z)
Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track [28.52754012142431]
Segment Anything Model 2 (SAM 2) is a foundation model towards solving promptable visual segmentation in images and videos. SAM 2 builds a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date. Without fine-tuning on the training set, SAM 2 achieved 75.79 J&F on the test set and ranked 4th place for 6th LSVOS Challenge VOS Track.
arXiv Detail & Related papers (2024-08-19T16:13:14Z)
SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation [51.90445260276897]
We prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation.
arXiv Detail & Related papers (2024-08-16T17:55:38Z)
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More [16.40994541980171]
This paper introduces SAM2-Adapter, the first adapter designed to overcome the persistent limitations observed in SAM2. It builds on the SAM-Adapter's strengths, offering enhanced generalizability and composability for diverse applications. We show the potential and encourage the research community to leverage the SAM2 model with our SAM2-Adapter.
arXiv Detail & Related papers (2024-08-08T16:40:15Z)
Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2 [10.751277821864916]
Report reveals a decline in SAM2's ability to perceive different objects in images without prompts in its auto mode. Specifically, we employ the challenging task of camouflaged object detection to assess this performance decrease.
arXiv Detail & Related papers (2024-07-31T13:32:10Z)
MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation. Our method enables to extract richer marine information from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects. In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt. These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin in both single and multi-object benchmarks.
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM [62.85895749882285]
Marine Animal (MAS) involves segmenting animals within marine environments. We propose a novel feature learning framework, named Dual-SAM for high-performance MAS. Our proposed method achieves state-of-the-art performances on five widely-used MAS datasets.
arXiv Detail & Related papers (2024-04-07T15:34:40Z)
Personalize Segment Anything Model with One Shot [52.54453744941516]
We propose a training-free Personalization approach for Segment Anything Model (SAM) Given only a single image with a reference mask, PerSAM first localizes the target concept by a location prior. PerSAM segments it within other images or videos via three techniques: target-guided attention, target-semantic prompting, and cascaded post-refinement.
arXiv Detail & Related papers (2023-05-04T17:59:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.