Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2
- URL: http://arxiv.org/abs/2409.16902v1
- Date: Wed, 25 Sep 2024 13:10:03 GMT
- Authors: Chunhui Zhang, Li Liu, Guanjie Huang, Hao Wen, Xi Zhou, Yanfeng Wang
- Abstract summary: We propose the first large-scale underwater camouflaged object tracking dataset, namely UW-COT.
This paper presents an experimental evaluation of several advanced visual object tracking methods and the latest advancements in image and video segmentation.
- Score: 41.627959017482155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past decade, significant progress has been made in visual object tracking, largely due to the availability of large-scale training datasets. However, existing tracking datasets are primarily focused on open-air scenarios, which greatly limits the development of object tracking in underwater environments. To address this issue, we take a step forward by proposing the first large-scale underwater camouflaged object tracking dataset, namely UW-COT. Based on the proposed dataset, this paper presents an experimental evaluation of several advanced visual object tracking methods and the latest advancements in image and video segmentation. Specifically, we compare the performance of the Segment Anything Model (SAM) and its updated version, SAM 2, in challenging underwater environments. Our findings highlight the improvements in SAM 2 over SAM, demonstrating its enhanced capability to handle the complexities of underwater camouflaged objects. Compared to current advanced visual object tracking methods, the latest video segmentation foundation model SAM 2 also exhibits significant advantages, providing valuable insights into the development of more effective tracking technologies for underwater scenarios. The dataset will be accessible at https://github.com/983632847/Awesome-Multimodal-Object-Tracking.
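As a concrete illustration of the evaluation described in the abstract, the following is a minimal sketch of running SAM 2 as a box-initialized tracker-by-segmentation on one sequence. It assumes the video-predictor API of the facebookresearch/sam2 reference implementation; the config and checkpoint paths, the frame directory, and the initial box are placeholders rather than values from the paper.

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Placeholder config/checkpoint paths; see the sam2 repository for real ones.
predictor = build_sam2_video_predictor(
    "configs/sam2.1/sam2.1_hiera_l.yaml",
    "checkpoints/sam2.1_hiera_large.pt",
)

def mask_to_box(mask):
    """Reduce a binary mask to an (x1, y1, x2, y2) box for tracking metrics."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # target lost in this frame
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]

with torch.inference_mode():
    # init_state expects a directory of JPEG frames (placeholder path).
    state = predictor.init_state(video_path="UW-COT/seq_001/frames")
    # Prompt frame 0 with the target's ground-truth box (placeholder values).
    predictor.add_new_points_or_box(
        state, frame_idx=0, obj_id=1,
        box=np.array([120, 80, 260, 190], dtype=np.float32),
    )
    boxes = {}
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        mask = (mask_logits[0, 0] > 0.0).cpu().numpy()
        boxes[frame_idx] = mask_to_box(mask)  # compare against annotations
```

Per-frame boxes recovered from the masks can then be scored against sequence annotations with standard tracking metrics such as IoU-based success rates.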
Related papers
- ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models [10.858627659431928]
Service robots must effectively recognize and segment unknown objects to enhance their functionality.
Traditional supervised learning-based segmentation techniques require extensive annotated datasets.
This paper proposes a novel approach (ZISVFM) for solving unseen object instance segmentation (UOIS) by leveraging the powerful zero-shot capability of the Segment Anything Model (SAM) and explicit visual representations from a self-supervised vision transformer (ViT).
arXiv Detail & Related papers (2025-02-05T15:22:20Z)
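ZISVFM's first stage, as described above, relies on SAM's zero-shot, class-agnostic mask proposals. Below is a minimal sketch of that step using the public segment_anything API; the checkpoint and image paths are placeholders, and ZISVFM's self-supervised ViT-based filtering is paper-specific, so it is only indicated by a comment and a crude size filter.

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Placeholder checkpoint/image paths; weights come from the segment-anything repo.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("indoor_scene.png"), cv2.COLOR_BGR2RGB)
proposals = mask_generator.generate(image)  # list of dicts: 'segmentation', 'area', ...

# ZISVFM would now score these class-agnostic proposals with self-supervised
# ViT features to keep object-like masks; that filtering is paper-specific
# and omitted here.
proposals = [p for p in proposals if p["area"] > 500]  # crude size filter only
```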
- When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation [36.174458990817165]
This study investigates the application and performance of the Segment Anything Model 2 (SAM2) in the challenging task of video camouflaged object segmentation (VCOS).
VCOS involves detecting objects in videos that blend seamlessly into their surroundings due to similar colors and textures, poor lighting conditions, and other factors.
arXiv Detail & Related papers (2024-09-27T11:35:50Z)
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen object instance segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
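UOIS-SAM, as summarized above, feeds class-agnostic point prompts into SAM's mask decoder. The sketch below shows that prompt-then-segment pattern with the public segment_anything API; the HPG and HDNet components are specific to the paper, so a hand-chosen foreground point stands in for HPG, and the checkpoint, image path, and coordinates are placeholders.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # placeholder
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("tabletop.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Stand-in for HPG: the paper predicts a foreground heatmap and picks its
# peaks; here a single hand-chosen foreground point plays that role.
points = np.array([[340, 220]], dtype=np.float32)
labels = np.ones(len(points), dtype=np.int32)  # 1 = foreground

masks, scores, _ = predictor.predict(
    point_coords=points, point_labels=labels, multimask_output=True
)
best_mask = masks[np.argmax(scores)]  # keep the highest-scoring hypothesis
```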
- Camouflaged Object Tracking: A Benchmark [16.07670491479613]
We introduce the Camouflaged Object Tracking dataset (COTD), a benchmark for evaluating camouflaged object tracking methods.
COTD comprises 200 sequences and approximately 80,000 frames, each annotated with detailed bounding boxes.
Our evaluation of 20 existing tracking algorithms reveals significant deficiencies in their performance with camouflaged objects.
We propose a novel tracking framework, HiPTrack-MLS, which demonstrates promising results in improving tracking performance for camouflaged objects.
arXiv Detail & Related papers (2024-08-25T15:56:33Z)
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin in both single and multi-object benchmarks.
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
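For the second model described above, in which SAM sees RGB and optical flow serves as the segmentation prompt, here is a minimal sketch under stated assumptions: OpenCV's Farneback flow is a stand-in for the paper's flow estimator, and a thresholded flow magnitude is converted into a box prompt for SAM. The frame and checkpoint paths are placeholders.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

prev = cv2.imread("frame_000.png")  # placeholder frame pair
curr = cv2.imread("frame_001.png")

# Farneback flow is a stand-in for the paper's flow estimator.
flow = cv2.calcOpticalFlowFarneback(
    cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
    cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY),
    None, 0.5, 3, 15, 3, 5, 1.2, 0,
)
magnitude = np.linalg.norm(flow, axis=-1)

# Turn the strongly moving region into a box prompt for SAM.
ys, xs = np.nonzero(magnitude > magnitude.mean() + 2.0 * magnitude.std())
assert xs.size > 0, "no motion detected in this frame pair"
box = np.array([xs.min(), ys.min(), xs.max(), ys.max()], dtype=np.float32)

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # placeholder
predictor = SamPredictor(sam)
predictor.set_image(cv2.cvtColor(curr, cv2.COLOR_BGR2RGB))
masks, _, _ = predictor.predict(box=box, multimask_output=False)
moving_object_mask = masks[0]
```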
- BrackishMOT: The Brackish Multi-Object Tracking Dataset [20.52569822945148]
There exist no publicly available annotated underwater multi-object tracking (MOT) datasets captured in turbid environments.
BrackishMOT consists of 98 sequences captured in the wild. Alongside the novel dataset, we present baseline results by training a state-of-the-art tracker.
We analyse the effects of including synthetic data during training and show that a combination of real and synthetic underwater training data can enhance tracking performance.
arXiv Detail & Related papers (2023-02-21T13:02:36Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Visual Tracking by TridentAlign and Context Embedding [71.60159881028432]
We propose novel TridentAlign and context embedding modules for Siamese network-based visual tracking methods.
The proposed tracker performs comparably to state-of-the-art trackers while running at real-time speed.
arXiv Detail & Related papers (2020-07-14T08:00:26Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos captured in diverse environments, averaging half a minute in length.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from those of existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)