Segment and Track Anything
- URL: http://arxiv.org/abs/2305.06558v1
- Date: Thu, 11 May 2023 04:33:08 GMT
- Title: Segment and Track Anything
- Authors: Yangming Cheng, Liulei Li, Yuanyou Xu, Xiaodi Li, Zongxin Yang,
Wenguan Wang, Yi Yang
- Abstract summary: This report presents a framework called Segment And Track Anything (SAM-Track) that allows users to precisely and effectively segment and track any object in a video. It can be used across an array of fields, from drone technology and autonomous driving to medical imaging, augmented reality, and biological analysis.
- Score: 57.20918630166862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This report presents a framework called Segment And Track Anything (SAM-Track)
that allows users to precisely and effectively segment and track any object in
a video. Additionally, SAM-Track employs multimodal interaction methods that
enable users to select multiple objects in videos for tracking, corresponding
to their specific requirements. These interaction methods comprise click,
stroke, and text, each possessing unique benefits and capable of being employed
in combination. As a result, SAM-Track can be used across an array of fields,
from drone technology and autonomous driving to medical imaging, augmented
reality, and biological analysis. SAM-Track amalgamates the Segment Anything Model
(SAM), an interactive key-frame segmentation model, with our proposed AOT-based
tracking model (DeAOT), which secured 1st place in four tracks of the VOT 2022
challenge, to facilitate object tracking in video. In addition, SAM-Track
incorporates Grounding-DINO, which enables the framework to support text-based
interaction. We have demonstrated the remarkable capabilities of SAM-Track on
DAVIS-2016 Val (92.0%) and DAVIS-2017 Test (79.2%), and its practicability in
diverse applications. The project page is available at:
https://github.com/z-x-yang/Segment-and-Track-Anything.
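The abstract describes a three-stage composition: Grounding-DINO turns a text prompt into candidate boxes, SAM converts those boxes (or click/stroke input) into a pixel-accurate mask on a key frame, and DeAOT propagates that mask through the remaining frames. Below is a minimal sketch of that composition; the `grounding_dino`, `sam`, and `deaot` wrapper objects and their methods are hypothetical stand-ins for illustration, not the actual SAM-Track API (see the project page above for the real implementation).

```python
# Hypothetical sketch of the SAM-Track pipeline described in the abstract.
# The wrapper objects and their methods (detect, segment, initialize,
# propagate) are illustrative assumptions, NOT the real SAM-Track API.

def segment_and_track(frames, text_prompt, grounding_dino, sam, deaot):
    """frames: list of HxWx3 uint8 images; returns per-frame object masks."""
    key_frame = frames[0]

    # 1) Text interaction: Grounding-DINO maps the prompt to candidate
    #    bounding boxes on the key frame (assumed method).
    boxes = grounding_dino.detect(key_frame, text_prompt)

    # 2) Key-frame segmentation: SAM turns each box (or a click/stroke
    #    prompt) into a pixel-accurate mask (assumed method).
    key_masks = [sam.segment(key_frame, box=b) for b in boxes]

    # 3) Tracking: DeAOT is initialized with the key-frame masks and
    #    propagates them frame by frame (assumed methods).
    deaot.initialize(key_frame, key_masks)
    all_masks = [key_masks]
    for frame in frames[1:]:
        all_masks.append(deaot.propagate(frame))
    return all_masks
```

Under this sketch, the click and stroke interaction modes mentioned in the abstract would simply replace step 1 with user-supplied points or scribbles on the key frame, leaving the segmentation and propagation stages unchanged.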
Related papers
- Matching Anything by Segmenting Anything [109.2507425045143]
We propose MASA, a novel method for robust instance association learning.
MASA learns instance-level correspondence through exhaustive data transformations.
We show that MASA achieves even better performance than state-of-the-art methods trained with fully annotated in-domain video sequences.
arXiv Detail & Related papers (2024-06-06T16:20:07Z)
- CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation [31.167405688707575]
We propose a framework for instance-level visual analysis on video frames.
It can simultaneously conduct object detection, instance segmentation, and multi-object tracking.
We evaluate the proposed method extensively on KITTI MOTS and MOTS Challenge datasets.
arXiv Detail & Related papers (2023-11-02T04:32:24Z)
- OmniTracker: Unifying Object Tracking by Tracking-with-Detection [119.51012668709502]
OmniTracker is presented to resolve all the tracking tasks with a fully shared network architecture, model weights, and inference pipeline.
Experiments on 7 tracking datasets, including LaSOT, TrackingNet, DAVIS16-17, MOT17, MOTS20, and YTVIS19, demonstrate that OmniTracker achieves on-par or even better results than both task-specific and unified tracking models.
arXiv Detail & Related papers (2023-03-21T17:59:57Z)
- DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes [74.64897845999677]
We introduce a new cross-view multi-object tracking dataset for DIVerse Open scenes with densely tracked pedestrians.
Our DIVOTrack has fifteen distinct scenarios and 953 cross-view tracks, surpassing all cross-view multi-object tracking datasets currently available.
Furthermore, we provide a novel baseline cross-view tracking method with a unified joint detection and cross-view tracking framework named CrossMOT.
arXiv Detail & Related papers (2023-02-15T14:10:42Z)
- BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video [58.71785546245467]
Multiple existing benchmarks involve tracking and segmenting objects in video, yet there is little interaction between them due to the use of disparate benchmark datasets and metrics.
We propose BURST, a dataset which contains thousands of diverse videos with high-quality object masks.
All tasks are evaluated using the same data and comparable metrics, which enables researchers to consider them in unison.
arXiv Detail & Related papers (2022-09-25T01:27:35Z)
- Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in our UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
arXiv Detail & Related papers (2022-03-29T01:38:49Z)