Matching Anything by Segmenting Anything
- URL: http://arxiv.org/abs/2406.04221v1
- Date: Thu, 6 Jun 2024 16:20:07 GMT
- Title: Matching Anything by Segmenting Anything
- Authors: Siyuan Li, Lei Ke, Martin Danelljan, Luigi Piccinelli, Mattia Segu, Luc Van Gool, Fisher Yu
- Abstract summary: We propose MASA, a novel method for robust instance association learning.
MASA learns instance-level correspondence through exhaustive data transformations.
We show that MASA achieves even better performance than state-of-the-art methods trained with fully annotated in-domain video sequences.
- Score: 109.2507425045143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The robust association of the same objects across video frames in complex scenes is crucial for many applications, especially Multiple Object Tracking (MOT). Current methods predominantly rely on labeled domain-specific video datasets, which limits the cross-domain generalization of learned similarity embeddings. We propose MASA, a novel method for robust instance association learning, capable of matching any objects within videos across diverse domains without tracking labels. Leveraging the rich object segmentation from the Segment Anything Model (SAM), MASA learns instance-level correspondence through exhaustive data transformations. We treat the SAM outputs as dense object region proposals and learn to match those regions from a vast image collection. We further design a universal MASA adapter which can work in tandem with foundational segmentation or detection models and enable them to track any detected objects. Those combinations present strong zero-shot tracking ability in complex domains. Extensive tests on multiple challenging MOT and MOTS benchmarks indicate that the proposed method, using only unlabeled static images, achieves even better performance than state-of-the-art methods trained with fully annotated in-domain video sequences, in zero-shot association. Project Page: https://matchinganything.github.io/
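The idea of learning correspondence from transformations of a single static image can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the transformations, the loss, or the architecture, so the horizontal flip, the rescaling, the InfoNCE-style contrastive loss, and every name below (`hflip_box`, `scale_box`, `contrastive_loss`, `proposals`) are assumptions chosen for illustration. The boxes stand in for region proposals such as those SAM might produce. Because both views come from the same image, proposal i in one view is known to match proposal i in the other, which gives association supervision without tracking labels.

```python
import math

def hflip_box(box, width):
    """Horizontally flip an (x1, y1, x2, y2) box inside an image of the given width."""
    x1, y1, x2, y2 = box
    return (width - x2, y1, width - x1, y2)

def scale_box(box, factor):
    """Rescale a box as if the whole image had been resized by `factor`."""
    return tuple(v * factor for v in box)

def normalize(vec):
    """Return the vector scaled to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def contrastive_loss(emb_a, emb_b, temperature=0.1):
    """InfoNCE-style loss: embedding i in view A should match embedding i in view B."""
    a = [normalize(v) for v in emb_a]
    b = [normalize(v) for v in emb_b]
    total = 0.0
    for i, va in enumerate(a):
        # Cosine similarity of region i in view A to every region in view B.
        logits = [sum(x * y for x, y in zip(va, vb)) / temperature for vb in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)

# Hypothetical region proposals on one 100-pixel-wide static image.
proposals = [(10, 20, 50, 80), (60, 10, 90, 40)]

# Two differently transformed views of the same image; index i pairs with index i.
view_a = [hflip_box(b, width=100) for b in proposals]
view_b = [scale_box(b, factor=0.5) for b in proposals]
pairs = list(zip(view_a, view_b))
```

In a real pipeline the embeddings passed to `contrastive_loss` would come from a learned network applied to each region in each view; here the function only shows that the loss is small when matching regions have similar embeddings and large when they do not.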
Related papers
- 1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation [72.54357831350762]
We propose a semantic embedding video object segmentation model and use the salient features of objects as query representations.
We trained our model on a large-scale video object segmentation dataset.
Our model achieves first place (84.45%) in the test set of Complex Video Object Challenge.
arXiv Detail & Related papers (2024-06-07T03:13:46Z) - OMG-Seg: Is One Model Good Enough For All Segmentation? [83.17068644513144]
OMG-Seg is a transformer-based encoder-decoder architecture with task-specific queries and outputs.
We show that OMG-Seg can support over ten distinct segmentation tasks and yet significantly reduce computational and parameter overhead.
arXiv Detail & Related papers (2024-01-18T18:59:34Z) - Segment Anything Meets Point Tracking [116.44931239508578]
This paper presents a novel method for point-centric interactive video segmentation, empowered by SAM and long-term point tracking.
We highlight the merits of point-based tracking through direct evaluation on the zero-shot open-world Unidentified Video Objects (UVO) benchmark.
Our experiments on popular video object segmentation and multi-object segmentation tracking benchmarks, including DAVIS, YouTube-VOS, and BDD100K, suggest that a point-based segmentation tracker yields better zero-shot performance and efficient interactions.
arXiv Detail & Related papers (2023-07-03T17:58:01Z) - BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video [58.71785546245467]
Multiple existing benchmarks involve tracking and segmenting objects in video.
There is little interaction between them due to the use of disparate benchmark datasets and metrics.
We propose BURST, a dataset which contains thousands of diverse videos with high-quality object masks.
All tasks are evaluated using the same data and comparable metrics, which enables researchers to consider them in unison.
arXiv Detail & Related papers (2022-09-25T01:27:35Z) - Multi-Object Tracking and Segmentation via Neural Message Passing [0.0]
Graphs offer a natural way to formulate Multiple Object Tracking (MOT) and Multiple Object Tracking and Segmentation (MOTS).
We exploit the classical network flow formulation of MOT to define a fully differentiable framework based on Message Passing Networks (MPNs).
We achieve state-of-the-art results for both tracking and segmentation in several publicly available datasets.
arXiv Detail & Related papers (2022-07-15T13:03:47Z) - Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking [20.66906781151]
In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene.
Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory.
We propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online.
arXiv Detail & Related papers (2021-01-28T18:12:39Z) - CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation [67.17625278621134]
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video.
Previous approaches only utilize single-frame features for the detection, segmentation, and tracking of objects.
We propose a novel comprehensive feature aggregation approach (CompFeat) to refine features at both frame-level and object-level with temporal and spatial context information.
arXiv Detail & Related papers (2020-12-07T00:31:42Z) - End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes image-sequence/video as input and outputs directly the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results based on the MOT16 and MOT17 benchmarks show that our proposed on-line tracker achieved state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)