PointTrack++ for Effective Online Multi-Object Tracking and Segmentation
- URL: http://arxiv.org/abs/2007.01549v1
- Date: Fri, 3 Jul 2020 08:28:37 GMT
- Title: PointTrack++ for Effective Online Multi-Object Tracking and Segmentation
- Authors: Zhenbo Xu, Wei Zhang, Xiao Tan, Wei Yang, Xiangbo Su, Yuchen Yuan,
Hongwu Zhang, Shilei Wen, Errui Ding, Liusheng Huang
- Abstract summary: Multiple-object tracking and segmentation (MOTS) is a novel computer vision task that aims to jointly perform multiple object tracking (MOT) and instance segmentation.
We present PointTrack++, an on-line framework for MOTS, which remarkably extends our recently proposed PointTrack framework.
The resulting framework achieves the state-of-the-art performance on the 5th BMTT MOTChallenge.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple-object tracking and segmentation (MOTS) is a novel computer vision
task that aims to jointly perform multiple object tracking (MOT) and instance
segmentation. In this work, we present PointTrack++, an effective on-line
framework for MOTS, which remarkably extends our recently proposed PointTrack
framework. To begin with, PointTrack adopts an efficient one-stage framework
for instance segmentation and learns instance embeddings by converting compact
image representations into unordered 2D point clouds. Compared with PointTrack,
our proposed PointTrack++ offers three major improvements. Firstly, in the
instance segmentation stage, we adopt a semantic segmentation decoder trained
with focal loss to improve the instance selection quality. Secondly, to further
boost the segmentation performance, we propose a data augmentation strategy
that copies and pastes instances into training images. Finally, we introduce a better
training strategy in the instance association stage to improve the
distinguishability of learned instance embeddings. The resulting framework
achieves the state-of-the-art performance on the 5th BMTT MOTChallenge.
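The first improvement above trains the semantic segmentation decoder with focal loss, which down-weights easy pixels so training concentrates on hard, misclassified ones. The following is a minimal per-pixel sketch of binary focal loss; the values gamma=2.0 and alpha=0.25 are common defaults used here for illustration, not values reported by the paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one pixel.

    p: predicted foreground probability in (0, 1)
    y: ground-truth label, 0 or 1

    The (1 - p_t)**gamma factor shrinks the loss of well-classified
    pixels, so hard examples dominate the gradient.
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, a confidently correct pixel (p = 0.9, y = 1) incurs a far smaller loss than a borderline one (p = 0.6, y = 1), which is the intended re-weighting effect.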
Related papers
- Segment Anything Meets Point Tracking [116.44931239508578]
This paper presents a novel method for point-centric interactive video segmentation, empowered by SAM and long-term point tracking.
We highlight the merits of point-based tracking through direct evaluation on the zero-shot open-world Unidentified Video Objects (UVO) benchmark.
Our experiments on popular video object segmentation and multi-object segmentation tracking benchmarks, including DAVIS, YouTube-VOS, and BDD100K, suggest that a point-based segmentation tracker yields better zero-shot performance and efficient interactions.
arXiv Detail & Related papers (2023-07-03T17:58:01Z) - Learning Inter-Superpoint Affinity for Weakly Supervised 3D Instance
Segmentation [10.968271388503986]
We propose a 3D instance segmentation framework that can achieve good performance by annotating only one point for each instance.
Our method achieves state-of-the-art performance in the weakly supervised point cloud instance segmentation task, and even outperforms some fully supervised methods.
arXiv Detail & Related papers (2022-10-11T15:22:22Z) - SiamMask: A Framework for Fast Online Object Tracking and Segmentation [96.61632757952292]
SiamMask is a framework to perform both visual object tracking and video object segmentation, in real-time, with the same simple method.
We show that it is possible to extend the framework to handle multiple object tracking and segmentation by simply re-using the multi-task model.
It yields real-time state-of-the-art results on visual-object tracking benchmarks, while at the same time demonstrating competitive performance at a high speed for video object segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T14:47:17Z) - A Discriminative Single-Shot Segmentation Network for Visual Object
Tracking [13.375369415113534]
We propose a discriminative single-shot segmentation tracker -- D3S2.
A single-shot network applies two target models with complementary geometric properties.
D3S2 outperforms the leading segmentation tracker SiamMask on video object segmentation benchmarks.
arXiv Detail & Related papers (2021-12-22T12:48:51Z) - Fast Video Object Segmentation With Temporal Aggregation Network and
Dynamic Template Matching [67.02962970820505]
We introduce "tracking-by-detection" into Video Object Segmentation (VOS).
We propose a new temporal aggregation network and a novel dynamic time-evolving template matching mechanism to achieve significantly improved performance.
We achieve new state-of-the-art performance on the DAVIS benchmark in both speed and accuracy without complicated bells and whistles, running at 0.14 seconds per frame with a J&F measure of 75.9%.
arXiv Detail & Related papers (2020-07-11T05:44:16Z) - Segment as Points for Efficient Online Multi-Object Tracking and
Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation into an unordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
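The tracking-by-points idea above starts by randomly sampling foreground pixel coordinates from an instance mask to form an unordered 2D point cloud, which is then fed to an embedding network. The sketch below shows only this sampling step; the function name and point budget are illustrative assumptions, not the paper's implementation.

```python
import random

def sample_point_cloud(mask_pixels, n_points=1000, seed=None):
    """Sample an unordered 2D point cloud from an instance mask.

    mask_pixels: list of (u, v) pixel coordinates belonging to the instance
    n_points: fixed point budget expected by the embedding network

    Sampling is without replacement when the mask is large enough,
    and with replacement otherwise, so the output size is constant.
    """
    rng = random.Random(seed)
    if len(mask_pixels) >= n_points:
        return rng.sample(mask_pixels, n_points)
    return [rng.choice(mask_pixels) for _ in range(n_points)]
```

Because the points are unordered, the downstream embedding network must be permutation-invariant (e.g. built from per-point MLPs with a symmetric pooling step), which is the standard design for point-cloud inputs.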
arXiv Detail & Related papers (2020-07-03T08:29:35Z) - Few-shot 3D Point Cloud Semantic Segmentation [138.80825169240302]
We propose a novel attention-aware multi-prototype transductive few-shot point cloud semantic segmentation method.
Our proposed method shows significant and consistent improvements compared to baselines in different few-shot point cloud semantic segmentation settings.
arXiv Detail & Related papers (2020-06-22T08:05:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.