3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation
- URL: http://arxiv.org/abs/2406.04002v2
- Date: Fri, 7 Jun 2024 00:50:38 GMT
- Title: 3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation
- Authors: Ruipu Wu, Jifei Che, Han Li, Chengjing Wu, Ting Liu, Luoqi Liu,
- Abstract summary: We introduce a comprehensive approach centered on the query-wise ensemble, supplemented by additional techniques.
Our proposed approach achieved a VPQ score of 57.01 on the VIPSeg test set, and ranked 3rd in the VPS track of the 3rd Pixel-level Video Understanding in the Wild Challenge.
- Score: 19.071113992267826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video panoptic segmentation is an advanced task that extends panoptic segmentation by applying its concept to video sequences. To address the challenge of video panoptic segmentation in diverse conditions, we utilize DVIS++ as our baseline model and enhance it by introducing a comprehensive approach centered on the query-wise ensemble, supplemented by additional techniques. Our proposed approach achieved a VPQ score of 57.01 on the VIPSeg test set, and ranked 3rd in the VPS track of the 3rd Pixel-level Video Understanding in the Wild Challenge.
Related papers
- Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation [98.11452697097539]
In this technical report, we detail our first-place solution for the semantic segmentation track of the 2024 Waymo Open Dataset Challenge.
We significantly enhanced the performance of Point Transformer V3 on the benchmark by implementing cutting-edge, plug-and-play training and inference technologies.
This approach secured us the top position on the Waymo Open Dataset segmentation leaderboard, markedly outperforming other entries.
arXiv Detail & Related papers (2024-07-21T22:08:52Z)
- 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation [81.50620771207329]
We investigate the effectiveness of static-dominant data and frame sampling on referring video object segmentation (RVOS).
Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge.
arXiv Detail & Related papers (2024-06-11T08:05:26Z)
- 1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation [11.331198234997714]
The third Pixel-level Video Understanding in the Wild (PVUW CVPR 2024) challenge aims to advance the state of the art in video understanding.
This paper details the research work that won 1st place in the PVUW'24 VPS challenge.
Our solution stands on the shoulders of a giant vision transformer model (DINOv2 ViT-g) and proven multi-stage Decoupled Video Instance frameworks.
arXiv Detail & Related papers (2024-06-08T04:43:08Z)
- 2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation [12.274092278786966]
Video Panoptic Segmentation (VPS) aims to simultaneously classify, track, and segment all objects in a video.
We propose a robust integrated video panoptic segmentation solution.
Our method achieves state-of-the-art performance with VPQ scores of 56.36 and 57.12 in the development and test phases, respectively.
arXiv Detail & Related papers (2024-06-01T17:03:16Z)
- VideoPrism: A Foundational Visual Encoder for Video Understanding [90.01845485201746]
VideoPrism is a general-purpose video encoder that tackles diverse video understanding tasks with a single frozen model.
We pretrain VideoPrism on a heterogeneous corpus containing 36M high-quality video-caption pairs and 582M video clips with noisy parallel text.
We extensively test VideoPrism on four broad groups of video understanding tasks, from web video question answering to CV for science, achieving state-of-the-art performance on 31 out of 33 video understanding benchmarks.
arXiv Detail & Related papers (2024-02-20T18:29:49Z)
- 3rd Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation [10.04177400017471]
We propose a robust integrated video panoptic segmentation solution.
In our solution, we represent both semantic and instance targets as a set of queries.
We then combine these queries with video features extracted by neural networks to predict segmentation masks.
arXiv Detail & Related papers (2023-06-11T19:44:40Z)
- 1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation [25.235404527487784]
Video panoptic segmentation is a challenging task that serves as the cornerstone of numerous downstream applications.
We believe that the decoupling strategy proposed by DVIS enables more effective utilization of temporal information for both "thing" and "stuff" objects.
Our method achieved a VPQ score of 51.4 and 53.7 in the development and test phases, respectively, and ranked 1st in the VPS track of the 2nd PVUW Challenge.
arXiv Detail & Related papers (2023-06-07T01:24:48Z)
- 3rd Place Solution for PVUW2023 VSS Track: A Large Model for Semantic Segmentation on VSPW [68.56017675820897]
In this paper, we introduce our 3rd place solution for the PVUW2023 VSS track.
We have explored various image-level visual backbones and segmentation heads to tackle the problem of video semantic segmentation.
arXiv Detail & Related papers (2023-06-04T07:50:38Z)
- A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications.
Deep learning based approaches have been dedicated to video segmentation and delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
- Video Panoptic Segmentation [117.08520543864054]
We propose and explore a new video extension of panoptic segmentation, called video panoptic segmentation.
To invigorate research on this new task, we present two types of video panoptic datasets.
We propose a novel video panoptic segmentation network (VPSNet) which jointly predicts object classes, bounding boxes, masks, instance id tracking, and semantic segmentation in video frames.
arXiv Detail & Related papers (2020-06-19T19:35:47Z)