Related papers: 1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation

URL: http://arxiv.org/abs/2306.04091v2
Date: Thu, 8 Jun 2023 08:19:27 GMT
Title: 1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation
Authors: Tao Zhang and Xingye Tian and Haoran Wei and Yu Wu and Shunping Ji and Xuebo Wang and Xin Tao and Yuan Zhang and Pengfei Wan
Abstract summary: Video panoptic segmentation is a challenging task that serves as the cornerstone of numerous downstream applications. We believe that the decoupling strategy proposed by DVIS enables more effective utilization of temporal information for both "thing" and "stuff" objects. Our method achieved VPQ scores of 51.4 and 53.7 in the development and test phases, respectively, and ranked 1st in the VPS track of the 2nd PVUW Challenge.
Score: 25.235404527487784
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Video panoptic segmentation is a challenging task that serves as the cornerstone of numerous downstream applications, including video editing and autonomous driving. We believe that the decoupling strategy proposed by DVIS enables more effective utilization of temporal information for both "thing" and "stuff" objects. In this report, we validate the effectiveness of the decoupling strategy in video panoptic segmentation. Our method achieved VPQ scores of 51.4 and 53.7 in the development and test phases, respectively, and ranked 1st in the VPS track of the 2nd PVUW Challenge. The code is available at https://github.com/zhang-tao-whu/DVIS

Related papers

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild [164.8093566483583]
This report provides a comprehensive overview of the 4th Pixel-level Video Understanding in the Wild (PVUW) Challenge, held in conjunction with CVPR 2025. The challenge features two tracks: MOSE, which focuses on complex scene video object segmentation, and MeViS, which targets motion-guided, language-based video segmentation.
arXiv Detail & Related papers (2025-04-15T16:02:47Z)
CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track [35.70400178294299]
We introduce the solution of our team "yuanjie" for video object segmentation in the 6th LSVOS Challenge VOS Track at ECCV 2024. We believe that our proposed CSS-Segment will perform better in videos of complex object motion and long-term presentation. Our method achieved a J&F score of 80.84 in the test phase and ranked 2nd in the 6th LSVOS Challenge VOS Track at ECCV 2024.
arXiv Detail & Related papers (2024-08-24T13:47:56Z)
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation [81.50620771207329]
We investigate the effectiveness of static-dominant data and frame sampling on referring video object segmentation (RVOS). Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge.
arXiv Detail & Related papers (2024-06-11T08:05:26Z)
3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation [19.071113992267826]
We introduce a comprehensive approach centered on the query-wise ensemble, supplemented by additional techniques. Our proposed approach achieved a VPQ score of 57.01 on the VIPSeg test set, and ranked 3rd in the VPS track of the 3rd Pixel-level Video Understanding in the Wild Challenge.
arXiv Detail & Related papers (2024-06-06T12:22:56Z)
3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation [63.199793919573295]
Video Object Segmentation (VOS) is a vital task in computer vision, focusing on distinguishing foreground objects from the background across video frames. Our work draws inspiration from the Cutie model, and we investigate the effects of object memory, the total number of memory frames, and input resolution on segmentation performance.
arXiv Detail & Related papers (2024-06-06T00:56:25Z)
2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation [12.274092278786966]
Video Panoptic Segmentation (VPS) aims to simultaneously classify, track, and segment all objects in a video. We propose a robust integrated video panoptic segmentation solution. Our method achieves state-of-the-art performance with VPQ scores of 56.36 and 57.12 in the development and test phases, respectively.
arXiv Detail & Related papers (2024-06-01T17:03:16Z)
1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation [25.587080499097425]
We present further improvements to the SOTA VIS method, DVIS. We introduce a denoising training strategy for the trainable tracker, allowing it to achieve more stable and accurate object tracking in complex and long videos. Our method achieves 57.9 AP and 56.0 AP in the development and test phases, respectively, and ranked 1st in the VIS track of the 5th LSVOS Challenge.
arXiv Detail & Related papers (2023-08-28T08:15:43Z)
3rd Place Solution for PVUW2023 VSS Track: A Large Model for Semantic Segmentation on VSPW [68.56017675820897]
In this paper, we introduce the 3rd place solution for the PVUW2023 VSS track. We have explored various image-level visual backbones and segmentation heads to tackle the problem of video semantic segmentation.
arXiv Detail & Related papers (2023-06-04T07:50:38Z)
The Runner-up Solution for YouTube-VIS Long Video Challenge 2022 [72.13080661144761]
We adopt the previously proposed online video instance segmentation method IDOL for this challenge. We use pseudo labels to further help contrastive learning, so as to obtain more temporally consistent instance embeddings. The proposed method obtains 40.2 AP on the YouTube-VIS 2022 long video dataset and ranked second in this challenge.
arXiv Detail & Related papers (2022-11-18T01:40:59Z)
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation [90.26723865198348]
We present PolyphonicFormer, a vision transformer to unify all the sub-tasks under the DVPS task. Our method explores the relationship between depth estimation and panoptic segmentation via query-based learning. Our method ranks 1st on the ICCV-2021 BMTT Challenge video + depth track.
arXiv Detail & Related papers (2021-12-05T14:31:47Z)
Video Panoptic Segmentation [117.08520543864054]
We propose and explore a new video extension of panoptic segmentation, called video panoptic segmentation. To invigorate research on this new task, we present two types of video panoptic datasets. We propose a novel video panoptic segmentation network (VPSNet) which jointly predicts object classes, bounding boxes, masks, instance id tracking, and semantic segmentation in video frames.
arXiv Detail & Related papers (2020-06-19T19:35:47Z)