Collaborative Video Object Segmentation by Foreground-Background
Integration
- URL: http://arxiv.org/abs/2003.08333v2
- Date: Thu, 23 Jul 2020 11:31:22 GMT
- Title: Collaborative Video Object Segmentation by Foreground-Background
Integration
- Authors: Zongxin Yang, Yunchao Wei, Yi Yang
- Abstract summary: We propose the Collaborative video object segmentation by Foreground-Background Integration (CFBI) approach.
Our CFBI implicitly encourages the feature embeddings of the target foreground object and its corresponding background to be contrastive, promoting the segmentation results accordingly.
Our CFBI achieves J&F scores of 89.4%, 81.9%, and 81.4% on DAVIS 2016, DAVIS 2017, and YouTube-VOS, respectively, outperforming all other state-of-the-art methods.
- Score: 77.71512243438329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the principles of embedding learning to tackle the
challenging semi-supervised video object segmentation task. Different from previous
practices that only explore embedding learning using pixels from the foreground
object(s), we consider that the background should be equally treated and thus propose
the Collaborative video object segmentation by Foreground-Background Integration
(CFBI) approach. Our CFBI implicitly encourages the feature embeddings of the
target foreground object and its corresponding background to be contrastive,
promoting the segmentation results accordingly. With feature embeddings from
both the foreground and the background, our CFBI performs the matching process between
the reference and the predicted sequence at both the pixel and instance levels,
making CFBI robust to various object scales. We conduct extensive
experiments on three popular benchmarks, i.e., DAVIS 2016, DAVIS 2017, and
YouTube-VOS. Our CFBI achieves J&F scores of 89.4%, 81.9%, and
81.4%, respectively, outperforming all other state-of-the-art methods.
Code: https://github.com/z-x-yang/CFBI.
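The core idea above, matching each query pixel against reference embeddings of both the foreground and the background rather than the foreground alone, can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); the embeddings, the squared-distance metric, and the softmax conversion below are simplifying assumptions chosen for clarity.

```python
import math

def dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fg_probability(pixel, fg_refs, bg_refs):
    """Soft foreground score for one query pixel embedding.

    The pixel's nearest foreground reference is compared against its
    nearest background reference, so the background is treated on equal
    footing with the foreground (the collaborative idea in CFBI).
    """
    d_fg = min(dist(pixel, r) for r in fg_refs)  # closest foreground match
    d_bg = min(dist(pixel, r) for r in bg_refs)  # closest background match
    # Softmax over negated distances: a small foreground distance
    # relative to the background distance yields a high probability.
    e_fg, e_bg = math.exp(-d_fg), math.exp(-d_bg)
    return e_fg / (e_fg + e_bg)

# Toy example: reference embeddings from the annotated first frame.
fg_refs = [[1.0, 1.0], [0.9, 1.1]]      # pixels labelled foreground
bg_refs = [[-1.0, -1.0], [-0.8, -1.2]]  # pixels labelled background
query = [0.95, 1.05]                    # a pixel from the current frame
print(fg_probability(query, fg_refs, bg_refs))  # close to 1.0
```

Because both distance terms enter the softmax, the decision boundary is shaped by the background embeddings as well, which is what pushes the two sets of features apart.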
Related papers
- 3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation [63.199793919573295]
Video Object Segmentation (VOS) is a vital task in computer vision, focusing on distinguishing foreground objects from the background across video frames.
Our work draws inspiration from the Cutie model, and we investigate the effects of object memory, the total number of memory frames, and input resolution on segmentation performance.
arXiv Detail & Related papers (2024-06-06T00:56:25Z) - Look Before You Match: Instance Understanding Matters in Video Object
Segmentation [114.57723592870097]
In this paper, we argue that instance understanding matters in video object segmentation (VOS).
We present a two-branch network for VOS, where the query-based instance segmentation (IS) branch delves into the instance details of the current frame and the VOS branch performs spatial-temporal matching with the memory bank.
We employ well-learned object queries from the IS branch to inject instance-specific information into the query key, with which instance-augmented matching is further performed.
arXiv Detail & Related papers (2022-12-13T18:59:59Z) - Collaborative Attention Memory Network for Video Object Segmentation [3.8520227078236013]
We propose a Collaborative Attention Memory Network with an enhanced segmentation head.
We also propose an ensemble network to combine the STM network with these newly refined CFBI networks.
Finally, we evaluate our approach on the 2021 YouTube-VOS challenge, where we obtain 6th place with an overall score of 83.5%.
arXiv Detail & Related papers (2022-05-17T03:40:11Z) - Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene
Segmentation [58.74791043631219]
We propose a novel framework STswinCL that explores the complementary intra- and inter-video relations to boost segmentation performance.
We extensively validate our approach on two public surgical video benchmarks, including EndoVis18 Challenge and CaDIS dataset.
Experimental results demonstrate the promising performance of our method, which consistently exceeds previous state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-29T05:52:23Z) - Learning Position and Target Consistency for Memory-based Video Object
Segmentation [39.787966275016906]
We learn a position and target consistency framework for memory-based video object segmentation.
It applies the memory mechanism to retrieve pixels globally, and meanwhile learns position consistency for more reliable segmentation.
Experiments show that our LCM achieves state-of-the-art performance on both the DAVIS and YouTube-VOS benchmarks.
arXiv Detail & Related papers (2021-04-09T12:22:37Z) - Collaborative Video Object Segmentation by Multi-Scale
Foreground-Background Integration [77.71512243438329]
We propose a Collaborative video object segmentation by Foreground-Background Integration (CFBI) approach.
CFBI separates the feature embedding into the foreground object region and its corresponding background region, implicitly promoting them to be more contrastive and improving the segmentation results accordingly.
Based on CFBI, we introduce a multi-scale matching structure and propose an Atrous Matching strategy, resulting in a more robust and efficient framework, CFBI+.
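The Atrous Matching strategy summarized above reduces matching cost by sampling reference pixels with a dilation rate, analogous to atrous (dilated) convolution. The sketch below only illustrates that subsampling intuition; the grid layout and the choice of keeping every `rate`-th embedding are assumptions for illustration, not the CFBI+ implementation.

```python
def atrous_subsample(ref_embeddings, rate):
    """Subsample a 2-D grid of reference embeddings with a dilation rate.

    Keeping every `rate`-th pixel along each spatial dimension shrinks
    the number of reference pixels to match against by roughly a factor
    of rate**2 while still covering the whole spatial extent.
    """
    return [row[::rate] for row in ref_embeddings[::rate]]

# Toy 6x6 grid where each "embedding" is just its (row, col) coordinate.
grid = [[(r, c) for c in range(6)] for r in range(6)]
sub = atrous_subsample(grid, 2)
print(len(sub), len(sub[0]))  # 3 3: a 6x6 grid shrinks to 3x3
```

A larger rate trades matching accuracy for speed, which is why a multi-scale structure can apply different rates at different feature scales.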
arXiv Detail & Related papers (2020-10-13T13:06:10Z) - Learning Discriminative Feature with CRF for Unsupervised Video Object
Segmentation [34.1031534327244]
We introduce discriminative feature network (DFNet) to address the unsupervised video object segmentation task.
DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4%.
DFNet is also applied to the image object co-segmentation task.
arXiv Detail & Related papers (2020-08-04T01:53:56Z) - Learning What to Learn for Video Object Segmentation [157.4154825304324]
We introduce an end-to-end trainable VOS architecture that integrates a differentiable few-shot learning module.
This internal learner is designed to predict a powerful parametric model of the target.
We set a new state-of-the-art on the large-scale YouTube-VOS 2018 dataset by achieving an overall score of 81.5.
arXiv Detail & Related papers (2020-03-25T17:58:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.