Revisiting Click-based Interactive Video Object Segmentation
- URL: http://arxiv.org/abs/2203.01784v1
- Date: Thu, 3 Mar 2022 15:55:14 GMT
- Title: Revisiting Click-based Interactive Video Object Segmentation
- Authors: Stephane Vujasinovic, Sebastian Bullinger, Stefan Becker, Norbert
Scherer-Negenborn, Michael Arens and Rainer Stiefelhagen
- Abstract summary: CiVOS builds on de-coupled modules reflecting user interaction and mask propagation.
The approach is extensively evaluated on the popular interactive DAVIS dataset.
The presented CiVOS pipeline achieves competitive results while requiring a lower user workload.
- Score: 24.114405100879278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While current methods for interactive Video Object Segmentation (iVOS) rely
on scribble-based interactions to generate precise object masks, we propose a
Click-based interactive Video Object Segmentation (CiVOS) framework to simplify
the required user workload as much as possible. CiVOS builds on de-coupled
modules reflecting user interaction and mask propagation. The interaction
module converts click-based interactions into an object mask, which is then
propagated to the remaining frames by the propagation module. Additional user
interactions allow for a refinement of the object mask. The approach is
extensively evaluated on the popular interactive DAVIS dataset, with the
necessary replacement of scribble-based interactions by click-based
counterparts. We consider several strategies for generating clicks during our
evaluation to reflect various user inputs and adjust the DAVIS performance
metric to perform a hardware-independent comparison. The presented CiVOS
pipeline achieves competitive results while requiring a lower user
workload.
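The decoupled design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names clicks_to_mask, propagate_mask and segment_video are hypothetical, the interaction module is replaced by a toy rule that paints a disk around each click, and the propagation module simply copies the mask to every frame. A real system would use learned networks for both steps.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]        # (row, col)
Mask = Set[Pixel]              # set of foreground pixels
Click = Tuple[int, int, bool]  # (row, col, is_positive)


def clicks_to_mask(clicks: List[Click], height: int, width: int,
                   radius: int = 2) -> Mask:
    """Toy interaction module (assumption): a positive click adds a disk
    of foreground pixels, a negative click removes one."""
    mask: Mask = set()
    for row, col, is_positive in clicks:
        disk = {
            (r, c)
            for r in range(max(0, row - radius), min(height, row + radius + 1))
            for c in range(max(0, col - radius), min(width, col + radius + 1))
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2
        }
        mask = (mask | disk) if is_positive else (mask - disk)
    return mask


def propagate_mask(mask: Mask, num_frames: int) -> List[Mask]:
    """Toy propagation module (assumption): copy the mask unchanged to
    every frame of the video."""
    return [set(mask) for _ in range(num_frames)]


def segment_video(rounds: List[Tuple[int, List[Click]]], num_frames: int,
                  height: int, width: int) -> List[Mask]:
    """Each round adds clicks on one frame, rebuilds that frame's mask from
    all clicks seen so far, and propagates it to the remaining frames."""
    masks: List[Mask] = [set() for _ in range(num_frames)]
    clicks_per_frame: Dict[int, List[Click]] = {}
    for frame_index, clicks in rounds:
        clicks_per_frame.setdefault(frame_index, []).extend(clicks)
        frame_mask = clicks_to_mask(clicks_per_frame[frame_index], height, width)
        masks = propagate_mask(frame_mask, num_frames)
    return masks


# Two interaction rounds on frame 0 of a 3-frame, 10x10 video.
rounds = [
    (0, [(5, 5, True)]),   # round 1: one positive click selects the object
    (0, [(5, 8, False)]),  # round 2: a negative click trims the mask
]
masks = segment_video(rounds, num_frames=3, height=10, width=10)
```

The point of the sketch is the separation of concerns: the interaction step only ever sees clicks on a single frame, and the propagation step only ever sees a finished mask, so either part can be swapped out without touching the other.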
Related papers
- Learning from Exemplars for Interactive Image Segmentation [15.37506525730218]
We introduce novel interactive segmentation frameworks for both a single object and multiple objects in the same category.
Our model reduces users' labor by around 15%, requiring two fewer clicks to achieve target IoUs of 85% and 90%.
arXiv Detail & Related papers (2024-06-17T12:38:01Z)
- Training-Free Robust Interactive Video Object Segmentation [82.05906654403684]
We propose a training-free prompt tracking framework for interactive video object segmentation (I-PT).
We jointly adopt sparse point and box tracking, filtering out unstable points and capturing object-wise information.
Our framework has demonstrated robust zero-shot video segmentation results on popular VOS datasets.
arXiv Detail & Related papers (2024-06-08T14:25:57Z)
- FocSAM: Delving Deeply into Focused Objects in Segmenting Anything [58.042354516491024]
The Segment Anything Model (SAM) marks a notable milestone in segmentation models.
We propose FocSAM with a pipeline redesigned on two pivotal aspects.
First, we propose Dynamic Window Multi-head Self-Attention (Dwin-MSA) to dynamically refocus SAM's image embeddings on the target object.
Second, we propose Pixel-wise Dynamic ReLU (P-DyReLU) to enable sufficient integration of interactive information from a few initial clicks.
arXiv Detail & Related papers (2024-05-29T02:34:13Z)
- Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation [70.93295323156876]
We propose a framework that can accept multiple frames simultaneously and explore synergistic interaction across frames (SIAF).
Our SwinB-SIAF achieves new state-of-the-art performance on DAVIS 2017 (89.6%, J&F@60).
Our R50-SIAF is more than 3× faster than the state-of-the-art competitor under challenging multi-object scenarios.
arXiv Detail & Related papers (2024-01-23T04:19:15Z)
- DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer [58.95404214273222]
Most state-of-the-art instance segmentation methods rely on large amounts of pixel-precise ground-truth for training.
We introduce a more efficient approach, called DynaMITe, in which we represent user interactions as spatio-temporal queries.
Our architecture also alleviates any need to re-compute image features during refinement, and requires fewer interactions for segmenting multiple instances in a single image.
arXiv Detail & Related papers (2023-04-13T16:57:02Z)
- InterFormer: Real-time Interactive Image Segmentation [80.45763765116175]
Interactive image segmentation enables annotators to efficiently perform pixel-level annotation for segmentation tasks.
The existing interactive segmentation pipeline suffers from inefficient computations of interactive models.
We propose a method named InterFormer that follows a new pipeline to address these issues.
arXiv Detail & Related papers (2023-04-06T08:57:00Z)
- Contour-based Interactive Segmentation [4.164728134421114]
We consider a natural form of user interaction as a loose contour, and introduce a contour-based interactive segmentation method.
We demonstrate that a single contour provides the same accuracy as multiple clicks, thus reducing the required number of user interactions.
arXiv Detail & Related papers (2023-02-13T13:35:26Z)
- Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion [68.45737688496654]
We present a modular interactive VOS framework which decouples interaction-to-mask and mask propagation.
We show that our method outperforms current state-of-the-art algorithms while requiring fewer frame interactions.
arXiv Detail & Related papers (2021-03-14T14:39:08Z)
- Localized Interactive Instance Segmentation [24.55415554455844]
We propose a clicking scheme wherein user interactions are restricted to the proximity of the object.
We demonstrate the effectiveness of our proposed clicking scheme and localization strategy through detailed experimentation.
arXiv Detail & Related papers (2020-10-18T23:24:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.