Weakly Supervised Instance Segmentation using Motion Information via
Optical Flow
- URL: http://arxiv.org/abs/2202.13006v1
- Date: Fri, 25 Feb 2022 22:41:54 GMT
- Authors: Jun Ikeda and Junichiro Mori
- Abstract summary: We propose a two-stream encoder that leverages appearance and motion features extracted from images and optical flows.
Our results demonstrate that the proposed method improves the Average Precision of the state-of-the-art method by 3.1 on the YouTube-VIS 2019 benchmark.
- Score: 3.0763099528432263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised instance segmentation has gained popularity because it
reduces the high annotation cost of pixel-level masks required for model training.
Recent approaches for weakly supervised instance segmentation detect and
segment objects using appearance information obtained from a static image.
However, this makes it challenging to identify objects with a
non-discriminative appearance. In this study, we address this problem by using
motion information from image sequences. We propose a two-stream encoder that
leverages appearance and motion features extracted from images and optical
flows. Additionally, we propose a novel pairwise loss that considers both
appearance and motion information to supervise segmentation. We conducted
extensive evaluations on the YouTube-VIS 2019 benchmark dataset. Our results
demonstrate that the proposed method improves the Average Precision of the
state-of-the-art method by 3.1.
Related papers
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and
Bootstrapped Self-training [13.985488693082981]
We propose a self-supervised object discovery approach that leverages motion and appearance information to produce high-quality object segmentation masks.
We demonstrate the effectiveness of our approach, named LOCATE, on multiple standard video object segmentation, image saliency detection, and object segmentation benchmarks.
arXiv Detail & Related papers (2023-08-22T07:27:09Z)
- Efficient Unsupervised Video Object Segmentation Network Based on Motion
Guidance [1.5736899098702974]
This paper proposes a video object segmentation network based on motion guidance.
The model comprises a dual-stream network, motion guidance module, and multi-scale progressive fusion module.
The experimental results prove the superior performance of the proposed method.
arXiv Detail & Related papers (2022-11-10T06:13:23Z)
- Motion-inductive Self-supervised Object Discovery in Videos [99.35664705038728]
We propose a model that processes consecutive RGB frames and infers the optical flow between any pair of frames using a layered representation.
We demonstrate superior performance over previous state-of-the-art methods on three public video segmentation datasets.
arXiv Detail & Related papers (2022-10-01T08:38:28Z)
- Segmenting Moving Objects via an Object-Centric Layered Representation [100.26138772664811]
We introduce an object-centric segmentation model with a depth-ordered layer representation.
We introduce a scalable pipeline for generating synthetic training data with multiple objects.
We evaluate the model on standard video segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T17:59:43Z)
- Weakly Supervised Video Salient Object Detection [79.51227350937721]
We present the first weakly supervised video salient object detection model based on relabeled "fixation guided scribble annotations".
An "Appearance-motion fusion module" and bidirectional ConvLSTM based framework are proposed to achieve effective multi-modal learning and long-term temporal context modeling.
arXiv Detail & Related papers (2021-04-06T09:48:38Z)
- Weakly Supervised Instance Segmentation for Videos with Temporal Mask
Consistency [28.352140544936198]
Weakly supervised instance segmentation reduces the cost of annotations required to train models.
We show that these issues can be better addressed by training with weakly labeled videos instead of images.
We are the first to explore the use of these video signals to tackle weakly supervised instance segmentation.
arXiv Detail & Related papers (2021-03-23T23:20:46Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static
Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
- Motion-Attentive Transition for Zero-Shot Video Object Segmentation [99.44383412488703]
We present a Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation.
An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder.
In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance.
arXiv Detail & Related papers (2020-03-09T16:58:42Z)