Treating Motion as Option with Output Selection for Unsupervised Video
Object Segmentation
- URL: http://arxiv.org/abs/2309.14786v1
- Date: Tue, 26 Sep 2023 09:34:13 GMT
- Title: Treating Motion as Option with Output Selection for Unsupervised Video
Object Segmentation
- Authors: Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Sangyoun Lee
- Abstract summary: Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video without external guidance about the object.
Recent methods collaboratively use motion cues extracted from optical flow maps with appearance cues extracted from RGB images.
We propose a novel motion-as-option network by treating motion cues as optional.
- Score: 17.71871884366252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised video object segmentation (VOS) is a task that aims to detect
the most salient object in a video without external guidance about the object.
To leverage the property that salient objects usually have distinctive
movements compared to the background, recent methods collaboratively use motion
cues extracted from optical flow maps with appearance cues extracted from RGB
images. However, because optical flow maps are usually closely related to
segmentation masks, the network tends to become overly dependent on the motion
cues during training. As a result, such two-stream approaches are vulnerable
to confusing motion cues, making their predictions unstable. To relieve this
issue, we design a novel motion-as-option network by treating motion cues as
optional. During network training, RGB images are randomly provided to the
motion encoder instead of optical flow maps, to implicitly reduce motion
dependency of the network. As the learned motion encoder can deal with both RGB
images and optical flow maps, two different predictions can be generated
depending on which source is used as the motion input. To fully exploit this
property, we also propose an adaptive output selection algorithm that adopts
the optimal prediction result at test time. Our proposed approach achieves
state-of-the-art performance on all public benchmark datasets while
maintaining real-time inference speed.
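The two ideas in the abstract, randomly swapping the motion encoder's input during training and choosing between two predictions at test time, can be illustrated with a minimal sketch. The abstract does not state the swap probability or the selection criterion, so `rgb_prob` and `confidence` below are hypothetical stand-ins chosen for illustration, not the authors' method.

```python
import random
from typing import List, Sequence

Mask = List[float]  # flattened per-pixel foreground probabilities


def pick_motion_input(rgb, flow, rgb_prob: float, rng: random.Random):
    """Training-time choice for the motion encoder input.

    With probability rgb_prob (an assumed hyperparameter) the RGB image is
    returned in place of the optical flow map.
    """
    return rgb if rng.random() < rgb_prob else flow


def confidence(mask: Sequence[float]) -> float:
    """Hypothetical score: mean distance of each probability from 0.5,
    scaled to [0, 1]. The paper's actual criterion is not given here."""
    return sum(abs(p - 0.5) for p in mask) * 2.0 / len(mask)


def select_output(pred_with_flow: Mask, pred_with_rgb: Mask) -> Mask:
    """Test-time selection: keep whichever prediction scores higher
    under the assumed confidence() measure."""
    if confidence(pred_with_flow) >= confidence(pred_with_rgb):
        return pred_with_flow
    return pred_with_rgb
```

In this sketch the network would be run twice per frame at test time, once with the optical flow map and once with the RGB image as motion input, and `select_output` would pick one of the two resulting masks.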
Related papers
- Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z)
- Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow.
arXiv Detail & Related papers (2023-04-28T23:43:10Z)
- Adaptive Multi-source Predictor for Zero-shot Video Object Segmentation [68.56443382421878]
We propose a novel adaptive multi-source predictor for zero-shot video object segmentation (ZVOS).
In the static object predictor, the RGB source is converted to depth and static saliency sources, simultaneously.
Experiments show that the proposed model outperforms the state-of-the-art methods on three challenging ZVOS benchmarks.
arXiv Detail & Related papers (2023-03-18T10:19:29Z)
- Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation [5.231219025536678]
Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video sequence at the pixel level.
Most state-of-the-art methods leverage motion cues obtained from optical flow maps in addition to appearance cues to exploit the property that salient objects usually have distinctive movements compared to the background.
arXiv Detail & Related papers (2022-09-04T18:05:52Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Motion-Attentive Transition for Zero-Shot Video Object Segmentation [99.44383412488703]
We present a Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation.
An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder.
In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance.
arXiv Detail & Related papers (2020-03-09T16:58:42Z)