Treating Motion as Option to Reduce Motion Dependency in Unsupervised
Video Object Segmentation
- URL: http://arxiv.org/abs/2209.03138v1
- Date: Sun, 4 Sep 2022 18:05:52 GMT
- Title: Treating Motion as Option to Reduce Motion Dependency in Unsupervised
Video Object Segmentation
- Authors: Suhwan Cho, Minhyeok Lee, Seunghoon Lee, Chaewon Park, Donghyeong Kim,
Sangyoun Lee
- Abstract summary: Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video sequence at the pixel level.
Most state-of-the-art methods leverage motion cues obtained from optical flow maps in addition to appearance cues to exploit the property that salient objects usually have distinctive movements compared to the background.
- Score: 5.231219025536678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised video object segmentation (VOS) aims to detect the most salient
object in a video sequence at the pixel level. In unsupervised VOS, most
state-of-the-art methods leverage motion cues obtained from optical flow maps
in addition to appearance cues to exploit the property that salient objects
usually have distinctive movements compared to the background. However, as they
are overly dependent on motion cues, which may be unreliable in some cases,
they cannot achieve stable prediction. To reduce this motion dependency of
existing two-stream VOS methods, we propose a novel motion-as-option network
that optionally utilizes motion cues. Additionally, to fully exploit the
property of the proposed network that motion is not always required, we
introduce a collaborative network learning strategy. On all the public
benchmark datasets, our proposed network affords state-of-the-art performance
with real-time inference speed.
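
To make the idea concrete, below is a minimal PyTorch sketch of a motion-as-option style network. This is not the authors' implementation: the module names, feature sizes, the surrogate-feature trick, and the random flow-dropping used to stand in for the collaborative learning strategy are all illustrative assumptions. The point it shows is that a single network can consume flow features when a flow map is given and fall back to appearance-only features when it is not.

```python
import torch
import torch.nn as nn

class MotionAsOptionNet(nn.Module):
    """Toy sketch: a segmentation head that treats optical flow as optional."""

    def __init__(self, feat_dim=64):
        super().__init__()
        self.rgb_encoder = nn.Sequential(nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU())
        # Flow maps are assumed to be rendered as 3-channel images here.
        self.flow_encoder = nn.Sequential(nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU())
        # Learned stand-in for motion features when no flow map is provided.
        self.flow_surrogate = nn.Conv2d(feat_dim, feat_dim, 1)
        self.decoder = nn.Conv2d(feat_dim, 1, 1)  # per-pixel saliency logits

    def forward(self, rgb, flow=None):
        app = self.rgb_encoder(rgb)
        # Motion as option: real flow features if available, surrogate otherwise.
        mot = self.flow_encoder(flow) if flow is not None else self.flow_surrogate(app)
        return self.decoder(app + mot)

# Synthetic batch standing in for a real VOS dataset (frame, flow map, mask).
rgb = torch.randn(2, 3, 64, 64)
flow = torch.randn(2, 3, 64, 64)
mask = torch.randint(0, 2, (2, 1, 64, 64)).float()

model = MotionAsOptionNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

for step in range(4):
    use_flow = torch.rand(1).item() < 0.5  # randomly hide motion during training
    logits = model(rgb, flow if use_flow else None)
    loss = criterion(logits, mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

At inference, flow can then be supplied only when the optical flow estimate looks trustworthy, which is one plausible way such a network reduces motion dependency.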
Related papers
- Traffic Video Object Detection using Motion Prior [16.63738085066699]
We propose two innovative methods to exploit the motion prior and boost the performance of traffic video object detection.
Firstly, we introduce a new self-attention module that leverages the motion prior to guide temporal information integration.
Secondly, we utilise a pseudo-labelling mechanism to eliminate noisy pseudo labels for the semi-supervised setting.
arXiv Detail & Related papers (2023-11-16T18:59:46Z)
- Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation [17.71871884366252]
Video object segmentation (VOS) aims to detect the most salient object in a video without external guidance about the object.
Recent methods collaboratively use motion cues extracted from optical flow maps with appearance cues extracted from RGB images.
We propose a novel motion-as-option network by treating motion cues as optional.
arXiv Detail & Related papers (2023-09-26T09:34:13Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- InstMove: Instance Motion for Object-centric Video Segmentation [70.16915119724757]
In this work, we study instance-level motion and present InstMove, which stands for Instance Motion for Object-centric Video Segmentation.
In comparison to pixel-wise motion, InstMove mainly relies on instance-level motion information that is free from image feature embeddings.
With only a few lines of code, InstMove can be integrated into current SOTA methods for three different video segmentation tasks.
arXiv Detail & Related papers (2023-03-14T17:58:44Z)
- Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns [92.80981308407098]
We propose a new approach to learn to segment multiple image objects without manual supervision.
The method can extract objects from still images, but uses videos for supervision.
We show state-of-the-art unsupervised object segmentation performance on simulated and real-world benchmarks.
arXiv Detail & Related papers (2022-10-21T17:57:05Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
- Deep Motion Prior for Weakly-Supervised Temporal Action Localization [35.25323276744999]
Weakly-Supervised Temporal Action localization (WSTAL) aims to localize actions in untrimmed videos with only video-level labels.
Currently, most state-of-the-art WSTAL methods follow a Multi-Instance Learning (MIL) pipeline.
We argue that existing methods have overlooked two important drawbacks: 1) inadequate use of motion information and 2) the incompatibility of prevailing cross-entropy training loss.
arXiv Detail & Related papers (2021-08-12T08:51:36Z)
- Full-Duplex Strategy for Video Object Segmentation [141.43983376262815]
The full-duplex strategy network (FSNet) is a novel framework for video object segmentation (VOS).
Our FSNet performs cross-modal feature passing (i.e., transmission and receiving) simultaneously before the fusion and decoding stage.
We show that our FSNet outperforms other state-of-the-art methods on both the VOS and video salient object detection tasks.
arXiv Detail & Related papers (2021-08-06T14:50:50Z)
- Motion-Attentive Transition for Zero-Shot Video Object Segmentation [99.44383412488703]
We present a Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation.
An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder.
In this way, the encoder becomes deeply interleaved, allowing for close hierarchical interactions between object motion and appearance (a hedged sketch of this asymmetric attention idea appears below).
arXiv Detail & Related papers (2020-03-09T16:58:42Z)
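
As referenced in the MATNet entry above, here is a minimal sketch of an asymmetric motion-to-appearance attention block in the spirit of the Motion-Attentive Transition idea. It is an illustration under assumptions: the 1x1-conv attention form, the residual connection, and the channel sizes are not taken from MATNet's actual design.

```python
import torch
import torch.nn as nn

class MotionAttentiveTransition(nn.Module):
    """Toy asymmetric block: motion features gate where appearance attends."""

    def __init__(self, channels=64):
        super().__init__()
        self.to_attention = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, appearance, motion):
        # Motion stream yields a spatial map highlighting likely moving regions.
        attention = torch.sigmoid(self.to_attention(motion))  # (B, 1, H, W)
        # Appearance features are re-weighted by it; the residual keeps static detail.
        return appearance + appearance * attention

# Toy usage with two-stream encoder outputs of matching shape.
appearance = torch.randn(2, 64, 32, 32)  # from the RGB stream
motion = torch.randn(2, 64, 32, 32)      # from the optical flow stream
block = MotionAttentiveTransition(channels=64)
out = block(appearance, motion)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

The asymmetry is that motion modulates appearance but not the other way around, which matches the summary's description of motion guiding appearance within the two-stream encoder.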