Monocular Instance Motion Segmentation for Autonomous Driving: KITTI
InstanceMotSeg Dataset and Multi-task Baseline
- URL: http://arxiv.org/abs/2008.07008v4
- Date: Wed, 26 May 2021 15:12:49 GMT
- Title: Monocular Instance Motion Segmentation for Autonomous Driving: KITTI
InstanceMotSeg Dataset and Multi-task Baseline
- Authors: Eslam Mohamed, Mahmoud Ewaisha, Mennatullah Siam, Hazem Rashed,
Senthil Yogamani, Waleed Hamdy, Muhammad Helmi and Ahmad El-Sallab
- Abstract summary: Moving object segmentation is a crucial task for autonomous vehicles as it can be used to segment objects in a class agnostic manner.
Although pixel-wise motion segmentation has been studied in autonomous driving literature, it has been rarely addressed at the instance level.
We create a new InstanceMotSeg dataset comprising 12.9K samples, improving upon our KITTIMoSeg dataset.
- Score: 5.000331633798637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moving object segmentation is a crucial task for autonomous vehicles as it
can be used to segment objects in a class agnostic manner based on their motion
cues. It enables the detection of unseen objects during training (e.g., moose
or a construction truck) based on their motion and independent of their
appearance. Although pixel-wise motion segmentation has been studied in
autonomous driving literature, it has been rarely addressed at the instance
level, which would help separate connected segments of moving objects leading
to better trajectory planning. As the main issue is the lack of large public
datasets, we create a new InstanceMotSeg dataset comprising 12.9K samples,
improving upon our KITTIMoSeg dataset. In addition to providing instance-level
annotations, we have added four additional classes, which is crucial for studying
class agnostic motion segmentation. We adapt YOLACT and implement a
motion-based class agnostic instance segmentation model which would act as a
baseline for the dataset. We also extend it to an efficient multi-task model
which additionally provides semantic instance segmentation sharing the encoder.
The model then learns separate prototype coefficients within the class agnostic
and semantic heads providing two independent paths of object detection for
redundant safety. To obtain real-time performance, we study different efficient
encoders and obtain 39 fps on a Titan Xp GPU using MobileNetV2 with an
improvement of 10% mAP relative to the baseline. Our model improves the
previous state of the art motion segmentation method by 3.3%. The dataset and
qualitative results video are shared on our website at
https://sites.google.com/view/instancemotseg/.
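The multi-task design described above can be sketched as a shared encoder feeding a YOLACT-style prototype branch plus two independent coefficient heads. This is a minimal illustrative sketch, not the paper's implementation: the tiny convolutional encoder stands in for MobileNetV2, and all layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskInstanceSeg(nn.Module):
    """Sketch of the multi-task idea: one shared encoder, shared
    prototype masks, and two independent coefficient heads -- a
    class-agnostic motion head and a semantic head -- giving two
    independent detection paths for redundant safety."""

    def __init__(self, num_protos=8, num_classes=5, num_anchors=3):
        super().__init__()
        # Shared encoder (stand-in for MobileNetV2 in the paper).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Shared prototype masks (YOLACT-style "protonet").
        self.protonet = nn.Conv2d(32, num_protos, 3, padding=1)
        # Separate prototype coefficients per head, as in the paper.
        self.motion_coeffs = nn.Conv2d(32, num_anchors * num_protos, 1)
        self.semantic_coeffs = nn.Conv2d(32, num_anchors * num_protos, 1)
        self.semantic_cls = nn.Conv2d(32, num_anchors * num_classes, 1)

    def forward(self, x):
        feat = self.encoder(x)
        return {
            "prototypes": self.protonet(feat),
            "motion_coeffs": self.motion_coeffs(feat),
            "semantic_coeffs": self.semantic_coeffs(feat),
            "semantic_logits": self.semantic_cls(feat),
        }

model = MultiTaskInstanceSeg()
out = model(torch.randn(1, 3, 64, 64))
print(out["prototypes"].shape)  # torch.Size([1, 8, 16, 16])
```

At inference, each head would combine its own coefficients with the shared prototypes to assemble instance masks, so a failure in the semantic path does not disable class-agnostic motion detection.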
Related papers
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin in both single and multi-object benchmarks.
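The second model's idea of using flow as a segmentation prompt can be sketched independently of SAM itself: threshold the flow magnitude and derive a seed point for a promptable segmenter. The threshold and centroid heuristic are assumptions for illustration, not the paper's method.

```python
import numpy as np

def flow_point_prompt(flow, mag_thresh=1.0):
    """Derive a point prompt for a promptable segmenter (e.g. SAM)
    from a dense optical-flow field.
    flow: (H, W, 2) array of per-pixel (dx, dy) motion.
    Returns an (x, y) seed point, or None if nothing moves."""
    mag = np.linalg.norm(flow, axis=-1)   # per-pixel flow magnitude
    moving = mag > mag_thresh             # binary moving-pixel mask
    if not moving.any():
        return None
    ys, xs = np.nonzero(moving)
    return (float(xs.mean()), float(ys.mean()))  # centroid as seed

# Toy flow: one moving patch in an otherwise static frame.
flow = np.zeros((32, 32, 2))
flow[10:20, 5:15] = (3.0, 0.0)
print(flow_point_prompt(flow))  # (9.5, 14.5)
```

In the paper's setup the resulting point would be passed to the segmenter as a prompt; here it simply demonstrates how motion cues can localize an object without appearance information.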
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
- OMG-Seg: Is One Model Good Enough For All Segmentation? [83.17068644513144]
OMG-Seg is a transformer-based encoder-decoder architecture with task-specific queries and outputs.
We show that OMG-Seg can support over ten distinct segmentation tasks and yet significantly reduce computational and parameter overhead.
arXiv Detail & Related papers (2024-01-18T18:59:34Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters.
The network predictions yield a large scale of 3D objects with pseudo labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- Segmenting Moving Objects via an Object-Centric Layered Representation [100.26138772664811]
We introduce an object-centric segmentation model with a depth-ordered layer representation.
We introduce a scalable pipeline for generating synthetic training data with multiple objects.
We evaluate the model on standard video segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T17:59:43Z)
- Video Class Agnostic Segmentation Benchmark for Autonomous Driving [13.312978643938202]
In certain safety-critical robotics applications, it is important to segment all objects, including those unknown at training time.
We formalize the task of video class agnostic segmentation from monocular video sequences in autonomous driving to account for unknown objects.
arXiv Detail & Related papers (2021-03-19T20:41:40Z)
- INSTA-YOLO: Real-Time Instance Segmentation [2.726684740197893]
We propose Insta-YOLO, a novel one-stage end-to-end deep learning model for real-time instance segmentation.
The proposed model is inspired by the YOLO one-shot object detector, with the box regression loss replaced by regression in the localization head.
We evaluate our model on three datasets, namely, Carvana, Cityscapes and Airbus.
arXiv Detail & Related papers (2021-02-12T21:17:29Z)
- Self-supervised Sparse to Dense Motion Segmentation [13.888344214818737]
We propose a self supervised method to learn the densification of sparse motion segmentations from single video frames.
We evaluate our method on the well-known motion segmentation datasets FBMS59 and DAVIS16.
arXiv Detail & Related papers (2020-08-18T11:40:18Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.