Video Class Agnostic Segmentation Benchmark for Autonomous Driving
- URL: http://arxiv.org/abs/2103.11015v1
- Date: Fri, 19 Mar 2021 20:41:40 GMT
- Title: Video Class Agnostic Segmentation Benchmark for Autonomous Driving
- Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand
- Abstract summary: In certain safety-critical robotics applications, it is important to segment all objects, including those unknown at training time.
We formalize the task of video class agnostic segmentation from monocular video sequences in autonomous driving to account for unknown objects.
- Score: 13.312978643938202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic segmentation approaches are typically trained on large-scale data
with a closed finite set of known classes without considering unknown objects.
In certain safety-critical robotics applications, especially autonomous
driving, it is important to segment all objects, including those unknown at
training time. We formalize the task of video class agnostic segmentation from
monocular video sequences in autonomous driving to account for unknown objects.
Video class agnostic segmentation can be formulated as an open-set or a motion
segmentation problem. We discuss both formulations and provide datasets and
benchmark different baseline approaches for both tracks. In the
motion-segmentation track we benchmark real-time joint panoptic and motion
instance segmentation, and evaluate the effect of ego-flow suppression. In the
open-set segmentation track we evaluate baseline methods that combine
appearance and geometry to learn prototypes per semantic class. We then
compare them to a model that uses an auxiliary contrastive loss to improve the
discrimination between known and unknown objects. All datasets and models are
publicly released at https://msiam.github.io/vca/.
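To make the ego-flow suppression evaluated in the motion-segmentation track concrete: the flow that camera ego-motion alone would induce is predicted from per-pixel depth and the camera pose change, then subtracted from the observed optical flow, so the residual highlights independently moving objects. The NumPy sketch below is a minimal geometric version under assumed inputs (depth, intrinsics, ego-pose) and an illustrative threshold; it is not the benchmark's implementation.

```python
import numpy as np

def ego_flow_suppression(flow, depth, K, R, t, tau=1.5):
    """Subtract the optical flow induced by camera ego-motion (minimal sketch).

    flow  : (H, W, 2) observed optical flow in pixels
    depth : (H, W)    per-pixel depth in metres (assumed positive)
    K     : (3, 3)    camera intrinsics
    R, t  : (3, 3), (3,) camera rotation and translation between the frames
    tau   : illustrative residual-magnitude threshold (pixels)
    """
    H, W = depth.shape
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Back-project every pixel to a 3D point using its depth.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    P = np.stack([X, Y, depth], axis=-1)             # (H, W, 3)

    # Rigidly transform by the ego-motion and re-project.
    P2 = P @ R.T + t                                 # (H, W, 3)
    u2 = fx * P2[..., 0] / P2[..., 2] + cx
    v2 = fy * P2[..., 1] / P2[..., 2] + cy
    ego_flow = np.stack([u2 - u, v2 - v], axis=-1)   # flow explained by ego-motion alone

    # Residual flow: what ego-motion cannot explain.
    residual = flow - ego_flow
    motion_mask = np.linalg.norm(residual, axis=-1) > tau
    return residual, motion_mask
```

On a perfectly static scene the residual is ideally zero everywhere, so any pixel whose residual exceeds the threshold is a motion candidate; in practice depth and pose noise make the threshold scene-dependent.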
Related papers
- Lidar Panoptic Segmentation in an Open World [50.094491113541046]
Lidar Panoptic Segmentation (LPS) is crucial for the safe deployment of autonomous vehicles.
LPS aims to recognize and segment lidar points w.r.t. a pre-defined vocabulary of semantic classes.
We propose class-agnostic point clustering that over-segments the input cloud in a hierarchical fashion, followed by binary point-segment classification.
arXiv Detail & Related papers (2024-09-22T00:10:20Z)
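To make the hierarchical over-segmentation idea concrete, here is a toy NumPy stand-in that groups lidar points by voxel occupancy at coarse-to-fine resolutions. The paper's clustering is purpose-built for lidar geometry and is followed by a learned binary classifier that accepts or rejects each segment; both are omitted here, and the voxel sizes are illustrative assumptions.

```python
import numpy as np

def hierarchical_oversegmentation(points, voxel_sizes=(2.0, 0.5)):
    """Over-segment a point cloud at several spatial resolutions (toy sketch).

    points      : (N, 3) x/y/z coordinates in metres
    voxel_sizes : coarse-to-fine grid resolutions in metres (illustrative)
    Returns one (N,) segment-id array per resolution level.
    """
    levels = []
    for size in voxel_sizes:
        # Quantise each point to a voxel; points sharing a voxel share a segment.
        keys = np.floor(points / size).astype(np.int64)
        _, seg_ids = np.unique(keys, axis=0, return_inverse=True)
        levels.append(seg_ids)
    return levels
```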
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Tracking Anything with Decoupled Video Segmentation [87.07258378407289]
We develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation.
We show that this decoupled formulation compares favorably to end-to-end approaches in several data-scarce tasks.
arXiv Detail & Related papers (2023-09-07T17:59:41Z)
- Segment Anything Meets Point Tracking [116.44931239508578]
This paper presents a novel method for point-centric interactive video segmentation, empowered by SAM and long-term point tracking.
We highlight the merits of point-based tracking through direct evaluation on the zero-shot open-world Unidentified Video Objects (UVO) benchmark.
Our experiments on popular video object segmentation and multi-object segmentation tracking benchmarks, including DAVIS, YouTube-VOS, and BDD100K, suggest that a point-based segmentation tracker yields better zero-shot performance and efficient interactions.
arXiv Detail & Related papers (2023-07-03T17:58:01Z)
- Video Class Agnostic Segmentation with Contrastive Learning for Autonomous Driving [13.312978643938202]
We propose a novel auxiliary contrastive loss to learn the segmentation of known classes and unknown objects.
Unlike previous work in contrastive learning that samples the anchor, positive and negative examples on an image level, our contrastive learning method leverages pixel-wise semantic and temporal guidance.
We release a large-scale synthetic dataset for different autonomous driving scenarios that includes distinct and rare unknown objects.
arXiv Detail & Related papers (2021-05-07T23:07:06Z)
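The pixel-wise contrastive idea can be sketched as an InfoNCE-style loss over sampled pixel embeddings, where pixels sharing a semantic label act as positives. This is a simplification under assumed interfaces: the paper's sampling additionally uses temporal guidance across frames, which the sketch omits.

```python
import numpy as np

def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style loss over pixel embeddings (simplified sketch).

    embeddings : (N, D) features for N sampled pixels
    labels     : (N,)   semantic label per pixel; equal labels form positive pairs
    """
    # L2-normalise so dot products are cosine similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)              # a pixel is never its own pair

    # Row-wise log-softmax: log p(pixel j is the match for anchor i).
    sim -= sim.max(axis=1, keepdims=True)       # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))

    pos = labels[:, None] == labels[None, :]
    np.fill_diagonal(pos, False)

    # Average log-likelihood of all positives, for anchors that have any.
    has_pos = pos.any(axis=1)
    masked = np.where(pos[has_pos], log_prob[has_pos], 0.0)
    loss = -masked.sum(axis=1) / pos[has_pos].sum(axis=1)
    return loss.mean()
```

Pulling same-label pixels together and pushing other labels apart tightens the known-class clusters in embedding space, which is what the auxiliary loss exploits to better separate unknown objects.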
- SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation [111.61261419566908]
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes and are therefore ill-equipped to handle previously unseen objects.
Detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z)
- Highway Driving Dataset for Semantic Video Segmentation [31.198877342304876]
We introduce the Highway Driving dataset, a semantic video dataset that serves as a benchmark for semantic video segmentation.
We propose a baseline algorithm that exploits the temporal correlation between frames.
Together with our analysis of this temporal correlation, we expect the Highway Driving dataset to encourage research on semantic video segmentation.
arXiv Detail & Related papers (2020-11-02T01:50:52Z)
- Self-supervised Sparse to Dense Motion Segmentation [13.888344214818737]
We propose a self-supervised method to learn the densification of sparse motion segmentations from single video frames.
We evaluate our method on the well-known motion segmentation datasets FBMS59 and DAVIS16.
arXiv Detail & Related papers (2020-08-18T11:40:18Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
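DyStaB's first stage partitions the motion field so that the segments' motions are statistically independent, i.e. the mutual information between them is minimal. As a coarse, non-differentiable proxy for that criterion, one can summarise each region's motion per frame over a clip and estimate the MI between the two summaries with histograms: a good object/background split should score low. Everything below is an illustrative assumption, not the paper's learned, differentiable formulation.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X; Y) for two 1-D sample sequences."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)          # marginal of X
    py = pxy.sum(axis=0, keepdims=True)          # marginal of Y
    nz = pxy > 0                                  # avoid log(0)
    return (pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum()

def segment_motion_mi(flows, mask):
    """MI between the mean motions of the two regions of a candidate mask,
    sampled over a clip (coarse proxy for DyStaB's partition criterion).

    flows : (T, H, W, 2) optical flow for T frames
    mask  : (H, W)       boolean candidate segmentation
    """
    # One motion summary per frame and per region: mean flow magnitude.
    mag = np.linalg.norm(flows, axis=-1)          # (T, H, W)
    fg = mag[:, mask].mean(axis=1)                # (T,)
    bg = mag[:, ~mask].mean(axis=1)               # (T,)
    # Independent foreground/background motion => low mutual information.
    return mutual_information(fg, bg)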
- Monocular Instance Motion Segmentation for Autonomous Driving: KITTI InstanceMotSeg Dataset and Multi-task Baseline [5.000331633798637]
Moving object segmentation is a crucial task for autonomous vehicles as it can be used to segment objects in a class agnostic manner.
Although pixel-wise motion segmentation has been studied in autonomous driving literature, it has been rarely addressed at the instance level.
We create a new InstanceMotSeg dataset comprising 12.9K samples, improving upon our KITTIMoSeg dataset.
arXiv Detail & Related papers (2020-08-16T21:47:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.