Video Class Agnostic Segmentation with Contrastive Learning for
Autonomous Driving
- URL: http://arxiv.org/abs/2105.03533v2
- Date: Tue, 11 May 2021 02:40:17 GMT
- Title: Video Class Agnostic Segmentation with Contrastive Learning for
Autonomous Driving
- Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand
- Abstract summary: We propose a novel auxiliary contrastive loss to learn the segmentation of known classes and unknown objects.
Unlike previous work in contrastive learning that samples the anchor, positive and negative examples on an image level, our contrastive learning method leverages pixel-wise semantic and temporal guidance.
We release a large-scale synthetic dataset for different autonomous driving scenarios that includes distinct and rare unknown objects.
- Score: 13.312978643938202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation in autonomous driving predominantly focuses on learning
from large-scale data with a closed set of known classes without considering
unknown objects. Motivated by safety reasons, we address the video class
agnostic segmentation task, which considers unknown objects outside the closed
set of known classes in our training data. We propose a novel auxiliary
contrastive loss to learn the segmentation of known classes and unknown
objects. Unlike previous work in contrastive learning that samples the anchor,
positive and negative examples on an image level, our contrastive learning
method leverages pixel-wise semantic and temporal guidance. We conduct
experiments on Cityscapes-VPS by withholding four classes from training and
show an improvement for both known and unknown object segmentation with the
auxiliary contrastive loss. We further release a large-scale synthetic
dataset for different autonomous driving scenarios that includes distinct and
rare unknown objects. We conduct experiments on the full synthetic dataset and
a reduced small-scale version, and show that contrastive learning is more
effective on small-scale datasets. Our proposed models, dataset, and code will
be released at https://github.com/MSiam/video_class_agnostic_segmentation.
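To make the idea concrete, below is a minimal sketch of a pixel-wise contrastive loss with semantic guidance: anchors and positives are pixels of the same class (optionally gathered across temporally adjacent frames, which is where the temporal guidance would enter), and pixels of other classes act as negatives. The tensor shapes, sampling scheme, and temperature are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def pixel_contrastive_loss(feats, labels, num_anchors=256, tau=0.1):
    # feats: (N, C) pixel embeddings, e.g. gathered from the current frame and
    # a temporally adjacent one; labels: (N,) semantic class id per pixel.
    feats = F.normalize(feats, dim=1)
    idx = torch.randperm(feats.size(0))[:num_anchors]   # sampled anchor pixels
    anchors, anchor_lbl = feats[idx], labels[idx]
    sim = anchors @ feats.t() / tau                     # (A, N) cosine sim / tau
    pos = anchor_lbl[:, None] == labels[None, :]        # same class -> positive
    pos[torch.arange(idx.numel()), idx] = False         # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    loss = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return loss[pos.any(1)].mean()                      # skip anchors w/o positives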
Related papers
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z)
Simplifying Open-Set Video Domain Adaptation with Contrastive Learning [16.72734794723157]
Unsupervised video domain adaptation methods have been proposed to adapt a predictive model from a labelled dataset to an unlabelled dataset.
We address a more realistic scenario, called open-set unsupervised video domain adaptation (OUVDA), where the target dataset contains "unknown" semantic categories that are not shared with the source.
We propose a video-oriented temporal contrastive loss that enables our method to better cluster the feature space by exploiting the freely available temporal information in video data.
arXiv Detail & Related papers (2023-01-09T13:16:50Z)
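One plausible instantiation of such a temporal contrastive loss, sketched below under assumed names: two clips sampled from the same video at different timestamps form a positive pair, and all other clips in the batch serve as negatives.

import torch
import torch.nn.functional as F

def temporal_nce(z_t0, z_t1, tau=0.07):
    # z_t0, z_t1: (B, D) embeddings of two clips per video, taken at
    # different timestamps; the diagonal of the logits holds the positives.
    z_t0 = F.normalize(z_t0, dim=1)
    z_t1 = F.normalize(z_t1, dim=1)
    logits = z_t0 @ z_t1.t() / tau
    targets = torch.arange(z_t0.size(0), device=z_t0.device)
    return F.cross_entropy(logits, targets)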
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge [86.46204148650328]
Holistic segmentation aims to identify and separate objects of unseen, unknown categories into instances without any prior knowledge about them.
We tackle this new problem with U3HS, which finds unknowns as highly uncertain regions and clusters their corresponding instance-aware embeddings into individual objects.
Experiments on public data from MS, Cityscapes, and Lost&Found demonstrate the effectiveness of U3HS.
arXiv Detail & Related papers (2022-09-12T16:59:36Z)
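The U3HS recipe above suggests a simple two-step sketch: flag high-entropy pixels as unknown, then cluster their embeddings into instances. The entropy threshold and the use of DBSCAN below are stand-ins assumed for illustration, not the paper's actual components.

import numpy as np
from sklearn.cluster import DBSCAN

def unknown_instances(probs, embed, entropy_thresh=1.0, eps=0.5):
    # probs: (K, H, W) softmax over known classes; embed: (D, H, W) embeddings.
    entropy = -(probs * np.log(probs + 1e-8)).sum(0)        # (H, W) uncertainty
    unk = entropy > entropy_thresh                          # uncertain -> unknown
    if not unk.any():
        return np.zeros(unk.shape, dtype=int)
    pts = embed[:, unk].T                                   # (N, D) unknown pixels
    ids = DBSCAN(eps=eps, min_samples=20).fit_predict(pts)
    out = np.zeros(unk.shape, dtype=int)
    out[unk] = ids + 1          # instances 1..K; DBSCAN noise (-1) maps to 0
    return out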
Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z)
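The abstract leaves the transformation unspecified, so the sketch below pairs each cloud with a randomly rotated and jittered copy as an assumed stand-in; the two views would then be scored with a standard InfoNCE objective like the temporal_nce sketch above.

import math
import torch

def augment_cloud(pts, sigma=0.01):
    # pts: (N, 3) point cloud. Random rotation about z plus Gaussian jitter --
    # an assumed augmentation; the paper's own transformation may differ.
    theta = torch.rand(()).item() * 2 * math.pi
    c, s = math.cos(theta), math.sin(theta)
    rot = torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ rot.T + sigma * torch.randn_like(pts)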
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
CoCon: Cooperative-Contrastive Learning [52.342936645996765]
Self-supervised visual representation learning is key for efficient video analysis.
Recent success in learning image representations suggests contrastive learning is a promising framework to tackle this challenge.
We introduce a cooperative variant of contrastive learning to utilize complementary information across views.
arXiv Detail & Related papers (2021-04-30T05:46:02Z)
Video Class Agnostic Segmentation Benchmark for Autonomous Driving [13.312978643938202]
In certain safety-critical robotics applications, it is important to segment all objects, including those unknown at training time.
We formalize the task of video class agnostic segmentation from monocular video sequences in autonomous driving to account for unknown objects.
arXiv Detail & Related papers (2021-03-19T20:41:40Z)
Fool Me Once: Robust Selective Segmentation via Out-of-Distribution Detection with Contrastive Learning [27.705683228657175]
We train a network to simultaneously perform segmentation and pixel-wise Out-of-Distribution (OoD) detection.
This is made possible by leveraging an OoD dataset with a novel contrastive objective and data augmentation scheme.
We show that by selectively segmenting scenes based on what is predicted as OoD, we can increase segmentation accuracy by 0.2 IoU relative to alternative techniques.
arXiv Detail & Related papers (2021-03-01T09:38:40Z)
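A minimal sketch of the selective-segmentation step described above: keep a pixel's predicted class only where the OoD score is low, and abstain elsewhere. The threshold and the 255 ignore value are conventions assumed here, not taken from the paper.

import torch

def selective_segment(logits, ood_score, thresh=0.5, ignore=255):
    # logits: (K, H, W) class scores; ood_score: (H, W) in [0, 1].
    pred = logits.argmax(0)               # standard per-pixel prediction
    pred[ood_score > thresh] = ignore     # abstain on likely-OoD pixels
    return pred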
"What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences [27.915309216800125]
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator.
We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge.
Our method depends neither on visual registration of a kinematic robot model or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
arXiv Detail & Related papers (2020-11-06T10:55:28Z)