Deep Metric Learning for Open World Semantic Segmentation
- URL: http://arxiv.org/abs/2108.04562v1
- Date: Tue, 10 Aug 2021 10:15:57 GMT
- Title: Deep Metric Learning for Open World Semantic Segmentation
- Authors: Jun Cen, Peng Yun, Junhao Cai, Michael Yu Wang, Ming Liu
- Abstract summary: Close-set semantic segmentation networks have limited ability to detect out-of-distribution (OOD) objects.
We propose an open world semantic segmentation system that includes two modules.
We adopt the Deep Metric Learning Network (DMLNet) with contrastive clustering to implement open-set semantic segmentation.
- Score: 12.617115020561789
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Classical close-set semantic segmentation networks have limited ability to
detect out-of-distribution (OOD) objects, which is important for
safety-critical applications such as autonomous driving. Incrementally learning
these OOD objects with few annotations is an ideal way to enlarge the knowledge
base of the deep learning models. In this paper, we propose an open world
semantic segmentation system that includes two modules: (1) an open-set
semantic segmentation module to detect both in-distribution and OOD objects, and
(2) an incremental few-shot learning module to gradually incorporate those OOD
objects into its existing knowledge base. This open world semantic segmentation
system behaves like a human being, which is able to identify OOD objects and
gradually learn them with corresponding supervision. We adopt the Deep Metric
Learning Network (DMLNet) with contrastive clustering to implement open-set
semantic segmentation. Compared to other open-set semantic segmentation
methods, our DMLNet achieves state-of-the-art performance on three challenging
open-set semantic segmentation datasets without using additional data or
generative models. On this basis, two incremental few-shot learning methods are
further proposed to progressively improve the DMLNet with the annotations of
OOD objects.
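The abstract does not spell out how metric learning yields an open-set decision, so the following is only a minimal sketch of the general idea, not the authors' DMLNet. It assumes each pixel has an embedding vector, each known class has a prototype (cluster center) in the same space, and a pixel is flagged as OOD when its nearest prototype is farther than a threshold. The function names `classify_pixels` and `add_class_prototype`, the squared Euclidean distance, and the threshold value are all hypothetical choices for illustration; in particular, averaging a few annotated embeddings into a new prototype is one simple way to add a class and is not claimed to be either of the paper's two incremental few-shot methods.

```python
# Illustrative sketch only: nearest-prototype labelling with an OOD threshold.
# All names and values here are assumptions, not taken from the paper.

OOD_LABEL = -1  # hypothetical label used for out-of-distribution pixels


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_pixels(embeddings, prototypes, ood_threshold):
    """Label each embedding with the index of its nearest prototype,
    or OOD_LABEL if even the nearest prototype is beyond ood_threshold."""
    labels = []
    for emb in embeddings:
        dists = [squared_distance(emb, p) for p in prototypes]
        best = min(range(len(prototypes)), key=dists.__getitem__)
        labels.append(best if dists[best] <= ood_threshold else OOD_LABEL)
    return labels


def add_class_prototype(prototypes, annotated_embeddings):
    """Return a new prototype list with one extra class whose prototype is
    the mean of a few annotated embeddings (assumed few-shot update)."""
    n = len(annotated_embeddings)
    dim = len(annotated_embeddings[0])
    mean = [sum(e[i] for e in annotated_embeddings) / n for i in range(dim)]
    return prototypes + [mean]


if __name__ == "__main__":
    prototypes = [[0.0, 0.0], [4.0, 0.0]]          # two known classes
    pixels = [[0.1, -0.2], [3.8, 0.3], [0.0, 5.0], [0.2, 4.9]]

    # The last two pixels are far from both prototypes, so they are flagged OOD.
    print(classify_pixels(pixels, prototypes, ood_threshold=1.0))

    # After annotating those two pixels, they form a third class.
    prototypes = add_class_prototype(prototypes, pixels[2:])
    print(classify_pixels(pixels, prototypes, ood_threshold=1.0))
```

In this toy setup the first call labels the two distant pixels as OOD, and the second call assigns them to the newly added class while the labels of the known-class pixels stay the same.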
Related papers
- LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection [6.813145466843275]
LiSD is a voxel-based encoder-decoder framework that addresses both segmentation and detection tasks.
It achieves the state-of-the-art performance of 83.3% mIoU on the nuScenes segmentation benchmark for lidar-only methods.
arXiv Detail & Related papers (2024-06-11T07:26:54Z) - Neural Slot Interpreters: Grounding Object Semantics in Emergent Slot Representations [4.807052027638089]
We present the Neural Slot Interpreter (NSI) that learns to ground and generate object semantics via slot representations.
NSI is an XML-like programming language that uses simple syntax rules to organize the object semantics of a scene into object-centric program primitives.
arXiv Detail & Related papers (2024-02-02T12:37:23Z) - Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that can segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z) - De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z) - Open-world Semantic Segmentation for LIDAR Point Clouds [18.45831801175225]
We propose an open-world semantic segmentation task for LIDAR point clouds.
It aims to identify both old and novel classes using open-set semantic segmentation.
It also gradually incorporates novel objects into the existing knowledge base using incremental learning.
arXiv Detail & Related papers (2022-07-04T14:40:35Z) - Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlap with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z) - Triggering Failures: Out-Of-Distribution detection by learning from local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show it obtains top performance in both speed and accuracy when compared to ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.