Weakly supervised training of universal visual concepts for multi-domain semantic segmentation
- URL: http://arxiv.org/abs/2212.10340v3
- Date: Tue, 12 Mar 2024 09:53:46 GMT
- Title: Weakly supervised training of universal visual concepts for multi-domain semantic segmentation
- Authors: Petra Bevandić, Marin Oršić, Ivan Grubišić, Josip Šarić, Siniša Šegvić
- Abstract summary: Deep supervised models have an unprecedented capacity to absorb large quantities of training data.
Different datasets often have incompatible labels. We consider labels as unions of universal visual concepts.
Our method achieves competitive within-dataset and cross-dataset generalization, as well as the ability to learn visual concepts that are not separately labeled in any of the training datasets.
- Score: 1.772589329365753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep supervised models have an unprecedented capacity to absorb large
quantities of training data. Hence, training on multiple datasets becomes a
method of choice towards strong generalization in usual scenes and graceful
performance degradation in edge cases. Unfortunately, different datasets often
have incompatible labels. For instance, the Cityscapes road class subsumes all
driving surfaces, while Vistas defines separate classes for road markings,
manholes etc. Furthermore, many datasets have overlapping labels. For instance,
pickups are labeled as trucks in VIPER, cars in Vistas, and vans in ADE20k. We
address this challenge by considering labels as unions of universal visual
concepts. This allows seamless and principled learning on multi-domain dataset
collections without requiring any relabeling effort. Our method achieves
competitive within-dataset and cross-dataset generalization, as well as the ability
to learn visual concepts which are not separately labeled in any of the
training datasets. Experiments reveal competitive or state-of-the-art
performance on two multi-domain dataset collections and on the WildDash 2
benchmark.
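The abstract states the mechanism precisely enough to illustrate: each dataset-specific label is read as a union of universal visual concepts, and training maximizes the probability of that union. Below is a minimal PyTorch sketch of such a partial-label loss; the concept mapping, names, and shapes are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

# Hypothetical mapping from one dataset's labels to universal concept ids.
# E.g. the Cityscapes "road" label subsumes several finer universal
# concepts (plain road surface, road markings, manholes, ...).
label_to_concepts = {
    0: [0, 1, 2],   # road -> {road surface, marking, manhole}
    1: [3],         # sidewalk -> {sidewalk}
    2: [4, 5],      # truck -> {truck, pickup}
}

def partial_label_nll(logits, target):
    """NLL of the union of universal concepts behind each dataset label.

    logits: (N, C, H, W) scores over C universal concepts.
    target: (N, H, W) dataset-specific labels.
    """
    probs = F.softmax(logits, dim=1)        # per-pixel concept posteriors
    loss = torch.zeros((), dtype=logits.dtype)
    for label, concepts in label_to_concepts.items():
        mask = target == label              # pixels carrying this label
        if mask.any():
            # Concepts are mutually exclusive under the softmax, so the
            # probability of the union is the sum over its members.
            p_union = probs[:, concepts].sum(dim=1)[mask]
            loss = loss - torch.log(p_union.clamp_min(1e-8)).sum()
    return loss / target.numel()

# Toy usage: 6 universal concepts, 3 dataset labels.
logits = torch.randn(1, 6, 4, 4, requires_grad=True)
target = torch.randint(0, 3, (1, 4, 4))
partial_label_nll(logits, target).backward()
```

Since the union probability is a plain sum of member probabilities, gradients still reach the fine concepts (e.g. road markings) even when only a coarse label (road) is available, which is how concepts never labeled in isolation can still be learned.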
Related papers
- Reliable Representations Learning for Incomplete Multi-View Partial Multi-Label Classification [78.15629210659516]
In this paper, we propose an incomplete multi-view partial multi-label classification network named RANK.
We move beyond the fixed view-level weights of existing methods and propose a quality-aware sub-network that dynamically assigns a quality score to each view of each sample.
Our model not only handles complete multi-view multi-label datasets, but also works on datasets with missing instances and labels.
arXiv Detail & Related papers (2023-03-30T03:09:25Z)
- Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets [17.503843467554592]
We propose a principled approach that supports learning from heterogeneous datasets with different label sets.
Our idea is to utilize a pre-trained language model to embed discrete labels into a continuous latent space with the help of their label names.
Our model outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2023-03-19T06:14:22Z)
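The label-name idea in the entry above can be sketched briefly: encode every dataset's label names with a pretrained language model and classify features by similarity in that shared space. The BERT checkpoint and mean pooling below are assumptions for illustration, not necessarily the paper's choices.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed_label_names(names):
    batch = tok(names, padding=True, return_tensors="pt")
    hidden = lm(**batch).last_hidden_state          # (L, T, D)
    mask = batch["attention_mask"].unsqueeze(-1)    # (L, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)     # mean-pool per name

# Labels from different datasets land in one space, so near-synonyms
# such as "pickup" and "truck" end up close despite differing label sets.
anchors = embed_label_names(["road", "sidewalk", "truck", "pickup"])

# A point feature projected into the language space (random stand-in here)
# is classified by cosine similarity to the label-name embeddings.
feat = torch.randn(1, anchors.size(1))
logits = torch.nn.functional.cosine_similarity(
    feat.unsqueeze(1), anchors.unsqueeze(0), dim=-1)
print(logits.argmax(dim=1))
```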
- Automatic universal taxonomies for multi-domain semantic segmentation [1.4364491422470593]
Training semantic segmentation models on multiple datasets has sparked a lot of recent interest in the computer vision community.
Established datasets have mutually incompatible labels, which disrupts principled inference in the wild.
We address this issue by automatic construction of universal taxonomies through iterative dataset integration.
arXiv Detail & Related papers (2022-07-18T08:53:17Z)
- Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
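Of MAK's three principles above, diversity is the easiest to make concrete: a greedy k-center (farthest-point) pass over feature embeddings. The sketch below shows only that component; how MAK scores tailness and proximity, and how the three principles are combined, is omitted.

```python
import torch

def greedy_k_center(embeddings, k):
    """Farthest-point selection: each pick maximizes distance to the
    chosen set, covering the pool as evenly as possible (diversity)."""
    chosen = [0]                                   # arbitrary first center
    dists = torch.cdist(embeddings, embeddings[chosen]).squeeze(1)
    for _ in range(k - 1):
        nxt = torch.argmax(dists).item()           # farthest remaining point
        chosen.append(nxt)
        dists = torch.minimum(
            dists, torch.cdist(embeddings, embeddings[[nxt]]).squeeze(1))
    return chosen

pool = torch.randn(1000, 128)                      # unlabeled-data embeddings
print(greedy_k_center(pool, 10))
```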
- Multi-domain semantic segmentation with overlapping labels [1.4120796122384087]
We propose a principled method for seamless learning on datasets with overlapping classes, based on partial labels and a probabilistic loss.
Our method achieves competitive within-dataset and cross-dataset generalization, as well as the ability to learn visual concepts that are not separately labeled in any of the training datasets.
arXiv Detail & Related papers (2021-08-25T13:25:41Z)
- DAIL: Dataset-Aware and Invariant Learning for Face Recognition [67.4903809903022]
To achieve good performance in face recognition, a large-scale training dataset is usually required.
Naively combining different datasets is problematic due to two major issues.
First, treating the same person as different classes in different datasets during training will affect back-propagation.
Second, manually cleaning labels may take formidable human effort, especially when there are millions of images and thousands of identities.
arXiv Detail & Related papers (2021-01-14T01:59:52Z)
- Reducing the Annotation Effort for Video Object Segmentation Datasets [50.893073670389164]
Densely labeling every frame with pixel masks does not scale to large datasets.
We use a deep convolutional network to automatically create pseudo-labels on a pixel level from much cheaper bounding box annotations.
We obtain the new TAO-VOS benchmark, which we make publicly available at www.vision.rwth-aachen.de/page/taovos.
arXiv Detail & Related papers (2020-11-02T17:34:45Z)
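The entry above trains a network to convert cheap box annotations into pixel-level pseudo-labels. A toy sketch of that box-to-mask step follows, with a trivial stand-in for the learned network; the real pipeline, network, and thresholds differ.

```python
import torch

def boxes_to_pseudo_masks(image, boxes, classes, mask_net):
    """Fill each annotated box with foreground pixels predicted inside it."""
    h, w = image.shape[-2:]
    pseudo = torch.full((h, w), 255, dtype=torch.long)   # 255 = ignore
    for (x1, y1, x2, y2), cls in zip(boxes, classes):
        crop = image[..., y1:y2, x1:x2]
        fg = mask_net(crop) > 0.5           # boolean foreground estimate
        region = pseudo[y1:y2, x1:x2]       # view into the label map
        region[fg] = cls                    # label predicted-fg pixels
    return pseudo

# Trivial stand-in network: sigmoid of the channel mean.
mask_net = lambda crop: torch.sigmoid(crop.mean(0))
img = torch.randn(3, 64, 64)
print(boxes_to_pseudo_masks(img, [(5, 5, 30, 40)], [7], mask_net).unique())
```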
- Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge [36.04579549557464]
Multi-label zero-shot classification aims to predict multiple unseen class labels for an input image.
This paper introduces a novel multi-label zero-shot classification framework by learning to transfer from external knowledge.
arXiv Detail & Related papers (2020-07-30T17:26:46Z)
- Cross-dataset Training for Class Increasing Object Detection [52.34737978720484]
We present a conceptually simple, flexible and general framework for cross-dataset training in object detection.
By cross-dataset training, existing datasets can be utilized to detect the merged object classes with a single model.
While using cross-dataset training, we only need to label the new classes on the new dataset.
arXiv Detail & Related papers (2020-01-14T04:40:47Z)
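Cross-dataset training of this kind rests on mapping each dataset's local class ids into one merged label space, so a single head can be supervised by whichever dataset provides a given class. A small illustrative sketch (class names invented):

```python
# Per-dataset label sets; "car" overlaps between the two datasets.
datasets = {
    "A": ["person", "car"],
    "B": ["car", "bicycle"],
}

# Merged label space and local-id -> merged-id lookup tables.
merged = sorted({name for names in datasets.values() for name in names})
to_merged = {ds: [merged.index(name) for name in names]
             for ds, names in datasets.items()}

# Dataset B's local annotation "car" (local id 0) maps to a shared id;
# classes a dataset lacks are simply never supervised by it.
print(merged)      # ['bicycle', 'car', 'person']
print(to_merged)   # {'A': [2, 1], 'B': [1, 0]}
```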
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.