DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
- URL: http://arxiv.org/abs/2306.01736v1
- Date: Fri, 2 Jun 2023 17:59:24 GMT
- Title: DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
- Authors: Xiuye Gu, Yin Cui, Jonathan Huang, Abdullah Rashwan, Xuan Yang, Xingyi
Zhou, Golnaz Ghiasi, Weicheng Kuo, Huizhong Chen, Liang-Chieh Chen, David A
Ross
- Abstract summary: We propose to train a universal multi-dataset multi-task segmentation model: DaTaSeg.
We use a shared representation (mask proposals with class predictions) for all tasks.
We also leverage weak supervision, allowing our segmentation model to benefit from cheaper bounding box annotations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Observing the close relationship among panoptic, semantic and instance
segmentation tasks, we propose to train a universal multi-dataset multi-task
segmentation model: DaTaSeg. We use a shared representation (mask proposals with
class predictions) for all tasks. To tackle task discrepancy, we adopt
different merge operations and post-processing for different tasks. We also
leverage weak supervision, allowing our segmentation model to benefit from
cheaper bounding box annotations. To share knowledge across datasets, we use
text embeddings from the same semantic embedding space as classifiers and share
all network parameters among datasets. We train DaTaSeg on ADE semantic, COCO
panoptic, and Objects365 detection datasets. DaTaSeg improves performance on
all datasets, especially small-scale datasets, achieving 54.0 mIoU on ADE
semantic and 53.5 PQ on COCO panoptic. DaTaSeg also enables weakly-supervised
knowledge transfer on ADE panoptic and Objects365 instance segmentation.
Experiments show DaTaSeg scales with the number of training datasets and
enables open-vocabulary segmentation through direct transfer. In addition, we
annotate an Objects365 instance segmentation set of 1,000 images and will
release it as a public benchmark.
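The core pieces described in the abstract — a shared mask-proposal representation, text embeddings used as classifiers, and task-specific merge operations — can be illustrated with a small NumPy sketch. Everything below is hypothetical: the names and dimensions are toy values, the "text embeddings" are random placeholders for the output of a real pretrained text encoder, and the semantic merge follows the common mask-proposal recipe rather than the paper's exact operations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared vocabulary. In DaTaSeg, class-name text embeddings from a
# shared semantic embedding space replace per-dataset classification heads;
# the random vectors here are placeholders for a real text encoder's output.
class_names = ["person", "car", "tree", "sky"]
num_masks, embed_dim, H, W = 3, 8, 4, 4

text_embeds = rng.normal(size=(len(class_names), embed_dim))
text_embeds /= np.linalg.norm(text_embeds, axis=1, keepdims=True)

# Per-proposal outputs of a hypothetical segmentation backbone: one feature
# vector per mask proposal for classification, plus per-pixel mask logits.
mask_embeds = rng.normal(size=(num_masks, embed_dim))
mask_logits = rng.normal(size=(num_masks, H, W))

# Class logits = similarity to the shared text embeddings, so all datasets
# share the same classifier and all network parameters.
mask_embeds = mask_embeds / np.linalg.norm(mask_embeds, axis=1, keepdims=True)
class_logits = mask_embeds @ text_embeds.T  # (num_masks, num_classes)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Semantic merge (one possible task-specific merge operation): accumulate
# mask probabilities weighted by class probabilities, then take a per-pixel
# argmax over classes to produce a semantic segmentation map.
class_probs = softmax(class_logits)              # (num_masks, C)
mask_probs = sigmoid(mask_logits)                # (num_masks, H, W)
semantic_scores = np.einsum("nc,nhw->chw", class_probs, mask_probs)
semantic_map = semantic_scores.argmax(axis=0)    # (H, W) class indices

print(semantic_map.shape)  # (4, 4)
```

A panoptic merge would instead resolve overlaps between proposals (e.g., per-pixel assignment to the highest-scoring mask) while keeping instance identities, and an instance-segmentation output would keep the proposals separate. The text-embedding classifier is also what makes open-vocabulary transfer possible: embeddings of unseen class names can be swapped in without retraining.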
Related papers
- OMG-Seg: Is One Model Good Enough For All Segmentation?
OMG-Seg is a transformer-based encoder-decoder architecture with task-specific queries and outputs.
We show that OMG-Seg can support over ten distinct segmentation tasks while significantly reducing computational and parameter overhead.
arXiv Detail & Related papers (2024-01-18T18:59:34Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little or non-overlapping annotations.
We propose a novel approach where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- ScaleDet: A Scalable Multi-Dataset Object Detector
We propose a scalable multi-dataset detector (ScaleDet) that can scale up its generalization across datasets.
Our results show that ScaleDet achieves strong performance, with an mAP of 50.7 on LVIS, 58.8 on COCO, 46.8 on Objects365, 76.2 on OpenImages, and 71.8 on ODinW.
arXiv Detail & Related papers (2023-06-08T00:57:09Z)
- AIMS: All-Inclusive Multi-Level Segmentation
We propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
- Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets
We propose a principled approach that supports learning from heterogeneous datasets with different label sets.
Our idea is to utilize a pre-trained language model to embed discrete labels into a continuous latent space with the help of their label names.
Our model outperforms the state of the art by a large margin.
arXiv Detail & Related papers (2023-03-19T06:14:22Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that matches state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Semi-supervised Multi-task Learning for Semantics and Depth
Multi-Task Learning (MTL) aims to enhance model generalization by sharing representations between related tasks for better performance.
We propose a semi-supervised MTL method to leverage the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z)
- Cross-Dataset Collaborative Learning for Semantic Segmentation
We present a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL).
Given multiple labeled datasets, we aim to improve the generalization and discrimination of feature representations on each dataset.
We conduct extensive evaluations on four diverse datasets, i.e., Cityscapes, BDD100K, CamVid, and COCO Stuff, under single-dataset and cross-dataset settings.
arXiv Detail & Related papers (2021-03-21T09:59:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.