Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning
- URL: http://arxiv.org/abs/2411.17536v1
- Date: Tue, 26 Nov 2024 15:51:25 GMT
- Title: Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning
- Authors: Hoàng-Ân Lê, Paul Berg, Minh-Tan Pham
- Abstract summary: Making use of one task's information to train the other would be beneficial for multi-task partially supervised learning.
Box-for-Mask and Mask-for-Box strategies are proposed to distil necessary information from one task annotations to train the other.
- Abstract: Object detection and semantic segmentation are both scene understanding tasks yet they differ in data structure and information level. Object detection requires box coordinates for object instances while semantic segmentation requires pixel-wise class labels. Making use of one task's information to train the other would be beneficial for multi-task partially supervised learning where each training example is annotated only for a single task, having the potential to expand training sets with different-task datasets. This paper studies various weak losses for partially annotated data in combination with existing supervised losses. We propose Box-for-Mask and Mask-for-Box strategies, and their combination BoMBo, to distil necessary information from one task annotations to train the other. Ablation studies and experimental results on VOC and COCO datasets show favorable results for the proposed idea. Source code and data splits can be found at https://github.com/lhoangan/multas.
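The two conversions at the heart of the Box-for-Mask and Mask-for-Box idea can be sketched as follows. This is a minimal illustration, assuming NumPy binary masks and `(x1, y1, x2, y2)` box coordinates; the function names and details are assumptions, not the authors' released implementation:

```python
import numpy as np

def mask_to_box(mask: np.ndarray):
    """Derive a tight bounding box (x1, y1, x2, y2) from a binary mask."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None  # empty mask: no box can be derived
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def box_to_mask(box, height, width):
    """Rasterize a box into a rectangular pseudo-mask (a weak mask label)."""
    x1, y1, x2, y2 = box
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[y1:y2 + 1, x1:x2 + 1] = 1
    return mask
```

A pseudo-mask produced this way is only a coarse rectangular stand-in for the true object shape, which is why such signals are combined with weak losses rather than treated as full supervision.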
Related papers
- Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets [2.1178416840822023]
Partial multi-task learning, where each training example is annotated for only one of the target tasks, is a promising idea in remote sensing.
This paper proposes using knowledge distillation to replace the need of ground truths for the alternate task and enhance the performance of such approach.
arXiv Detail & Related papers (2024-05-24T09:48:50Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
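The instance-level step described above can be sketched as follows; this is a minimal illustration with hypothetical difficulty scores, not the Data-CUBE implementation:

```python
def curriculum_batches(instances, difficulty, batch_size):
    """Order instances from easy to difficult and split into mini-batches."""
    # Sort instance indices by ascending difficulty (easiest first).
    order = sorted(range(len(instances)), key=lambda i: difficulty[i])
    ranked = [instances[i] for i in order]
    # Chunk the ranked sequence into consecutive mini-batches.
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]
```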
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful even when the classification tasks have little or no overlapping annotation.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data [4.9914667450658925]
We study the joint learning of object detection and semantic segmentation, two of the most popular vision problems.
We propose employing knowledge distillation to leverage joint-task optimization.
arXiv Detail & Related papers (2023-11-07T14:49:54Z)
- AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
- A Simple Framework for Open-Vocabulary Segmentation and Detection [85.21641508535679]
We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.
We first introduce a pre-trained text encoder to encode all the visual concepts in two tasks and learn a common semantic space for them.
After pre-training, our model exhibits competitive or stronger zero-shot transferability for both segmentation and detection.
arXiv Detail & Related papers (2023-03-14T17:58:34Z)
- Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.
We propose a semi-supervised multi-task learning method to leverage the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy issue among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z)
- Weakly Supervised Multi-Object Tracking and Segmentation [21.7184457265122]
We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking.
To address it, we design a novel synergistic training strategy by taking advantage of multi-task learning.
We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approaches to just 12% and 12.7% for cars and pedestrians, respectively.
arXiv Detail & Related papers (2021-01-03T17:06:43Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- Task-Adaptive Clustering for Semi-Supervised Few-Shot Classification [23.913195015484696]
Few-shot learning aims to handle previously unseen tasks using only a small amount of new training data.
In preparing (or meta-training) a few-shot learner, however, massive labeled data are necessary.
In this work, we propose a few-shot learner that can work well under the semi-supervised setting where a large portion of training data is unlabeled.
arXiv Detail & Related papers (2020-03-18T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.