Beyond without Forgetting: Multi-Task Learning for Classification with Disjoint Datasets
- URL: http://arxiv.org/abs/2003.06746v1
- Date: Sun, 15 Mar 2020 03:19:18 GMT
- Title: Beyond without Forgetting: Multi-Task Learning for Classification with Disjoint Datasets
- Authors: Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang
- Abstract summary: Multi-task Learning (MTL) for classification with disjoint datasets aims to explore MTL when each task has only one labeled dataset.
Inspired by semi-supervised learning, we use unlabeled datasets with pseudo labels to facilitate each task.
We propose our MTL with Selective Augmentation (MTL-SA) method, which selects the training samples in unlabeled datasets that have confident pseudo labels and a data distribution close to that of the labeled dataset.
- Score: 27.570773346794613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task Learning (MTL) for classification with disjoint datasets aims to
explore MTL when each task has only one labeled dataset. Existing methods do
not fully exploit the unlabeled datasets to facilitate each task. Inspired by
semi-supervised learning, we use unlabeled datasets with pseudo labels to
facilitate each task. However, there are two major issues: 1) the pseudo
labels are very noisy; 2) the unlabeled datasets and the labeled dataset for
each task have a considerable data distribution mismatch. To address these
issues, we propose our MTL with Selective Augmentation (MTL-SA) method, which
selects the training samples in unlabeled datasets that have confident pseudo
labels and a data distribution close to that of the labeled dataset. Then, we
use the selected training samples to add information and the remaining
training samples to preserve information. Extensive experiments on
face-centric and human-centric applications demonstrate the effectiveness of
our MTL-SA method.
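A minimal sketch of the two selection criteria the abstract describes (confident pseudo labels, close data distribution), in PyTorch. The backbone/classifier split, the thresholds, and the labeled-set centroid used as a distribution proxy are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def select_confident_close_samples(model, labeled_feats, unlabeled_loader,
                                   conf_threshold=0.9, dist_threshold=1.0):
    """Sketch of MTL-SA-style selection: keep unlabeled samples whose pseudo
    label is confident AND whose feature lies close to the labeled set.
    Thresholds and the centroid distance are illustrative, not the paper's."""
    centroid = labeled_feats.mean(dim=0)           # proxy for the labeled distribution
    selected, remaining = [], []
    model.eval()
    with torch.no_grad():
        for x in unlabeled_loader:
            feats = model.backbone(x)              # assumed shared feature extractor
            probs = F.softmax(model.classifier(feats), dim=1)
            conf, pseudo = probs.max(dim=1)
            dist = (feats - centroid).norm(dim=1)  # distance to labeled distribution
            keep = (conf > conf_threshold) & (dist < dist_threshold)
            selected.append((x[keep], pseudo[keep]))   # used to add information
            remaining.append(x[~keep])                 # used to preserve information
    return selected, remaining
```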
Related papers
- FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning [73.13448439554497]
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data.
Most SSL methods are commonly based on instance-wise consistency between different data transformations.
We propose FlatMatch, which minimizes a cross-sharpness measure to ensure consistent learning performance between the labeled and unlabeled data.
arXiv Detail & Related papers (2023-10-25T06:57:59Z) - Self-refining of Pseudo Labels for Music Source Separation with Noisy
- Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data [15.275949700129797]
Music source separation (MSS) faces challenges due to the limited availability of correctly-labeled individual instrument tracks.
This paper introduces an automated technique for refining the labels in a partially mislabeled dataset.
Our proposed self-refining technique, employed with a noisy-labeled dataset, results in only a 1% accuracy degradation in multi-label instrument recognition.
arXiv Detail & Related papers (2023-07-24T07:47:21Z) - Knowledge Assembly: Semi-Supervised Multi-Task Learning from Multiple
- Knowledge Assembly: Semi-Supervised Multi-Task Learning from Multiple Datasets with Disjoint Labels [8.816979799419107]
Multi-Task Learning (MTL) is an adequate method for learning several tasks at once, but it usually requires datasets labeled for all tasks.
We propose a method that can leverage datasets labeled for only some of the tasks in the MTL framework.
Our work, Knowledge Assembly (KA), learns multiple tasks from disjoint datasets by leveraging the unlabeled data in a semi-supervised manner.
arXiv Detail & Related papers (2023-06-15T04:05:03Z) - AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data [6.633920993895286]
- AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data [6.633920993895286]
We show that state-of-the-art SSL algorithms suffer a degradation in performance in the presence of unlabeled auxiliary data.
We propose AuxMix, an algorithm that leverages self-supervised learning tasks to learn generic features in order to mask auxiliary data that are not semantically similar to the labeled set.
arXiv Detail & Related papers (2022-06-14T16:25:20Z) - Mining Multi-Label Samples from Single Positive Labels [32.10330097419565]
- Mining Multi-Label Samples from Single Positive Labels [32.10330097419565]
Conditional generative adversarial networks (cGANs) have shown superior results in class-conditional generation tasks.
To simultaneously control multiple conditions, cGANs require multi-label training datasets, where multiple labels can be assigned to each data instance.
We propose a novel sampling approach called single-to-multi-label (S2M) sampling, based on the Markov chain Monte Carlo method.
arXiv Detail & Related papers (2022-06-12T15:14:29Z) - Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
- Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z) - GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled
- GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference [90.5402652758316]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
It uses labeled information to guide the learning of unlabeled instances.
It achieves competitive segmentation accuracy and significantly improves the mIoU by +7% compared to previous approaches.
arXiv Detail & Related papers (2021-12-28T06:48:03Z) - Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
- Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.
We propose a semi-supervised multi-task learning method to leverage the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy issue among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z) - Unsupervised Selective Labeling for More Effective Semi-Supervised
- Unsupervised Selective Labeling for More Effective Semi-Supervised Learning [46.414510522978425]
Unsupervised selective labeling consistently improves SSL methods over state-of-the-art active learning given labeled data.
Our work sets a new standard for practical and efficient SSL.
arXiv Detail & Related papers (2021-10-06T18:25:50Z) - GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as
- GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference [153.354332374204]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
We first introduce a feature alignment objective between labeled and unlabeled data to capture potentially similar image pairs.
MITrans is shown to be a powerful knowledge module for further progressively refining the features of unlabeled data.
Along with supervised learning for labeled data, the prediction of unlabeled data is jointly learned with the generated pseudo masks.
arXiv Detail & Related papers (2021-06-29T02:48:45Z) - Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data, while meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)