Mining Multi-Label Samples from Single Positive Labels
- URL: http://arxiv.org/abs/2206.05764v4
- Date: Sun, 28 May 2023 10:03:44 GMT
- Title: Mining Multi-Label Samples from Single Positive Labels
- Authors: Youngin Cho, Daejin Kim, Mohammad Azam Khan, Jaegul Choo
- Abstract summary: Conditional generative adversarial networks (cGANs) have shown superior results in class-conditional generation tasks.
To simultaneously control multiple conditions, cGANs require multi-label training datasets, where multiple labels can be assigned to each data instance.
We propose a novel sampling approach called single-to-multi-label (S2M) sampling, based on the Markov chain Monte Carlo method.
- Score: 32.10330097419565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conditional generative adversarial networks (cGANs) have shown superior
results in class-conditional generation tasks. To simultaneously control
multiple conditions, cGANs require multi-label training datasets, where
multiple labels can be assigned to each data instance. Nevertheless, the
tremendous annotation cost limits the accessibility of multi-label datasets in
real-world scenarios. Therefore, in this study we explore the practical setting
called the single positive setting, where each data instance is annotated by
only one positive label with no explicit negative labels. To generate
multi-label data in the single positive setting, we propose a novel sampling
approach called single-to-multi-label (S2M) sampling, based on the Markov chain
Monte Carlo method. As a widely applicable "add-on" method, our proposed S2M
sampling method enables existing unconditional and conditional GANs to draw
high-quality multi-label data with a minimal annotation cost. Extensive
experiments on real image datasets verify the effectiveness and correctness of
our method, even when compared to a model trained with fully annotated
datasets.
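The abstract states only that S2M sampling builds on the Markov chain Monte Carlo method to draw multi-label samples from a pretrained GAN. As a rough illustration of that idea (not the authors' actual algorithm), the sketch below runs a Metropolis-Hastings walk in a GAN's latent space, biasing it toward latents whose generated sample is scored positive for all target labels. The `generator` and `label_scores` functions are hypothetical stand-ins for a pretrained generator and per-class classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z):
    # Hypothetical stand-in for a pretrained GAN generator: maps a latent
    # vector z to a generated sample (here just an elementwise tanh).
    return np.tanh(z)

def label_scores(x, target_labels):
    # Hypothetical stand-in for classifiers trained from single positive
    # labels; returns pseudo-probabilities that x carries each target label.
    return 1.0 / (1.0 + np.exp(-x[target_labels]))

def target_density(z, target_labels):
    # Unnormalized target: standard-normal latent prior, reweighted by the
    # product of classifier scores so that latents generating samples
    # positive for *all* target labels get higher density.
    prior = np.exp(-0.5 * float(z @ z))
    return prior * float(np.prod(label_scores(generator(z), target_labels)))

def mh_sample(target_labels, dim=8, steps=500, step_size=0.5):
    """Metropolis-Hastings walk in latent space toward multi-label regions."""
    z = rng.standard_normal(dim)
    p = target_density(z, target_labels)
    for _ in range(steps):
        z_new = z + step_size * rng.standard_normal(dim)  # symmetric proposal
        p_new = target_density(z_new, target_labels)
        if rng.uniform() < p_new / max(p, 1e-300):  # accept/reject step
            z, p = z_new, p_new
    return generator(z)

sample = mh_sample(target_labels=[1, 3])
print(sample.shape)
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of unnormalized densities; in practice the actual method would operate on real classifier and generator networks rather than these toy functions.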
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in standard semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance may contain multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z) - Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z) - Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z) - One Positive Label is Sufficient: Single-Positive Multi-Label Learning
with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label.
A novel method, Single-positive MultI-label learning with Label Enhancement (SMILE), is proposed.
Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z) - Active Learning in Incomplete Label Multiple Instance Multiple Label
Learning [17.5720245903743]
We propose a novel bag-class pair based approach for active learning in the MIML setting.
Our approach is based on a discriminative graphical model with efficient and exact inference.
arXiv Detail & Related papers (2021-07-22T17:01:28Z) - An Empirical Study on Large-Scale Multi-Label Text Classification
Including Few and Zero-Shot Labels [49.036212158261215]
Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications.
Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs).
We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs.
We propose a new state-of-the-art method which combines BERT with LWANs.
arXiv Detail & Related papers (2020-10-04T18:55:47Z) - Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning [54.85397562961903]
Semi-supervised learning (SSL) has been proposed to leverage unlabeled data for training powerful models when only limited labeled data is available.
We address a more complex novel scenario named open-set SSL, where out-of-distribution (OOD) samples are contained in unlabeled data.
Our method achieves state-of-the-art results by successfully eliminating the effect of OOD samples.
arXiv Detail & Related papers (2020-07-22T10:33:55Z) - Multi-Label Sampling based on Local Label Imbalance [7.355362369511579]
Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods.
Existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets.
It is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation.
arXiv Detail & Related papers (2020-05-07T04:14:23Z) - Generalized Label Enhancement with Sample Correlations [24.582764493585362]
We propose two novel label enhancement methods, i.e., Label Enhancement with Sample Correlations (LESC) and generalized Label Enhancement with Sample Correlations (gLESC)
Benefitting from the sample correlations, the proposed methods can boost the performance of label enhancement.
arXiv Detail & Related papers (2020-04-07T03:32:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.