Towards Imbalanced Large Scale Multi-label Classification with Partially
Annotated Labels
- URL: http://arxiv.org/abs/2308.00166v1
- Date: Mon, 31 Jul 2023 21:50:48 GMT
- Title: Towards Imbalanced Large Scale Multi-label Classification with Partially
Annotated Labels
- Authors: Xin Zhang and Yuqi Song and Fei Zuo and Xiaofeng Wang
- Abstract summary: Multi-label classification is a widely encountered problem in daily life, where an instance can be associated with multiple classes.
In this work, we address the issue of label imbalance and investigate how to train neural networks using partial labels.
- Score: 8.977819892091
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label classification is a widely encountered problem in daily life,
where an instance can be associated with multiple classes. In principle,
supervised multi-label learning requires a large amount of labeled data. However,
annotating data is time-consuming and may be infeasible for huge labeling
spaces. In addition, label imbalance can limit the performance of multi-label
classifiers, especially when some labels are missing. Therefore, it is
meaningful to study how to train neural networks using partial labels. In this
work, we address the issue of label imbalance and investigate how to train
classifiers using partial labels in large labeling spaces. First, we introduce
the pseudo-labeling technique, which allows commonly adopted networks to be
applied in partially labeled settings without the need for additional complex
structures. Then, we propose a novel loss function that leverages statistical
information from existing datasets to effectively alleviate the label imbalance
problem. In addition, we design a dynamic training scheme to reduce the
dimension of the labeling space and further mitigate the imbalance. Finally, we
conduct extensive experiments on publicly available multi-label datasets,
including COCO, NUS-WIDE, CUB, and Open Images, to demonstrate the effectiveness
of the proposed approach. The results show that our approach outperforms
several state-of-the-art methods; surprisingly, in some partial labeling
settings, it even exceeds methods trained with full labels.
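The abstract names three ingredients: pseudo-labeling to handle missing annotations, a loss that uses dataset statistics to counter label imbalance, and a dynamic training scheme that shrinks the labeling space. As a rough illustration of the first two, here is a minimal PyTorch-style sketch; the function name, the confidence threshold tau, and the (1 - f) / f frequency weighting are assumptions for illustration, not the authors' exact formulation, and the dynamic dimension-reduction schedule is omitted because the abstract gives no details about it.

```python
import torch
import torch.nn.functional as F

def imbalance_weighted_partial_bce(logits, targets, pos_freq, tau=0.7):
    """Hypothetical sketch: pseudo-labeling plus frequency-weighted BCE.

    logits:   (batch, num_labels) raw model outputs
    targets:  (batch, num_labels) with 1 = positive, 0 = negative,
              -1 = unannotated (missing) label
    pos_freq: (num_labels,) empirical positive frequency per label,
              the "statistical information" from the existing dataset
    tau:      confidence threshold for pseudo-labeling (assumed value)
    """
    probs = torch.sigmoid(logits)

    # Pseudo-labeling: promote confident predictions on missing entries
    # to temporary hard labels; leave low-confidence entries unsupervised.
    pseudo = targets.clone().float()
    missing = targets == -1
    pseudo[missing & (probs > tau)] = 1.0
    pseudo[missing & (probs < 1 - tau)] = 0.0
    supervised = pseudo != -1  # entries that now carry a label

    # Frequency-based reweighting: up-weight rare positives so abundant
    # negatives do not dominate the gradient.
    pos_weight = (1.0 - pos_freq) / pos_freq.clamp(min=1e-6)
    loss = F.binary_cross_entropy_with_logits(
        logits, pseudo.clamp(min=0.0), reduction="none",
        pos_weight=pos_weight)
    return (loss * supervised).sum() / supervised.sum().clamp(min=1)
```

In this sketch, unannotated entries contribute to the loss only once the model is confident about them, which is the usual way pseudo-labeling lets a standard network train in partially labeled settings without extra architectural machinery.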
Related papers
- Determined Multi-Label Learning via Similarity-Based Prompt [12.428779617221366]
In multi-label classification, each training instance is associated with multiple class labels simultaneously, which makes exhaustive annotation costly.
To alleviate this problem, a novel labeling setting termed Determined Multi-Label Learning (DMLL) is proposed.
arXiv Detail & Related papers (2024-03-25T07:08:01Z) - Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework that unifies learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z) - Bridging the Gap between Model Explanations in Partially Annotated
Multi-label Classification [85.76130799062379]
We study how false negative labels affect the model's explanation.
We propose to boost the attribution scores of the model trained with partial labels to make its explanation resemble that of the model trained with full labels.
arXiv Detail & Related papers (2023-04-04T14:00:59Z) - Complementary to Multiple Labels: A Correlation-Aware Correction
Approach [65.59584909436259]
We show theoretically how the estimated transition matrix in multi-class complementary-label learning (CLL) could be distorted in multi-labeled cases.
We propose a two-step method to estimate the transition matrix from candidate labels.
arXiv Detail & Related papers (2023-02-25T04:48:48Z) - Learning from Stochastic Labels [8.178975818137937]
Annotating multi-class instances is a crucial task in the field of machine learning.
In this paper, we propose a novel approach suitable for learning from such stochastic labels.
arXiv Detail & Related papers (2023-02-01T08:04:27Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution
Perspective [89.5370481649529]
This paper proposes a label distribution perspective for positive-unlabeled (PU) learning.
Motivated by this perspective, it pursues consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - An Effective Approach for Multi-label Classification with Missing Labels [8.470008570115146]
We propose a pseudo-label based approach to reduce the cost of annotation without bringing additional complexity to the classification networks.
By designing a novel loss function, we are able to relax the requirement that each instance must contain at least one positive label.
We show that our method can handle the imbalance between positive labels and negative labels, while still outperforming existing missing-label learning approaches.
arXiv Detail & Related papers (2022-10-24T23:13:57Z) - Acknowledging the Unknown for Multi-label Learning with Single Positive
Labels [65.5889334964149]
Traditionally, all unannotated labels are assumed to be negative in single positive multi-label learning (SPML).
We propose an entropy-maximization (EM) loss to maximize the entropy of predicted probabilities for all unannotated labels (a minimal sketch of this idea follows the list below).
Considering the positive-negative imbalance of unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision.
arXiv Detail & Related papers (2022-03-30T11:43:59Z) - Multi-Label Learning from Single Positive Labels [37.17676289125165]
Predicting all applicable labels for a given image is known as multi-label classification.
We show that it is possible to approach the performance of fully labeled classifiers despite training with significantly fewer confirmed labels.
arXiv Detail & Related papers (2021-06-17T17:58:04Z) - A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder model to jointly extract label-label and label-feature dependencies.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z)
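The entropy-maximization (EM) loss from the "Acknowledging the Unknown" entry above admits a compact illustration. The sketch below is an assumed implementation of the stated idea (push predictions for unannotated labels toward maximum binary entropy instead of treating them as negatives); the function name, masking scheme, and mean reduction are placeholders, and the paper's asymmetric pseudo-labeling (APL) component is omitted.

```python
import torch

def entropy_maximization_loss(logits, annotated_pos):
    """Sketch of the EM idea: for unannotated labels, push predicted
    probabilities toward 0.5 (maximum binary entropy) rather than
    assuming they are negative.

    logits:        (batch, num_labels) raw model outputs
    annotated_pos: (batch, num_labels) bool mask, True where a single
                   positive label was observed
    """
    probs = torch.sigmoid(logits).clamp(1e-6, 1 - 1e-6)

    # Standard log-likelihood term on the observed positives.
    pos_loss = -(torch.log(probs) * annotated_pos).sum()

    # Binary entropy H(p) = -p log p - (1-p) log(1-p); maximizing it on
    # unannotated entries means minimizing its negative.
    unannotated = ~annotated_pos
    entropy = -(probs * torch.log(probs)
                + (1 - probs) * torch.log(1 - probs))
    em_loss = -(entropy * unannotated).sum()

    return (pos_loss + em_loss) / annotated_pos.numel()
```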
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.