A Study on the Autoregressive and non-Autoregressive Multi-label
Learning
- URL: http://arxiv.org/abs/2012.01711v1
- Date: Thu, 3 Dec 2020 05:41:44 GMT
- Title: A Study on the Autoregressive and non-Autoregressive Multi-label
Learning
- Authors: Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung
- Abstract summary: We propose a self-attention-based variational encoder model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
- Score: 77.11075863067131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme classification tasks are multi-label tasks with an extremely large
number of labels (tags). These tasks are hard because the label space is
usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse,
i.e. very few labels apply to each input document, and (iii) highly correlated,
meaning that the existence of one label changes the likelihood of predicting
all other labels. In this work, we propose a self-attention-based variational
encoder model to extract the label-label and label-feature dependencies jointly
and to predict labels for a given input. In more detail, we propose a
non-autoregressive latent variable model and compare it to a strong
autoregressive baseline that predicts a label based on all previously generated
labels. Our model can therefore be used to predict all labels in parallel while
still including both label-label and label-feature dependencies through latent
variables, and compares favourably to the autoregressive baseline. We apply our
models to four standard extreme classification natural language data sets, and
one news videos dataset for automated label detection from a lexicon of
semantic concepts. Experimental results show that although autoregressive
models, which use a fixed order of the labels for chain-style label prediction,
work well for small label sets or for predicting the highest-ranked label, our
non-autoregressive model surpasses them by around 2% to 6% when more labels
need to be predicted or when the dataset has a larger number of labels.
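The paper's own implementation is not reproduced here, but the contrast the abstract draws can be sketched in a few lines. The following toy NumPy example (all weights, dimensions, and function names are illustrative assumptions, not the authors' model) shows the two decoding styles: an autoregressive pass that predicts labels in a fixed chain order, conditioning each decision on the labels already chosen, and a non-autoregressive pass that scores all labels in parallel, with label-label dependencies mediated by a shared latent vector instead of a generation order.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_LABELS, FEAT_DIM, LATENT_DIM = 4, 3, 2

# Toy parameters, randomly initialized for illustration only.
W = rng.normal(size=(NUM_LABELS, FEAT_DIM))    # label-feature weights
E = rng.normal(size=(NUM_LABELS, LATENT_DIM))  # label embeddings
P = rng.normal(size=(LATENT_DIM, FEAT_DIM))    # feature -> latent projection

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoregressive_predict(x, threshold=0.5):
    """Chain-order decoding: each label's score conditions on labels
    already chosen, so the L labels require L sequential steps."""
    chosen, context = [], np.zeros(LATENT_DIM)
    for l in range(NUM_LABELS):                # fixed label order
        score = W[l] @ x + E[l] @ context
        if sigmoid(score) > threshold:
            chosen.append(l)
            context = context + E[l]           # feed the chosen label back in
    return chosen

def non_autoregressive_predict(x, threshold=0.5):
    """Parallel decoding: label-label dependencies flow through a
    shared latent vector z, so all L scores are computed at once."""
    z = np.tanh(P @ x)                         # latent variable from the input
    scores = W @ x + E @ z                     # one parallel scoring pass
    return [l for l in range(NUM_LABELS) if sigmoid(scores[l]) > threshold]

x = rng.normal(size=FEAT_DIM)                  # a toy input document's features
print(autoregressive_predict(x))
print(non_autoregressive_predict(x))
```

The design point the abstract makes is visible in the loop structure: the autoregressive version cannot be parallelized across labels because `context` grows as labels are chosen, whereas the non-autoregressive version computes every score in a single matrix operation.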
Related papers
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework for unifying learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z)
- Label Dependencies-aware Set Prediction Networks for Multi-label Text Classification [0.0]
We leverage Graph Convolutional Networks and construct an adjacency matrix based on the statistical relations between labels.
We enhance recall ability by applying the Bhattacharyya distance to the output distributions of the set prediction networks.
arXiv Detail & Related papers (2023-04-14T09:31:17Z)
- Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification [85.76130799062379]
We study how false negative labels affect the model's explanation.
We propose to boost the attribution scores of the model trained with partial labels to make its explanation resemble that of the model trained with full labels.
arXiv Detail & Related papers (2023-04-04T14:00:59Z)
- Pairwise Instance Relation Augmentation for Long-tailed Multi-label Text Classification [38.66674700075432]
We propose a Pairwise Instance Relation Augmentation Network (PIRAN) to augment tailed-label documents for balancing tail labels and head labels.
PIRAN consistently outperforms the SOTA methods, and dramatically improves the performance of tail labels.
arXiv Detail & Related papers (2022-11-19T12:45:54Z)
- Group is better than individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling [39.76268402567324]
We construct a Heterogeneous Label Graph (HLG) containing two kinds of topologies.
Label correlations are leveraged to enhance semantic-label interactions.
We also propose the label-aware inter-dependent decoding mechanism to further exploit the label correlations for decoding.
arXiv Detail & Related papers (2022-10-19T08:21:43Z)
- Multi-label Classification with High-rank and High-order Label Correlations [62.39748565407201]
Previous methods capture the high-order label correlations mainly by transforming the label matrix to a latent label space with low-rank matrix factorization.
We propose a simple yet effective method to depict the high-order label correlations explicitly, and at the same time maintain the high-rank of the label matrix.
Comparative studies over twelve benchmark data sets validate the effectiveness of the proposed algorithm in multi-label classification.
arXiv Detail & Related papers (2022-07-09T05:15:31Z)
- Acknowledging the Unknown for Multi-label Learning with Single Positive Labels [65.5889334964149]
Traditionally, all unannotated labels are assumed to be negative in single positive multi-label learning (SPML).
We propose entropy-maximization (EM) loss to maximize the entropy of predicted probabilities for all unannotated labels.
Considering the positive-negative label imbalance of unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision.
arXiv Detail & Related papers (2022-03-30T11:43:59Z)
- Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider the instance-dependent case and assume that each example is associated with a latent label distribution constituted by the real number of each label.
arXiv Detail & Related papers (2021-10-25T12:50:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.