OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for
Extreme Multi-label Text Classification
- URL: http://arxiv.org/abs/2210.14523v1
- Date: Wed, 26 Oct 2022 07:25:18 GMT
- Title: OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for
Extreme Multi-label Text Classification
- Authors: Jie Cao, Yin Zhang
- Abstract summary: Extreme multi-label text classification (XMTC) is the task of finding the most relevant subset of labels from a large-scale label collection.
We propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set.
Our model generates predictions in a student-forcing scheme and is trained with a loss function based on bipartite matching.
- Score: 9.990725102725916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme multi-label text classification (XMTC) is the task of finding the
most relevant subset of labels from an extremely large-scale label collection.
Recently, some deep learning models have achieved state-of-the-art results in
XMTC tasks. These models commonly predict scores for all labels by a fully
connected layer as the last layer of the model. However, such models cannot
predict a complete, variable-length label subset for each document, because they
select positive labels either by a fixed threshold or by taking the top-k labels
in descending order of scores. A less popular family of deep learning models,
sequence-to-sequence (Seq2Seq) models, instead predicts a variable-length set of
positive labels in sequence style. However, the labels in XMTC tasks are
essentially an unordered set rather than an ordered sequence, so the default
label order constrains Seq2Seq models during training. To address this
limitation of Seq2Seq, we propose an autoregressive
sequence-to-set model for XMTC tasks named OTSeq2Set. Our model generates
predictions in a student-forcing scheme and is trained with a loss function
based on bipartite matching, which makes the training permutation-invariant.
Meanwhile, we use the optimal transport distance to push the model toward the
closest labels in the semantic label space. Experiments show that OTSeq2Set
outperforms other competitive baselines on four benchmark datasets. In
particular, on
the Wikipedia dataset with 31k labels, it outperforms the state-of-the-art
Seq2Seq method by 16.34% in micro-F1 score. The code is available at
https://github.com/caojie54/OTSeq2Set.
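The permutation-invariant bipartite matching loss mentioned in the abstract can be sketched with a toy example. This is a hypothetical illustration, not the authors' implementation: the cost matrix is made up, and the actual OTSeq2Set additionally incorporates an optimal transport term over the semantic label space.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: rows are the decoder's generation slots,
# columns are the document's gold labels; each entry is a negative
# log-probability (lower = the slot better predicts that label).
cost = np.array([
    [0.2, 1.5, 2.0],
    [1.8, 0.3, 2.2],
    [2.1, 1.9, 0.4],
])

# Hungarian matching finds the slot-to-label assignment with minimal
# total cost. Permuting the gold-label columns permutes the matching
# but leaves the optimal total cost unchanged, so the resulting loss
# does not depend on any fixed label order.
rows, cols = linear_sum_assignment(cost)
loss = cost[rows, cols].sum()  # minimal total cost: 0.9 for this toy matrix
```

Because the loss is computed over the best one-to-one assignment rather than a fixed left-to-right alignment, the model is not penalized for emitting correct labels in a different order than the training sequence.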
Related papers
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed
Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine the quality of a label.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z) - Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z) - Automatic Label Sequence Generation for Prompting Sequence-to-sequence
Models [105.4590533269863]
We propose AutoSeq, a fully automatic prompting method.
We adopt natural language prompts on sequence-to-sequence models.
Our method reveals the potential of sequence-to-sequence models in few-shot learning.
arXiv Detail & Related papers (2022-09-20T01:35:04Z) - Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2Seq) models are a popular choice for modeling set generation.
We propose a novel algorithm for effectively sampling informative orders over the space of label orders.
arXiv Detail & Related papers (2022-05-25T04:17:50Z) - Enhancing Label Correlation Feedback in Multi-Label Text Classification
via Multi-Task Learning [6.1538971100140145]
We introduce a novel approach with multi-task learning to enhance label correlation feedback.
We propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning.
arXiv Detail & Related papers (2021-06-06T12:26:14Z) - GNN-XML: Graph Neural Networks for Extreme Multi-label Text
Classification [23.79498916023468]
Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set.
GNN-XML is a scalable graph neural network framework tailored for XMTC problems.
arXiv Detail & Related papers (2020-12-10T18:18:34Z) - A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z) - Semantic Label Smoothing for Sequence to Sequence Problems [54.758974840974425]
We propose a technique that smooths over well-formed relevant sequences that have sufficient n-gram overlap with the target sequence.
Our method shows a consistent and significant improvement over the state-of-the-art techniques on different datasets.
arXiv Detail & Related papers (2020-10-15T00:31:15Z) - Pretrained Generalized Autoregressive Model with Adaptive Probabilistic
Label Clusters for Extreme Multi-label Text Classification [24.665469885904145]
We propose a novel deep learning method called APLC-XLNet.
Our approach fine-tunes the recently released generalized autoregressive pretrained model (XLNet) to learn a dense representation for the input text.
Our experiments, carried out on five benchmark datasets, show that our approach has achieved new state-of-the-art results.
arXiv Detail & Related papers (2020-07-05T20:19:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.