OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for
Extreme Multi-label Text Classification
- URL: http://arxiv.org/abs/2210.14523v1
- Date: Wed, 26 Oct 2022 07:25:18 GMT
- Title: OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for
Extreme Multi-label Text Classification
- Authors: Jie Cao, Yin Zhang
- Abstract summary: Extreme multi-label text classification (XMTC) is the task of finding the most relevant subset of labels from a large-scale label collection.
We propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set.
Our model generates predictions in a student-forcing scheme and is trained with a loss function based on bipartite matching.
- Score: 9.990725102725916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme multi-label text classification (XMTC) is the task of finding the
most relevant subset of labels from an extremely large-scale label collection.
Recently, some deep learning models have achieved state-of-the-art results in
XMTC tasks. These models commonly predict scores for all labels by a fully
connected layer as the last layer of the model. However, such models cannot
predict a complete, variable-length label subset for each document, because they
select positive labels either by a fixed threshold or by taking the top-k labels
in descending order of scores. A less popular family of deep learning models,
sequence-to-sequence (Seq2Seq) models, instead predicts a variable-length set of
positive labels in sequence style. However, the labels in XMTC tasks are
essentially an unordered set rather than an ordered sequence, so the default
label order constrains Seq2Seq models during training. To address this
limitation of Seq2Seq, we propose an autoregressive
sequence-to-set model for XMTC tasks named OTSeq2Set. Our model generates
predictions in a student-forcing scheme and is trained with a loss function
based on bipartite matching, which makes the training permutation-invariant.
Meanwhile, we use the optimal transport distance to push the model toward the
closest labels in the semantic label space. Experiments show that OTSeq2Set
outperforms other competitive baselines on four benchmark datasets. In
particular, on
the Wikipedia dataset with 31k labels, it outperforms the state-of-the-art
Seq2Seq method by 16.34% in micro-F1 score. The code is available at
https://github.com/caojie54/OTSeq2Set.
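The permutation-invariant bipartite matching loss mentioned in the abstract can be sketched with a toy example. This is a hypothetical illustration, not the authors' implementation: the cost matrix is made up, and the actual OTSeq2Set additionally incorporates an optimal transport term over the semantic label space.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: rows are the decoder's generation slots,
# columns are the document's gold labels; each entry is a negative
# log-probability (lower = the slot better predicts that label).
cost = np.array([
    [0.2, 1.5, 2.0],
    [1.8, 0.3, 2.2],
    [2.1, 1.9, 0.4],
])

# Hungarian matching finds the slot-to-label assignment with minimal
# total cost. Permuting the gold-label columns permutes the matching
# but leaves the optimal total cost unchanged, so the resulting loss
# does not depend on any fixed label order.
rows, cols = linear_sum_assignment(cost)
loss = cost[rows, cols].sum()  # minimal total cost: 0.9 for this toy matrix
```

Because the loss is computed over the best one-to-one assignment rather than a fixed left-to-right alignment, the model is not penalized for emitting correct labels in a different order than the training sequence.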
Related papers
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed
Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine the quality of a label.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z) - Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z) - Automatic Label Sequence Generation for Prompting Sequence-to-sequence
Models [105.4590533269863]
We propose AutoSeq, a fully automatic prompting method.
We adopt natural language prompts on sequence-to-sequence models.
Our method reveals the potential of sequence-to-sequence models in few-shot learning.
arXiv Detail & Related papers (2022-09-20T01:35:04Z) - Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2Seq) models are a popular choice for modeling set generation.
We propose a novel algorithm for effectively sampling informative orders over the space of label orders.
arXiv Detail & Related papers (2022-05-25T04:17:50Z) - Enhancing Label Correlation Feedback in Multi-Label Text Classification
via Multi-Task Learning [6.1538971100140145]
We introduce a novel approach with multi-task learning to enhance label correlation feedback.
We propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning.
arXiv Detail & Related papers (2021-06-06T12:26:14Z) - GNN-XML: Graph Neural Networks for Extreme Multi-label Text
Classification [23.79498916023468]
Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set.
GNN-XML is a scalable graph neural network framework tailored for XMTC problems.
arXiv Detail & Related papers (2020-12-10T18:18:34Z) - A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z) - Semantic Label Smoothing for Sequence to Sequence Problems [54.758974840974425]
We propose a technique that smooths over well-formed relevant sequences that have sufficient n-gram overlap with the target sequence.
Our method shows a consistent and significant improvement over the state-of-the-art techniques on different datasets.
arXiv Detail & Related papers (2020-10-15T00:31:15Z) - Pretrained Generalized Autoregressive Model with Adaptive Probabilistic
Label Clusters for Extreme Multi-label Text Classification [24.665469885904145]
We propose a novel deep learning method called APLC-XLNet.
Our approach fine-tunes the recently released generalized autoregressive pretrained model (XLNet) to learn a dense representation for the input text.
Our experiments, carried out on five benchmark datasets, show that our approach has achieved new state-of-the-art results.
arXiv Detail & Related papers (2020-07-05T20:19:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.